Recently, I needed to parse the JSON that the Chrome web browser produces when you record events in its dev tools, and get some timing data out of it. Chrome can produce a pretty large amount of data in a small amount of time, so the Ruby parser I originally built was quite slow.
The simplest possible form of the JSON file is what I have in this Gist. It contains an event representing the request sent to fetch a page, and the event representing the response. Typically, there's a huge amount of extra data to sift through. That's its own problem, but not what I'm worried about in this question.
Timing with a 119 MB JSON file in Go:
$ time ./parse data.json
= 22 Requests
Min Time: 0.77
Max Time: 0.77
Average Time: 0.77
./gm data.json  4.54s user 0.16s system 99% cpu 4.705 total
The same file in Node.js:
$ time node parse.js data.json
= 22 Requests
Min Time: 0.77
Max Time: 0.77
Avg Time: 0.77
node jm.js data.json  1.73s user 0.24s system 100% cpu 1.959 total
(The min/max/average times are all identical in this example because I duplicated JSON objects so as to have a very large data set, but that's irrelevant.)
Note that while these two scripts do more than just parse, it is definitely json.Unmarshal() that accounts for much of the Go program's run time.
I added a Ruby script:
$ ruby parse.rb
= 22 Requests
Min Time: 0.77
Max Time: 0.77
Avg Time: 0.77
ruby parse.rb  4.82s user 0.82s system 99% cpu 5.658 total
With Go, you are parsing the JSON into statically-typed structures. With JS and Ruby, you are parsing it into hash tables.
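To make the difference concrete, here is a minimal sketch that decodes the same input both ways. The Event type, its two fields, and the sample event names are assumptions made up for illustration; the real Chrome output has many more fields, and the original program's types are not shown in the question.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a hypothetical, trimmed-down shape for one recorded event.
type Event struct {
	Name      string  `json:"name"`
	Timestamp float64 `json:"ts"`
}

// decodeTyped parses the input into a statically typed slice of structs.
func decodeTyped(data []byte) ([]Event, error) {
	var events []Event
	err := json.Unmarshal(data, &events)
	return events, err
}

// decodeGeneric parses the input into maps, which is roughly the shape
// that the JavaScript and Ruby parsers hand back.
func decodeGeneric(data []byte) ([]map[string]interface{}, error) {
	var events []map[string]interface{}
	err := json.Unmarshal(data, &events)
	return events, err
}

func main() {
	// Made-up sample data, not taken from the Gist.
	data := []byte(`[{"name":"request","ts":1.5},{"name":"response","ts":2.25}]`)

	typed, err := decodeTyped(data)
	if err != nil {
		panic(err)
	}
	generic, err := decodeGeneric(data)
	if err != nil {
		panic(err)
	}

	// Struct fields are read directly and checked at compile time.
	fmt.Println(typed[1].Timestamp - typed[0].Timestamp)

	// Map values need a key lookup plus a type assertion at run time.
	start := generic[0]["ts"].(float64)
	end := generic[1]["ts"].(float64)
	fmt.Println(end - start)
}
```

Both functions call the same json.Unmarshal; only the destination type changes, and that is what decides how much work the decoder has to do per value.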
In order to parse JSON into the structures that you defined, the json package needs to find out the names and types of their fields. To do this, it uses the reflect package, which is much slower than accessing those fields directly.
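The following sketch shows the kind of run-time lookup described above. It is a simplified illustration, not the json package's actual implementation: Event, jsonFields, and setFloat are hypothetical names, and the tag handling ignores options such as omitempty.

```go
package main

import (
	"fmt"
	"reflect"
)

// Event is a hypothetical struct used only for this illustration.
type Event struct {
	Name      string  `json:"name"`
	Timestamp float64 `json:"ts"`
}

// jsonFields maps each field's JSON key to its Go type name, discovered
// at run time through reflection. v must be a struct value. Tag options
// (for example ",omitempty") are not handled in this sketch.
func jsonFields(v interface{}) map[string]string {
	t := reflect.TypeOf(v)
	fields := make(map[string]string)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		key := f.Tag.Get("json")
		if key == "" {
			key = f.Name // fall back to the Go field name
		}
		fields[key] = f.Type.String()
	}
	return fields
}

// setFloat stores value in the named float field of the struct that ptr
// points to, going through reflection instead of a direct assignment.
func setFloat(ptr interface{}, fieldName string, value float64) {
	reflect.ValueOf(ptr).Elem().FieldByName(fieldName).SetFloat(value)
}

func main() {
	fmt.Println(jsonFields(Event{}))

	var e Event
	setFloat(&e, "Timestamp", 0.77) // reflective write
	e.Name = "response"             // direct write, resolved at compile time
	fmt.Println(e.Name, e.Timestamp)
}
```

The reflective write has to look the field up by name and check its kind on every call, while the direct assignment compiles down to a plain store. A decoder filling structs pays some form of that lookup cost for the values it sets.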
Depending on what you do with the data after you parse it, the extra parsing time may pay for itself. The Go data structures use less memory than hash tables, and they are much faster to access. So if you do a lot with the data, the savings on processing time may outweigh the extra parsing time.
This recipe can be found in its original form on Stack Overflow.