File: JensAlfkePerformanceNotes.txt

A couple of emails accompanying the patches from Jens Alfke.


From: Jens Alfke <jens@mooseyard.com>
Date: 13 January 2008 01:01:10 GMT
To: stig@brautaset.org
Subject: Cocoa JSON optimizations

Stig,

Thanks for writing the JSON framework for Cocoa!

> (Note that both our libraries put up a really bad show compared to
> all the Perl modules Marc measured. A bit embarrassing that...)

I took a look at optimizing the generation of JSON. With a couple of
tweaks I made it about 11x faster on the large test string from Yahoo.
The patch (from SVN top-of-tree) is enclosed.

The main problem was the way the code was structured: each level of
recursion returned its result as a string, so large numbers of
temporary NSStrings were generated and appended to each other. A
better pattern is to pass an NSMutableString to each generator method,
which it can append to. That way a single mutable string gets re-used.
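
A minimal sketch of the shared-buffer pattern described above, written
in Python rather than the Objective-C of the patch. The names
append_json and to_json are hypothetical and do not come from the
framework; escaping and most JSON types are left out to keep the
example short.

```python
import io


def append_json(value, out):
    """Append the JSON text for `value` to the shared buffer `out`.

    Nothing is returned: every level of recursion writes into the same
    buffer, so no per-level result strings are built and then copied.
    """
    if isinstance(value, dict):
        out.write("{")
        for i, (key, item) in enumerate(value.items()):
            if i:
                out.write(",")
            out.write('"')
            out.write(key)          # assumes string keys, no escaping
            out.write('":')
            append_json(item, out)
        out.write("}")
    elif isinstance(value, list):
        out.write("[")
        for i, item in enumerate(value):
            if i:
                out.write(",")
            append_json(item, out)
        out.write("]")
    elif isinstance(value, str):
        out.write('"')
        out.write(value)            # escaping omitted in this sketch
        out.write('"')
    else:
        # Assumes an int; other JSON types are omitted here.
        out.write(str(value))


def to_json(value):
    out = io.StringIO()             # the single buffer shared by all levels
    append_json(value, out)
    return out.getvalue()
```

Only the top-level entry point creates a buffer; the recursive calls
just append to it.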

The next bottleneck was the way -[NSString JSONFragmentWithOptions:]
iterates over every character in the string. Getting each character is
slow, and appending the characters one at a time to the output is even
slower. Fortunately this process is only necessary if the string has
characters that need escaping, so I built a static NSCharacterSet of
those characters, and then I test every string to see if it contains
any of them. If not, it can simply be appended as-is.

A lot of strings were still being escaped because they contained "/"
characters. I looked at the JSON RFC (RFC 4627), and it says that only
control characters, double-quotes and backslashes need to be escaped,
so I took out the special case for "/". That helped.
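
The fast path and the reduced escape set from the two paragraphs above
can be sketched as follows. This is an illustration in Python, not the
patched Objective-C: a precompiled regular expression stands in for the
static NSCharacterSet, and the names NEEDS_ESCAPE, append_json_string
and json_string are hypothetical.

```python
import re

# Characters that must be escaped inside a JSON string: control
# characters (U+0000 to U+001F), the double quote and the backslash.
# "/" is deliberately not in the set.
NEEDS_ESCAPE = re.compile(r'[\x00-\x1f"\\]')

SHORT_ESCAPES = {
    '"': '\\"', "\\": "\\\\", "\b": "\\b", "\f": "\\f",
    "\n": "\\n", "\r": "\\r", "\t": "\\t",
}


def escape_char(match):
    """Return the escape sequence for one character that needs it."""
    ch = match.group(0)
    return SHORT_ESCAPES.get(ch, "\\u%04x" % ord(ch))


def append_json_string(text, out):
    """Append `text` as a quoted JSON string to the list `out`."""
    out.append('"')
    if NEEDS_ESCAPE.search(text) is None:
        out.append(text)            # fast path: nothing to escape
    else:
        out.append(NEEDS_ESCAPE.sub(escape_char, text))
    out.append('"')


def json_string(text):
    """Convenience wrapper that returns the quoted string."""
    out = []
    append_json_string(text, out)
    return "".join(out)
```

A URL such as "http://example.com/a" contains no character from the
set, so it takes the fast path and is appended unchanged.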

Finally, I changed the array iterations in the NSArray and
NSDictionary methods to use the new Leopard "for...in" syntax (but
only when compiled for Leopard). That shaved a few percent off the
time, since for...in uses a new, more efficient iteration mechanism.

I didn't benchmark with any of the pretty-printing options turned on.
The code for those could be optimized to reduce the number of string
operations; but on the other hand, any client of the library wanting
maximum performance is probably going to turn off pretty-printing
anyway!

If my enthusiasm persists, I might look at the JSON parsing code too,
but no promises :)

--Jens


From: Jens Alfke <jens@mooseyard.com>
Date: 13 January 2008 06:15:19 GMT
To: stig@brautaset.org
Subject: Re: Cocoa JSON optimizations

Stig,

As promised, I looked at the parser as well. This was trickier to
optimize, but by throwing several tricks at it I got a nice 5x
speedup.

The main problem was just that NSScanner itself is really slow, for
some reason. So part of what I did was just to call -scanString: fewer
times. I wrote a faster -scanJSONChar: method that scans for a single
character, since that's what most of the -scanString: calls were
doing.

I also changed the top-level scanning sequence so that -scanJSONValue
gets the next non-whitespace character and then tests that to see
which type of entity to scan. That avoids a bunch of repeated scans
for a '{', a '[', a '"', etc.
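
A rough sketch of that dispatch, again in Python rather than the
Objective-C of the patch. The Scanner class and its method names are
hypothetical stand-ins for -scanJSONChar: and -scanJSONValue, and only
a subset of JSON (arrays, unescaped strings, integers, true, false,
null) is handled.

```python
WHITESPACE = " \t\r\n"
DIGITS = "0123456789"


class Scanner:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def skip_whitespace(self):
        while self.pos < len(self.text) and self.text[self.pos] in WHITESPACE:
            self.pos += 1

    def scan_char(self, ch):
        """Consume `ch` if it is the next non-whitespace character."""
        self.skip_whitespace()
        if self.pos < len(self.text) and self.text[self.pos] == ch:
            self.pos += 1
            return True
        return False

    def scan_value(self):
        """Look at one character, then branch once on it."""
        self.skip_whitespace()
        if self.pos >= len(self.text):
            raise ValueError("unexpected end of input")
        ch = self.text[self.pos]
        if ch == "[":
            return self.scan_array()
        if ch == '"':
            return self.scan_string()
        if ch == "-" or ch in DIGITS:
            return self.scan_int()
        for word, value in (("true", True), ("false", False), ("null", None)):
            if self.text.startswith(word, self.pos):
                self.pos += len(word)
                return value
        raise ValueError("unexpected character %r at %d" % (ch, self.pos))

    def scan_array(self):
        self.pos += 1               # the "[" already seen by scan_value
        items = []
        if self.scan_char("]"):
            return items
        while True:
            items.append(self.scan_value())
            if self.scan_char("]"):
                return items
            if not self.scan_char(","):
                raise ValueError("expected ',' or ']' at %d" % self.pos)

    def scan_string(self):
        # Escape sequences are not handled in this sketch.
        end = self.text.index('"', self.pos + 1)
        value = self.text[self.pos + 1:end]
        self.pos = end + 1
        return value

    def scan_int(self):
        start = self.pos
        if self.text[self.pos] == "-":
            self.pos += 1
        while self.pos < len(self.text) and self.text[self.pos] in DIGITS:
            self.pos += 1
        return int(self.text[start:self.pos])
```

Each value costs one whitespace skip and one character inspection,
instead of a failed scan for every entity type that does not match.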

I re-ordered some of the logic in -scanJSONObject: to reduce the
number of strings that get scanned for.

I restructured -scanJSONString: to append substrings in chunks instead
of character-by-character. I also used a lower-level string iterator
from CFString that's faster than -characterAtIndex:, as well as another
CFString function to append unichars to a string.
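
The chunked approach can be illustrated like this, in Python and with
a hypothetical function name (scan_json_string); it does not use the
CFString calls mentioned above. Runs of ordinary characters are copied
as one slice, and only escape sequences are handled one at a time.

```python
SIMPLE_ESCAPES = {
    '"': '"', "\\": "\\", "/": "/", "b": "\b", "f": "\f",
    "n": "\n", "r": "\r", "t": "\t",
}


def scan_json_string(text, pos):
    """Parse the JSON string whose opening quote is at text[pos].

    Returns (value, index just past the closing quote). Surrogate
    pairs written as two \\u escapes are not combined in this sketch.
    """
    if pos >= len(text) or text[pos] != '"':
        raise ValueError("expected opening quote at %d" % pos)
    pos += 1
    chunks = []
    chunk_start = pos               # start of the current unescaped run
    while True:
        if pos >= len(text):
            raise ValueError("unterminated string")
        ch = text[pos]
        if ch == '"':
            chunks.append(text[chunk_start:pos])   # flush the final run
            return "".join(chunks), pos + 1
        if ch == "\\":
            chunks.append(text[chunk_start:pos])   # flush the run so far
            if pos + 1 >= len(text):
                raise ValueError("unterminated escape")
            esc = text[pos + 1]
            if esc == "u":
                digits = text[pos + 2:pos + 6]
                if len(digits) != 4:
                    raise ValueError("truncated \\u escape")
                chunks.append(chr(int(digits, 16)))
                pos += 6
            elif esc in SIMPLE_ESCAPES:
                chunks.append(SIMPLE_ESCAPES[esc])
                pos += 2
            else:
                raise ValueError("bad escape \\%s" % esc)
            chunk_start = pos       # the next run starts after the escape
        else:
            pos += 1                # ordinary character: extend the run
```

A string with no escapes results in a single slice being appended,
however long it is.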

This was fun, actually! Kind of like solving a logic puzzle. I kept on
running 'sample' and figuring there must be one more thing I could do
to shave off more time...

I think there's a bit more room, but it would involve rewriting the
code to stop using NSScanner at all. Instead, a couple of fast scan
functions based on CFStringInlineBuffer would do the job. I think this
would also make the code clearer, since I've left it as a mishmash of
NSScanner and direct string access.

The source changes were extensive, so the entire file is probably
clearer than a diff:




--Jens