Hello
This is a note on how I upload JSON documents to CouchDB using WebClient
(http://www.squeaksource.com/WebClient) and a modified version of
http://www.squeaksource.com/JSON (JSON-jrd.28).

The test case is

    | d r |
    d := Dictionary new.
    d at: 'title' put: 'The title of this card'.
    d at: 'body' put: (8820 asCharacter asString, 'aäbc', Character cr).
    r := WriteStream on: String new.
    (JsonObject newFrom: d) jsonWriteOn: r.
    "r contents"
    WebClient httpPut: 'http://192.168.0.121:5984/test/myDoc6'
        content: r contents
        type: ''.

WebClient gives back as contents

    '{"ok":true,"id":"myDoc6","rev":"1-cef58e13534fc0fcf7f38262bc086d12"}
    '

To make this work I had to patch the method

    Json escapeForCharacter: aCharacter

This was necessary because, for example,

    Json escapeForCharacter: 228 asCharacter

gave back '\uE4' instead of '\u00E4'. I changed the method Json
escapeForCharacter: to

    escapeForCharacter: c
        | index nnnn |
        ^ (index := c asciiValue + 1) <= escapeArray size
            ifTrue: [ ^ escapeArray at: index ]
            ifFalse: [
                nnnn := ((c asciiValue bitAnd: 16rFFFF) printStringBase: 16).
                [nnnn size < 4] whileTrue: [nnnn := '0', nnnn].
                ^ '\u', nnnn ]

My question: is there a nicer way of doing

    nnnn := ((c asciiValue bitAnd: 16rFFFF) printStringBase: 16).
    [nnnn size < 4] whileTrue: [nnnn := '0', nnnn].
    ^ '\u', nnnn

Basically I have to pad zeros in front of a ByteString which is too short.

Regards
Hannes

P.S. This note does not address the issue that it would be nice NOT to
escape characters which have a code >127 at all, but rather keep them as
Unicode characters, as the JSON spec allows for this (http://www.json.org/:
any Unicode character except " and \ is allowed). This is about fixing an
error which comes up when posting JSON objects to CouchDB.
|
http://wiki.squeak.org/squeak/512
The first link is to the Databases page, which now lists CouchDB. The second page is Hannes's post made into a swiki page. This is useful stuff, so I thought I'd put it here for reference, where I and others can find it later. If Hannes finds this an appropriation of his post, I can take it down.
Chris
|
In reply to this post by Hannes Hirzel
On Wed, 12 May 2010, Hannes Hirzel wrote:

> My question:
>
> Is there a nicer way of doing
>
>     nnnn := ((c asciiValue bitAnd: 16rFFFF) printStringBase: 16).
>     [nnnn size < 4] whileTrue: [nnnn := '0', nnnn].
>     ^ '\u', nnnn

Yes there is. As I said before, check the JSON package in the SCouchDB
repository (Igor's link from a few days ago), where I fixed this bug. I'm
kinda surprised at your insistence on using buggy/unmaintained JSON code
when you have been told several times there's one that's tested to work
with CouchDB (I use it in production).

rado
|
On 5/12/10, radoslav hodnicak <[hidden email]> wrote:
> Yes there is. As I said before, check the JSON package in the SCouchDB
> repository (Igor's link from a few days ago), where I fixed this bug. [...]
>
> rado

Hello Rado

Yes, your version of the method is nicer:

    escapeForCharacter: c
        | index |
        ^ (index := c asciiValue + 1) <= escapeArray size
            ifTrue: [ ^ escapeArray at: index ]
            "THIS IS WROOONG!!! unicode is not 16bit wide!"
            ifFalse: [ ^ '\u', (((c asciiValue bitAnd: 16rFFFF)
                printStringBase: 16) padded: #left to: 4 with: $0) ]

However your comment leads me to the non-urgent question: how would we
deal with a code point >65536?

Thank you for insisting that I check out your copy of the JSON package
which you maintain in the SCouchDB project. The surprise on my side is
that you went for creating a copy instead of putting your changes into
the JSON project, as it is open for everybody to write. Your copy is
actually pretty hidden, whereas the general JSON package is easy to find.

I went through all the changes you and Igor did in the SCouchDB project
and decided to fold part of them back into the JSON package
http://www.squeaksource.com/JSON. I documented it on the wiki page which
goes along with the JSON project. I copy it in below.***

So the updated test case for working with WebClient, JSON and CouchDB is
the following.

    | json couchDBurl |
    json := JsonObject new.
    json title: 'The title of my note card'.
    json body: 'The body test text of my note card with some Unicode test characters ',
        (8450 asCharacter asString, 'ä.', Character cr).

    "Note: JsonObject behaves like a JavaScript object insofar as you can
    add properties to instances without the necessity that they have been
    declared as instance variables. But you might just as well use
    JsonObject like a Dictionary instead, as it is a subclass of
    Dictionary."

    "create couchDB instance"
    couchDBurl := 'http://localhost:5984/notes'.
    WebClient httpPut: couchDBurl
        content: ''
        type: 'text/plain'.

    "Store first document"
    WebClient httpPut: couchDBurl, '/myNote1'
        content: json asJsonString
        type: 'text/plain'.

    "You get the document back with"
    WebClient httpGet: couchDBurl, '/myNote1'.

So far so good. This solution however still escapes code points >127.
See a note on this below, and more on this in an upcoming post.

Regards

Hannes

----------------------------------------------------------------------

***
JSON-hjh.32

Author: Hannes Hirzel
Ancestors: JSON-rh.31

In the project SCouchDB a copy of JSON is maintained by Igor Stasenko and
Radoslav Hodnicak. This merges part of the changes back, in particular
from the SCouchDB project:

* JSON-Igor.Stasenko.28
* JSON-Igor.Stasenko.29
* JSON-rh.30
* JSON-rh.31

Main changes

1. JsonObject is now a subclass of Dictionary instead of Object, so there
   is no need to implement the Dictionary interface.
2. Fix for converting Unicode characters to \uNNNN format (missing
   padding to 4 characters).

No further changes. The SCouchDB project contains more changes in its
copy of the JSON package.

I did not go further in merging because in SCouchDB / JSON-rh.32 Radoslav
Hodnicak introduces an instance variable 'converter', which is
initialized to

    converter := UTF8TextConverter new

Igor Stasenko, Levente Uzonyi and Hannes Hirzel agreed that the UTF-8
conversion does not belong in the JSON package:
http://lists.squeakfoundation.org/pipermail/squeak-dev/2010-May/150497.html

Levente Uzonyi:

  You only need to convert the characters to UTF-8 because you're sending
  them over the network to a server, and Unicode characters have to be
  converted to bytes somehow. So the JSON printer shouldn't do any
  conversion by default except for escaping. The only problem is that
  escaping is not done as the spec requires it, but that's easy to fix.

http://www.json.org/

  A string is a collection of zero or more Unicode characters, wrapped in
  double quotes, using backslash escapes. A character is represented as a
  single character string. A string is very much like a C or Java string.

About escaping Unicode characters

Actually escaping Unicode characters to \uNNNN is not necessary for
characters with codes >127 in the case of an upload to CouchDB. But this
version does it. In case you want to patch this, change the method

    Json class >> escapeForCharacter: c
|
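[Editorial sketch.] The thread's agreement that UTF-8 conversion belongs at the transport layer, not in the JSON printer, can be illustrated with a small doit. This is a hedged sketch, not code from the thread: it assumes Squeak's String>>squeakToUtf8 convenience method (Multilingual package) is available, and reuses the URL and document name from the example above.

```smalltalk
"Hedged sketch (not from the thread): encode to UTF-8 once, at the edge,
just before handing the body to WebClient. The JSON printer itself only
escapes; it does no byte-level conversion. Assumes String>>squeakToUtf8."
| json couchDBurl utf8Body |
json := JsonObject new.
json title: 'The title of my note card'.
json body: 'Some Unicode test characters: ', 8450 asCharacter asString.
couchDBurl := 'http://localhost:5984/notes'.
utf8Body := json asJsonString squeakToUtf8.  "transport-layer conversion"
WebClient httpPut: couchDBurl, '/myNote1'
    content: utf8Body
    type: 'application/json; charset=utf-8'.
```

With this division of labour the printer stays encoding-agnostic, and a different transport (file, socket, HTTPS) can pick a different byte encoding without touching the JSON package.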
Hannes, if you would be so kind, please merge your changes
with the SCouchDB version of JSON and save them in the SCouchDB
repository. I will take the time to review them and fully integrate & fix
all of the issues you mentioned, including proper UTF-8 output encoding.

On 13 May 2010 03:16, Hannes Hirzel <[hidden email]> wrote:
> Hello Rado
>
> Yes, your version of the method is nicer [...]
> [rest of the message quoted in full; snipped]

--
Best regards,
Igor Stasenko AKA sig.
|
In reply to this post by Hannes Hirzel
My comment on
SqS/JSON/JSON-hjh.32

Guys, can you give me any idea why you replaced back the

    stream peek / stream next

by

    self peek / self next

removed all uses of #peekFor:, and added:

    next
        ^ self stream next

    peek
        ^ self stream peek

You seem to be little concerned with speed?
|
In reply to this post by Igor Stasenko
On 5/13/10, Igor Stasenko <[hidden email]> wrote:
> Hannes, if you would be so kind, please merge your changes with the
> SCouchDB version of JSON and save them in SCouchDB repository.
>
> I will take a time to review them and fully integrate & fix all of the
> issues you mentioned, including a proper utf-8 output encoding.

Igor, I think you misunderstood me. I took your changes

SCouchDB project

* JSON-Igor.Stasenko.28
* JSON-Igor.Stasenko.29
* JSON-rh.30
* JSON-rh.31

and folded them back into the SqueakSource/JSON project.

--Hannes
|
On 13 May 2010 03:49, Hannes Hirzel <[hidden email]> wrote:
> Igor, I think you misunderstood me. I took your changes
> [...]
> and folded them back into the SqueakSource/JSON project

I will merge your fixes then with my package, but I won't switch to the
JSON repository, since I think that reverting 'stream peek' to 'self
peek' is a bad idea.

--
Best regards,
Igor Stasenko AKA sig.
|
In reply to this post by Igor Stasenko
On 5/13/10, Igor Stasenko <[hidden email]> wrote:
> My comment on SqS/JSON/JSON-hjh.32
>
> guys, can you give me any idea why you replaced back the
> stream peek / stream next by self peek / self next,
> removed all uses of #peekFor: [...]
>
> You seem to be little concerned with speed?

The reason is that this change is in the following version:

    Name: JSON-Igor.Stasenko.34
    Author: Igor.Stasenko
    Time: 7 April 2010, 1:58:24.739 am
    UUID: 4a92f912-177d-5941-9a4f-a773cb11f659
    Ancestors: JSON-Igor.Stasenko.33

and I did not include this version 34 yet in
http://www.squeaksource.com/JSON. Thank you for pointing this out.

Yes, I realized that a local upload of 10000 records, resulting in a 7MB
compacted CouchDB, was a bit slow. I did not measure it though. My
estimate is that it took 50 seconds. And maybe this is not slow - I
don't know.

Hannes

P.S. Regarding UTF-8, please read my post carefully. It should not be in
the JSON package. Currently all the code values > 128 are escaped, so
there is no need for it in the case of storing documents. However I do
not know yet how an elegant interface for getting it back properly
should look.
|
In reply to this post by Hannes Hirzel
On Thu, 13 May 2010, Hannes Hirzel wrote:
> Yes, your version of the method is nicer:
>
>     escapeForCharacter: c
>         | index |
>         ^ (index := c asciiValue + 1) <= escapeArray size
>             ifTrue: [ ^ escapeArray at: index ]
>             "THIS IS WROOONG!!! unicode is not 16bit wide!"
>             ifFalse: [ ^ '\u', (((c asciiValue bitAnd: 16rFFFF)
>                 printStringBase: 16) padded: #left to: 4 with: $0) ]
>
> However your comment leads me to the non-urgent question: How would we
> deal with a code point >65536?

Noone has to deal with those, since all characters that must be escaped
fit into 16 bits (you can find the escaping rule in RFC 4627 if you're
interested). So this implementation is wrong, because it's trying to
escape everything whose asciiValue is greater than 127, and will fail for
values greater than 65535. This escaping is totally unnecessary, it just
gives a (not so) nice slowdown.

From RFC 4627:
"
... All Unicode characters may be placed within the quotation marks
except for the characters that must be escaped: quotation mark, reverse
solidus, and the control characters (U+0000 through U+001F).

Any character may be escaped. ...
"

So the best to do is: escape only $\ $" and the characters from 0 to 31.

Levente

> [rest of the quoted message snipped]
|
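[Editorial sketch.] For completeness: RFC 4627 does define how a code point above U+FFFF is escaped when one chooses to escape it, namely as the twelve-character \uXXXX\uXXXX encoding of its UTF-16 surrogate pair. A hedged sketch in the style of the methods above, reusing the padded:#left to:with: idiom from the thread; the selector name is made up for illustration and is not part of the JSON package:

```smalltalk
"Hedged sketch: escaping a code point > 16rFFFF as a UTF-16 surrogate
pair, per RFC 4627. Hypothetical selector, for illustration only.
E.g. U+1D11E would come out as '\uD834\uDD1E'."
escapeSupplementaryCharacter: c
    | cp hi lo |
    cp := c asciiValue - 16r10000.           "offset into the supplementary planes"
    hi := 16rD800 + (cp bitShift: -10).      "high surrogate: top 10 bits"
    lo := 16rDC00 + (cp bitAnd: 16r3FF).     "low surrogate: bottom 10 bits"
    ^ '\u', ((hi printStringBase: 16) padded: #left to: 4 with: $0),
      '\u', ((lo printStringBase: 16) padded: #left to: 4 with: $0)
```

As Levente notes, none of this is required for output: such characters may simply be written through unescaped.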
In reply to this post by Hannes Hirzel
On 13 May 2010 04:02, Hannes Hirzel <[hidden email]> wrote:
> The reason is that this change is in the following version:
>
>     Name: JSON-Igor.Stasenko.34
>     [...]
>
> And I did not include this version 34 yet in
> http://www.squeaksource.com/JSON.
>
> Thank you for pointing this out. Yes, I realized that a local upload
> of 10000 records, resulting in a 7MB compacted CouchDB, was a bit
> slow. I did not measure it though. My estimate is that it took 50
> seconds. And maybe this is not slow - I don't know.

I measured, for what it's worth: on a given json, a stream peek version
gives about 5% higher parsing speed:

    | json |
    json := JsonObject new
        title: 'The title of my note card';
        body: 'The body test text of my note card with some Unicode test characters ';
        foo: 10;
        bar: #( 10 'twenty' #thirty );
        bar1: #( 10 'twenty' #thirty );
        bar2: #( 10 'twenty' #thirty );
        bar3: #( 10 'twenty' #thirty );
        bar4: #( 10 'twenty' #thirty );
        bar5: #( 10 20 30 22 23 24 56 34 36 34 3 634 346 'twenty' #thirty );
        asJsonString.

    [ 1000 timesRepeat: [ Json readFrom: json readStream ] ] timeToRun

with self peek/next:
    1500
    1485

with stream peek/next:
    1431
    1415

Also, I found that once you put more data in it, you get more of a
difference:

    | json |
    json := JsonObject new
        bar: ((1 to: 1000) collect: [:i |
            i odd ifTrue: [i] ifFalse: [ 'x', i asString ] ]);
        asJsonString.

    [ 100 timesRepeat: [ Json readFrom: json readStream ] ] timeToRun

self peek/next:   3538
stream peek/next: 3294

so now it's 7%.

--
Best regards,
Igor Stasenko AKA sig.
|
In reply to this post by Levente Uzonyi-2
2010/5/13 Levente Uzonyi <[hidden email]>:
> Noone has to deal with those, since all characters that must be
> escaped fit into 16 bits (you can find the escaping rule in RFC 4627
> if you're interested). So this implementation is wrong, because it's
> trying to escape everything whose asciiValue is greater than 127, and
> will fail for values greater than 65535. This escaping is totally
> unnecessary, it just gives a (not so) nice slowdown.
>
> From RFC 4627:
> "
> ... All Unicode characters may be placed within the quotation marks
> except for the characters that must be escaped: quotation mark,
> reverse solidus, and the control characters (U+0000 through U+001F).
>
> Any character may be escaped. ...
> "
>
> So the best to do is: escape only $\ $" and the characters from 0 to 31.

So it could be simply:

    escapeForCharacter: c
        | index |
        ^ (index := c asciiValue + 1) <= escapeArray size
            ifTrue: [ ^ escapeArray at: index ]
            ifFalse: [ c ]

--
Best regards,
Igor Stasenko AKA sig.
|
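[Editorial sketch.] The table-lookup approach discussed here only works if escapeArray follows Levente's rule: real escape strings for $" $\ and codes 0-31, and the character itself everywhere else, so that the lookup never answers nil. The thread does not show the package's actual table initialization; a hedged sketch of what such a table could look like:

```smalltalk
"Hedged sketch (not the package's actual initialization code): an
escape table covering only what RFC 4627 requires. Indices are
asciiValue + 1, matching the lookup in escapeForCharacter:."
| escapeArray |
escapeArray := Array new: 128.
"control characters U+0000..U+001F get the generic \uNNNN form"
0 to: 31 do: [:code |
    escapeArray at: code + 1
        put: '\u', ((code printStringBase: 16) padded: #left to: 4 with: $0)].
"printable ASCII passes through unchanged"
32 to: 127 do: [:code |
    escapeArray at: code + 1 put: (Character value: code) asString].
"shorthand escapes from the JSON grammar override the generic forms"
escapeArray at: 8  + 1 put: '\b'.   "backspace"
escapeArray at: 9  + 1 put: '\t'.   "tab"
escapeArray at: 10 + 1 put: '\n'.   "linefeed"
escapeArray at: 12 + 1 put: '\f'.   "form feed"
escapeArray at: 13 + 1 put: '\r'.   "carriage return"
escapeArray at: $" asciiValue + 1 put: '\"'.
escapeArray at: $\ asciiValue + 1 put: '\\'.
```

Characters above asciiValue 127 then fall through the `<= escapeArray size` guard and are answered as themselves, exactly as Igor's minimal method does.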
In reply to this post by Igor Stasenko
Igor,
I measured it as well, see below.

On 5/13/10, Igor Stasenko <[hidden email]> wrote:
> I measured, for what it's worth: on a given json, a stream peek
> version gives about 5% higher parsing speed [...]
>
> so now it's 7%

I did a test with actually uploading data to a CouchDB:

    | json couchDBurl |
    json := JsonObject new.
    json title: 'The title of my note card'.
    json body: 'The body test text of my note card with some Unicode test characters ',
        (8450 asCharacter asString, 'ä.', Character cr).
    json myTestArray: ((1 to: 1000) collect: [:i |
        i odd ifTrue: [i] ifFalse: [ 'x', i asString ] ]).

    "Note: JsonObject behaves like a JavaScript object insofar as you can
    add properties to instances without the necessity that they have been
    declared as instance variables. But you might just as well use
    JsonObject like a Dictionary instead, as it is a subclass of
    Dictionary."

    "create couchDB instance"
    couchDBurl := 'http://localhost:5984/notes'.
    WebClient httpPut: couchDBurl
        content: ''
        type: 'text/plain'.

    "Store the documents"
    [1 to: 1000 do: [ :i |
        WebClient httpPut: couchDBurl, '/myNote', i printString
            content: json asJsonString
            type: 'text/plain']] timeToRun printString.

With the speedup:
    8979
    10854
(I measured it two times)

Without the speedup:
    13752

So it is worth going for it. I committed
http://www.squeaksource.com/JSON/JSON-hjh.34.mcz
It contains stream peek instead of self peek.

Regards
Hannes
|
I tried to compare the speed of the two backends (SCouchDB and WebClient)
to find a winner... but unfortunately WebClient stops with an error,
while mine works ok. Here is the doit (maybe Andreas could say something
about it):

    --------------
    | json couchDBurl |
    json := JsonObject new.
    json title: 'The title of my note card'.
    json body: 'The body test text of my note card with some Unicode test characters ',
        (8450 asCharacter asString, 'ä.', Character cr).
    json myTestArray: ((1 to: 1000) collect: [:i |
        i odd ifTrue: [i] ifFalse: [ 'x', i asString ] ]).

    "create couchDB instance"
    couchDBurl := 'http://192.168.0.11:5984/foo'.
    WebClient httpPut: couchDBurl
        content: ''
        type: 'text/plain'.

    "Store the documents"
    [1 to: 1000 do: [ :i |
        WebClient httpPut: couchDBurl, '/myNote', i printString
            content: json asJsonString
            type: 'text/plain']] timeToRun printString.
    -------------------

I thought that maybe it's because of some recent updates to SocketStream
which Andreas mentioned. I updated the image to a recent trunk (to 10143
now), but still I get a walkback when WebResponse is trying to read a
response header in #readFrom: aStream

    status := stream upToAll: String crlf.

and got status = ''. I'm running it on windoze.

And here is the doit which works:

    ------------
    | json db time |
    json := JsonObject new.
    json title: 'The title of my note card'.
    json body: 'The body test text of my note card with some Unicode test characters ',
        (8450 asCharacter asString, 'ä.', Character cr).
    json myTestArray: ((1 to: 1000) collect: [:i |
        i odd ifTrue: [i] ifFalse: [ 'x', i asString ] ]).

    db := SCouchDBAdaptor new
        host: '192.168.0.11';
        ensureDatabase: 'foo'.

    time := [1 to: 1000 do: [ :i |
        db documentAt: 'myNote', i printString put: json ]]
            timeToRun printString.

    db adaptor deleteDatabase: 'foo'.
    time
    ---------------

--
Best regards,
Igor Stasenko AKA sig.
|