On 5/10/2010 5:25 AM, Hannes Hirzel wrote:
> I have a similar problem, I want to post a document to a couchDB
> instance. Without German umlauts it works fine but if I add them I get
> an error message.
>
> host := 'http://localhost:5984'.
> r := WriteStream on: (String new: 1000).
> d := Dictionary new. d at: 'content' put: 'äs' squeakToUtf8.
> (JsonObject newFrom: d) jsonWriteOn: r.
> WebClient httpPut: host, '/notes/test5' content: r contents type:
> 'text/plain;charset=utf-8'.
>
> The error message of the WebResponse is
> '{"error":"bad_request","reason":"invalid UTF-8 JSON"}
> '
>
> The same thing happens if I omit #squeakToUtf8
>
> If I inspect
> r contents
> I get
> '{"content":"\uC3\uA4s"}'

That's interesting. If this happens when you *omit* squeakToUtf8 then
it's already being encoded:

   'äs' squeakToUtf8 asByteArray hex => 'c3a473'

So 16rC3 16rA4 are the correct values for utf8-encoded 'ä'.

> How is this conversion supposed to be done?

Basically yes, but only if your input isn't already encoded. Whether
that's the case I don't know, you might check with the JSON library author.

Cheers,
   - Andreas
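The byte values Andreas mentions can be checked outside Squeak. The following is a small Python illustration (not Squeak code; the variable names are made up for this sketch) of what encoding 'äs' once gives, and what happens if text that is already UTF-8 is encoded a second time:

```python
# Illustration only: Python stand-ins for the Squeak expressions in the thread.
text = "äs"

# Encoding once yields the bytes quoted above: C3 A4 for 'ä', 73 for 's'.
encoded_once = text.encode("utf-8")
print(encoded_once.hex())  # c3a473

# Treating those bytes as Latin-1 characters and encoding them again
# (the "input is already encoded" case) doubles up every non-ASCII byte.
encoded_twice = encoded_once.decode("latin-1").encode("utf-8")
print(encoded_twice.hex())  # c383c2a473
```

If a payload shows C3 83 C2 A4 where C3 A4 was expected, a second encoding step somewhere in the pipeline is a likely suspect.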
In reply to this post by Hannes Hirzel
On Mon, 10 May 2010, Hannes Hirzel wrote:
> On 5/10/10, Levente Uzonyi <[hidden email]> wrote:
>> On Mon, 10 May 2010, Hannes Hirzel wrote:
>>
>>> Unfortunately UTF8TextConverter cannot deal with non-Latin1
>>> characters. So it's usefulness is limited.
>>
>> UTF8TextConverter can deal with non-latin1 characters. I
>> think you're trying to pass a WideString to #encodeByteString: which
>> obviously doesn't work.
>>
>>
>> Levente
>>
>
> Yes I am passing aWideString to
> #encodeByteString:
>
> as this is the only conversion method UTF8TextConverter.
>
> And you're right I should pass a ByteString.
>
> However as the case
> ('ä', 8220 asCharacter asString) asByteString "A"
> shows in comparison to
> ('ä', 65 asCharacter asString) asByteString "B"
>
> I get only in case "B" a ByteString, in case "A" it remains a WideString.
>
> So the question is: How do I convert a WideString to UTF8 as
> UTF8TextConverter is limited to code points from 0...255 and I want
> the full Unicode range?

'äbc' squeakToUtf8.
'äbc' convertToEncoding: 'utf-8'.
'äbc' convertToWithConverter: UTF8TextConverter new.
UTF8TextConverter new encodeString: 'äbc'.


Levente

> Or put the question otherwise: Is there a textconverter which
> implements the following algorithm
> http://dsc.sun.com/dev/gadc/technicalpublications/articles/utf8.html
>
> -Hannes
In reply to this post by Hannes Hirzel
Which JSON package/version are you using? I fixed a bug in the one
distributed with SCouchDB few weeks ago, where it didn't encode utf8
characters properly - the correct escaped form is \uNNNN - always padded
to 4 Ns. that's why you get that warning, yours is only 2-3

rado

On Mon, 10 May 2010, Hannes Hirzel wrote:

> The test case made simpler
>
> WebClient httpPut: host, '/notes/test7' content:
> '{"content":"\uC3\uA4s"}' type: 'text/plain'.
>
> gives back as answer: '{"error":"bad_request","reason":"invalid UTF-8 JSON"}
> '
>
> whereas
>
> WebClient httpPut: host, '/notes/test8' content: '{"content":"abc"}'
> type: 'text/plain'.
>
> gives back
> '{"ok":true,"id":"test8","rev":"1-f40e52919735ae6775af3d388361b3da"}
> '
>
> --Hannes
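To make the padding rule concrete, here is a minimal Python sketch (not the SCouchDB fix itself; escape_non_ascii is a hypothetical helper written for this illustration, and it deliberately ignores quotes, backslashes and control characters). It escapes code points, not UTF-8 bytes, always with four hex digits, and shows that a strict JSON parser rejects the two-digit form from the test case:

```python
import json


def escape_non_ascii(text: str) -> str:
    """Hypothetical helper: escape non-ASCII code points as \\uNNNN (4 hex digits)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)
        else:
            # Code points beyond the BMP are written as a UTF-16 surrogate pair.
            cp -= 0x10000
            out.append("\\u%04x\\u%04x" % (0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)))
    return "".join(out)


body = '{"content":"%s"}' % escape_non_ascii("äs")
print(body)  # {"content":"\u00e4s"}
assert json.loads(body) == {"content": "äs"}

# The body from the failing test case uses two-digit escapes of UTF-8 bytes.
try:
    json.loads('{"content":"\\uC3\\uA4s"}')
except json.JSONDecodeError as err:
    print("rejected:", err.msg)
```

Note that 'ä' is the single code point U+00E4, so the escaped form is \u00e4, not an escape per UTF-8 byte.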
In reply to this post by Andreas.Raab
2010/5/10 Andreas Raab <[hidden email]>:
> Thanks for testing. The issue should be fixed in the repository in
> WebClient-Core-ar.18.

Thanks for looking into it. The issue seems to be fixed with
WebClient-Core-ar.18

I'm sure this is all due to some mistake on my part, but with an updated
plain 4.1 image I have no trouble using tinyproxy (on my linux
machine) or squid (corporate) as a proxy server. With WebClient
installed, only tinyproxy seems to work, whereas squid gives me a bad
request error. I can't look into the squid log files, but can compare
the log of tinyproxy for a 4.1 image and one image that contains
WebClient:

With 4.1 image:
CONNECT   May 11 11:15:39 [1167]: Request (file descriptor 6): GET http://source.squeak.org/trunk/?C=M;O=D HTTP/1.0

With image + WebClient:
CONNECT   May 11 11:16:25 [1171]: Request (file descriptor 6): GET /trunk/?C=M;O=D HTTP/1.1
INFO      May 11 11:16:25 [1171]: process_request: trans Host GET http://source.squeak.org:80/trunk/?C=M;O=D for 6

So maybe tinyproxy is doing the right thing and the squid proxy is
misconfigured/broken. But I think it would be great if WebClient could
cope with such environments as well as the inferior HTTPSocket does.

Alex
> So maybe tinyproxy is doing the right thing and the squid proxy is
> misconfigured/broken. But I think it would be great if WebClient could
> cope with such environments as well as the inferior HTTPSocket does.

I guess the difference comes from tinyproxy's ability to behave like a
transparent proxy and squid playing by the rules, if I read section 5.1.2
of RFC 2616 on HTTP/1.1 correctly:

"The absoluteURI form is REQUIRED when the request is being made to a proxy."

Alex
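The difference between the two request lines in the tinyproxy log can be sketched as follows. This is a Python illustration under stated assumptions, not WebClient's implementation; request_line and its via_proxy parameter are names invented for the example:

```python
from urllib.parse import urlsplit


def request_line(url: str, via_proxy: bool, method: str = "GET") -> str:
    """Hypothetical helper: build the first line of an HTTP/1.1 request."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # RFC 2616, section 5.1.2: the absoluteURI form is required when
    # talking to a proxy; an origin server gets just the path and query.
    target = f"{parts.scheme}://{parts.netloc}{path}" if via_proxy else path
    return f"{method} {target} HTTP/1.1"


url = "http://source.squeak.org/trunk/?C=M;O=D"
print(request_line(url, via_proxy=False))  # GET /trunk/?C=M;O=D HTTP/1.1
print(request_line(url, via_proxy=True))   # GET http://source.squeak.org/trunk/?C=M;O=D HTTP/1.1
```

The first form matches what the WebClient image sent in the log above; the second is the form a proxy that follows the RFC strictly would expect.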
In reply to this post by radoslav hodnicak
On 5/10/10, radoslav hodnicak <[hidden email]> wrote:
> Which JSON package/version are you using? I fixed a bug in the one
> distributed with SCouchDB few weeks ago, where it didn't encode utf8
> characters properly - the correct escaped form is \uNNNN - always padded
> to 4 Ns. that's why you get that warning, yours is only 2-3
>
> rado

I have been using
http://www.squeaksource.com/JSON (over 7000 downloads)
in combination with WebClient.

Thank you Rado, I found
http://www.squeaksource.com/SCouchDB/SCouchDB-Core-rh.8.mcz
and will have a look at it.
(Your comment: added handling of utf8 encoded input data - this is
necessary for couchdb-lucene which sends results directly in utf8 and
not \uNNNN encoded)

--Hannes
In reply to this post by Andreas.Raab
Hi, Andreas,
just one feature request: support streaming in WebClient,
i.e. apart from being able to fetch the response content in a single blob,
also provide an API which would enable a WebClient user to read content
step by step, by using the stream protocol, i.e. #next, #next: etc.

I see you have #streamFrom:to:size:progress:
in WebMessage, but it's a little bit too high level (requires an output
stream and includes a progress block)
and its end consumer is the #content message.

It would be nice to have something like:

response := WebClient httpGet: 'http://foo.bar'.

content := response content. "read all at once".

contentStream := response contentStream. "read content using a stream"
[contentStream atEnd] whileFalse: [
	c := contentStream next. ....
].

There are some special uses of the HTTP protocol which establish a
permanent socket connection
and then send content in small portions, piece by piece, over time.
Obviously, with such a connection, if you try to get all content
at once (by using #content) you'll never get to the end, because
there's always more to read, if you wait long enough.
But if you allow streaming, then the user can read content in portions,
handle them step by step, and doesn't have to wait until all
content arrives.

-- 
Best regards,
Igor Stasenko AKA sig.
On 11 May 2010 17:52, Igor Stasenko <[hidden email]> wrote:
> It would be nice to have something like:
>
> response := WebClient httpGet: 'http://foo.bar'.
>
> content := response content. "read all at once".

The line above fetches the content in a single blob, while the following
lines show how to fetch content using a stream. These two ways,
obviously, should not be used simultaneously.

> contentStream := response contentStream. "read content using a stream"
> [contentStream atEnd] whileFalse: [
>         c := contentStream next. ....
> ].

-- 
Best regards,
Igor Stasenko AKA sig.
It would also be nice to have support for the other methods specified
in the HTTP 1.1 protocol:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html

Then I would be able to use WebClient as a backend in the SCouchDB project,
which currently uses its own subset of HTTP.

-- 
Best regards,
Igor Stasenko AKA sig.
In reply to this post by Hannes Hirzel
On 11 May 2010 17:44, Hannes Hirzel <[hidden email]> wrote:
> Thank you Rado, I found
> http://www.squeaksource.com/SCouchDB/SCouchDB-Core-rh.8.mcz
> and will have a look at it.
> (Your comment: added handling of utf8 encoded input data - this is
> necessary for couchdb-lucene which sends results directly in utf8 and
> not \uNNNN encoded)

SCouchDB uses a forked version of the JSON package, which you can find in
the SCouchDB repository:
http://www.squeaksource.com/SCouchDB/JSON-Igor.Stasenko.34.mcz

If you are looking for that method, it can be found in Json>>unescapeUnicode

-- 
Best regards,
Igor Stasenko AKA sig.
In reply to this post by laza
On 5/11/2010 3:04 AM, Alexander Lazarević wrote:
>> So maybe tinyproxy is doing the right thing and the squid proxy is mis
>> configured/broken. But I think it would be great if WebClient could
>> cope with such environments as well as the inferior HTTPSocket does.
>
> I guess the difference comes from tinyproxys ability to behave like a
> transparent proxy and squid playing by the rules, when I do read 5.1.2
> of RFC2616 on HTTP/1.1 correctly:
>
> "The absoluteURI form is REQUIRED when the request is being made to a proxy."

Oh, wow. There's never a shortage on HTTP surprises. I always thought
that the Host header is sufficient (and it probably is :-) but never
mind that. In any case, can you test the latest update to WebClient?

BTW, thank you *so much* for testing this with Squid. Proxies are weird,
finding people who can give this all a good workout is really helpful!

Cheers,
   - Andreas
In reply to this post by Igor Stasenko
On 5/11/2010 7:52 AM, Igor Stasenko wrote:
> just one feature request: support streaming in WebClient,
> i.e. apart from being able to fetch a response content in a single blob,
> also provide an API, which would enable WebClient user to read content
> step by step, by using stream protocol, i.e. #next, #next: etc.
>
> I see you having a #streamFrom:to:size:progress:
> in WebMessage, but its a little bit too high level (requires an output
> stream and includes a progress block)
> and its end consumer is a #content message.
>
> It would be nice to have something like:
>
> response := WebClient httpGet: 'http://foo.bar'.
>
> content := response content. "read all at once".

That works out of the box.

> contentStream := response contentStream. "read content using a stream"
> [contentStream atEnd] whileFalse: [
>         c := contentStream next. ....
> ].

And that works out of the box, too :-) With a small modification. The
convenience APIs on the class side close the client and prefetch the
response. In other words, in order to stream you need to write, e.g,

client := WebClient new.
[resp := client httpGet:'http://www.squeak.org'.
 length := resp contentLength.
 stream := resp contentStream.
 [length > 0] whileTrue:[
	chunk := stream next: (length min: 100).
	length := length - chunk size. "count down what is left to read"
 ]] ensure:[stream close].

A couple of things to keep in mind though: The above doesn't deal with
the server returning an HTTP/1.0 response (no content-length) or other
specialties like chunked-encoding content transfer. That's why the
high-level functions are advantageous because they deal with all of that.

Cheers,
   - Andreas
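For readers who want to try the idea outside Squeak: reading a body of known length in pieces of at most 100 bytes could look roughly like this in Python. It is a sketch only; read_in_pieces is a made-up name, and an in-memory stream stands in for the socket. As in Andreas' caveat, it assumes the length is known up front and does not handle chunked transfer encoding.

```python
import io


def read_in_pieces(stream, content_length, piece_size=100):
    """Hypothetical helper: yield the body in pieces of at most piece_size bytes."""
    remaining = content_length
    while remaining > 0:
        piece = stream.read(min(remaining, piece_size))
        if not piece:
            break  # the stream ended before content_length bytes arrived
        remaining -= len(piece)
        yield piece


body = io.BytesIO(b"x" * 250)  # stand-in for a response content stream
print([len(piece) for piece in read_in_pieces(body, 250)])  # [100, 100, 50]
```

The important detail is that the remaining count shrinks on every iteration, so the loop ends once the announced length has been consumed.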
In reply to this post by Igor Stasenko
On 5/11/2010 8:09 AM, Igor Stasenko wrote:
> It would be also nice to have support of other methods, as specified
> in HTTP 1.1 protocol:
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
>
> Then i would be able to use WebClient as a backend in SCouchDB project,
> which currently using own subset of HTTP.

What other methods do you need? There should be no problem adding any I
just had no need for them initially.

Cheers,
   - Andreas
In reply to this post by Andreas.Raab
On 11 May 2010 19:21, Andreas Raab <[hidden email]> wrote:
> And that works out of the box, too :-) With a small modification. The
> convenience APIs on the class side close the client and prefetch the
> response. In other words, in order to stream you need to write, e.g,
>
> client := WebClient new.
> [resp := client httpGet:'http://www.squeak.org'.
>  length := resp contentLength.
>  stream := resp contentStream.
>  [length > 0] whileTrue:[
>	chunk := stream next: (length min: 100).
>	length := length - chunk size. "count down what is left to read"
>  ]] ensure:[stream close].

Good. It's quite similar to what I did in SCouchDB.

But I'd like to note that content length is not a mandatory field, and
your example won't work for content of undetermined length.

> A couple of things to keep in mind though: The above doesn't deal with the
> server returning an HTTP/1.0 response (no content-length) or other
> specialties like chunked-encoding content transfer. That's why the
> high-level functions are advantageous because they deal with all of that.

In SCouchDB I deal with chunked-encoded content by wrapping the
original stream with a chunked stream
which handles this transparently, so the end user doesn't have to
deal with it.
Then, similarly, for utf-8 content, I just wrap the source
stream with my own utf8-decoder stream,
so the content consumer doesn't have to care about it;
all it needs to know is how to use the stream protocol.

So stream wrappers are a powerful abstraction. Alas, the current
Stream package lacks good support for this kind of 'wrapper stream'
abstraction. :(

-- 
Best regards,
Igor Stasenko AKA sig.
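The wrapper idea Igor describes can be illustrated with a short Python sketch. This is not SCouchDB's code; iter_chunks is a name made up here, and the sample bytes are invented. The consumer just iterates over payload pieces and never sees the chunk framing of HTTP/1.1 chunked transfer encoding:

```python
import io


def iter_chunks(stream):
    """Hypothetical wrapper: yield the payload of each chunk in a chunked body."""
    while True:
        size_line = stream.readline()
        if not size_line:
            return  # stream ended without a terminating zero-size chunk
        # The chunk size is hexadecimal, optionally followed by ";extensions".
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            # Skip optional trailer lines up to the final blank line.
            while stream.readline() not in (b"\r\n", b"\n", b""):
                pass
            return
        yield stream.read(size)
        stream.readline()  # consume the CRLF that follows the chunk data


raw = io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
print(b"".join(iter_chunks(raw)))  # b'Wikipedia'
```

A UTF-8 decoding wrapper could be layered on top in the same way, so that each layer only has to understand the plain stream protocol of the layer below it.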
On 5/11/2010 10:10 AM, Igor Stasenko wrote:
> On 11 May 2010 19:21, Andreas Raab<[hidden email]> wrote:
>> And that works out of the box, too :-) With a small modification. The
>> convenience APIs on the class side close the client and prefetch the
>> response. In other words, in order to stream you need to write, e.g,
>>
>> client := WebClient new.
>> [resp := client httpGet:'http://www.squeak.org'.
>>  length := resp contentLength.
>>  stream := resp contentStream.
>>  [length > 0] whileTrue:[
>>	chunk := stream next: (length min: 100).
>>	length := length - chunk size. "count down what is left to read"
>>  ]] ensure:[stream close].
>>
> good. Its quite similar to what i did in SCouchDB.
>
> But i'd like to note, that content length is not mandatory field, and
> your example won't work for a content with undetermined length.

Of course. It's an example after all. That's why I said that if you need
fully featured stream support you should be using the high-level
functions provided by WebClient which deal with these issues.

Cheers,
   - Andreas
In reply to this post by Igor Stasenko
1) UTF8 conversion
2) Change to JSON package of Tony Garnock-Jones
3) My updated Test case
4) Conclusion


1) UTF8 conversion

My question was:
How do I convert a WideString to UTF8?


Levente answered:

There are various possibilities:
'äbc' squeakToUtf8.
'äbc' convertToEncoding: 'utf-8'.
'äbc' convertToWithConverter: UTF8TextConverter new.
UTF8TextConverter new encodeString: 'äbc'.



2) Change to JSON package of Tony Garnock-Jones

As CouchDB stores UTF8 values I did not want to escape them with
\uNNNN as the forked JSON package in SCouchDB does. But instead I
wanted to keep UTF8 in the db. As Rado pointed out the UTF8 conversion
is not correct in the original JSON package.

So I did the following correction.

In the class
String - category *JSON-writing
(from package http://www.squeaksource.com/JSON)
I replaced

jsonWriteOn: aStream
	| replacement |
	aStream nextPut: $".
	self do: [ :ch |
		(replacement := Json escapeForCharacter: ch)  "***"
			ifNil: [ aStream nextPut: ch ]
			ifNotNil: [ aStream nextPutAll: replacement ] ].
	aStream nextPut: $".


WITH

jsonWriteOn: aStream
	aStream nextPut: $".
	aStream nextPutAll: (UTF8TextConverter new encodeString: self).
	aStream nextPut: $".


"*** NOTE: escapeForCharacter is incorrectly implemented in
http://www.squeaksource.com/JSON
and is corrected by Rado in the SCouchDB fork of the package JSON
http://www.squeaksource.com/SCouchDB/SCouchDB-Core-rh.8.mcz"



3) My updated Test case

myWideString := ('ä', 8220 asCharacter asString, Character cr, 'b').
d := Dictionary new. d at: 'title' put: 'aTitle'. d at: 'body' put:
myWideString.
r := WriteStream on: String new.
(JsonObject newFrom: d) jsonWriteOn: r.
WebClient httpPut: host, '/notes/test24' content: r contents type:
'text/plain'.

RESULT: OK.



4) Conclusion

With the change to the JSON package I am now fine in using WebClient
for storing objects in a CouchDB.

However I did not commit my change to
http://www.squeaksource.com/JSON
as I do not (yet) understand the full impact of it.


Thank you Andreas Raab, Levente Uzonyi and Rado Hodnicak for your help

--Hannes
P.S. And the note of Igor pointing me to
http://www.squeaksource.com/SCouchDB/JSON-Igor.Stasenko.34.mcz
Json>>unescapeUnicode
was helpful as well.

Maybe it would be an idea that people could choose between \uNNNN
escaping and UTF8 conversion.
In reply to this post by Levente Uzonyi-2
On 5/10/10, Levente Uzonyi <[hidden email]> wrote:
>> So the question is: How do I convert a WideString to UTF8
>
> There are various possibilities:
> 'äbc' squeakToUtf8.
> 'äbc' convertToEncoding: 'utf-8'.
> 'äbc' convertToWithConverter: UTF8TextConverter new.
> UTF8TextConverter new encodeString: 'äbc'.
>
>
> Levente
>

Thank you for your helpful answer. I summarized the results at
http://lists.squeakfoundation.org/pipermail/squeak-dev/2010-May/150486.html
under 'UTF8 in JSON'

--Hannes
In reply to this post by Hannes Hirzel
On 12 May 2010 00:09, Hannes Hirzel <[hidden email]> wrote:
> 2) Change to JSON package of Tony Garnock-Jones
>
> As CouchDB stores UTF8 values I did not want to escape them with
> \uNNNN as the forked JSON package in SCouchDB does.

I know. But JSON could be used for something else, and \uNNNN is also
part of the syntax, so it should be supported there.

> But instead I
> wanted to keep UTF8 in the db. As Rado pointed out the UFT8 conversion
> is not correct in the original JSON package.

Yeah.. SCouchDB has no utf-8 support for output. Yet.

> So I did the following correction.
>
> In the class
> String - category *JSON-writing
> (from package http://www.squeaksource.com/JSON)
> I replaced
>
> jsonWriteOn: aStream
>        | replacement |
>        aStream nextPut: $".
>        self do: [ :ch |
>                (replacement := Json escapeForCharacter: ch)  "***"
>                        ifNil: [ aStream nextPut: ch ]
>                        ifNotNil: [ aStream nextPutAll: replacement ] ].
>        aStream nextPut: $".
>
>
> WITH
>
> jsonWriteOn: aStream
>        aStream nextPut: $".
>        aStream nextPutAll: (UTF8TextConverter new encodeString: self).
>        aStream nextPut: $".

No, this is WRONG!
JSON writer methods should output unicode text and not deal with any
encoding!
Then the layer which is responsible for transferring the data is
free to decide how to encode the JSON output, either using utf-8 or
any other appropriate UTF encoding.
By putting utf-8 conversions into the JSON library routines you limit
the JSON library to being used only with utf-8 encoding.

I repeat: the JSON library is the wrong place for dealing with encodings.
It should take a unicode text/stream as input and produce a unicode
text/stream as output.
Any encoding should be up to the outer layers, which are responsible for
data transmission!

-- 
Best regards,
Igor Stasenko AKA sig.
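The layering Igor argues for can be sketched in Python, whose standard json module already works this way. This is an illustration only, not a proposal for the Squeak JSON package: the writer produces Unicode text, and the byte encoding is chosen separately by whatever code sends the data. The sample document mirrors the test case above (code point 8220 is U+201C).

```python
import json

# Same shape as the test document in the thread: 'ä', U+201C, a CR, then 'b'.
doc = {"title": "aTitle", "body": "ä\u201c\rb"}

# JSON layer: produces Unicode text. Quotes and control characters are
# still escaped, but no byte encoding is chosen here.
text = json.dumps(doc, ensure_ascii=False)
assert isinstance(text, str)

# Transport layer: free to pick whichever encoding the receiver expects.
utf8_body = text.encode("utf-8")
utf16_body = text.encode("utf-16")

# Both byte forms decode back to the same document.
assert json.loads(utf8_body.decode("utf-8")) == doc
assert json.loads(utf16_body.decode("utf-16")) == doc
```

One practical consequence of this split is that the escaping of quotes, backslashes and control characters stays in the JSON layer, whatever encoding the transport layer applies afterwards.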
In reply to this post by Andreas.Raab
On 11 May 2010 19:22, Andreas Raab <[hidden email]> wrote:
> On 5/11/2010 8:09 AM, Igor Stasenko wrote:
>>
>> It would be also nice to have support of other methods, as specified
>> in HTTP 1.1 protocol:
>> http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
>>
>> Then i would be able to use WebClient as a backend in SCouchDB project,
>> which currently using own subset of HTTP.
>
> What other methods do you need? There should be no problem adding any I just
> had no need for them initially.
>

The CouchDB API uses the PUT, POST, GET and DELETE methods.
(http://wiki.apache.org/couchdb/API_Cheatsheet)

-- 
Best regards,
Igor Stasenko AKA sig.