[ANN] WebClient and WebServer 1.0 for Squeak

Re: [ANN] WebClient and WebServer 1.0 for Squeak

Andreas.Raab
On 5/10/2010 5:25 AM, Hannes Hirzel wrote:

> I have a similar problem, I want to post a document to a couchDB
> instance. Without German umlauts it works fine but if I add them I get
> an error message.
>
> host := 'http://localhost:5984'.
> r := WriteStream on: (String new: 1000).
> d := Dictionary new. d at: 'content' put: 'äs' squeakToUtf8.
> (JsonObject newFrom: d) jsonWriteOn: r.
> WebClient httpPut: host, '/notes/test5' content:  r contents  type:
> 'text/plain;charset=utf-8'.
>
> The error message of the WebResponse is
> '{"error":"bad_request","reason":"invalid UTF-8 JSON"}
> '
>
> The same thing happens if I omit #squeakToUtf8
>
> If I inspect
>     r contents
> I get
>    '{"content":"\uC3\uA4s"}'
                   ^^^^^^^^^

That's interesting. If this happens when you *omit* squeakToUtf8 then
it's already being encoded:

        'äs' squeakToUtf8 asByteArray hex => 'c3a473'

So 16rC3 16rA4 are the correct values for utf8-encoded 'ä'.
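
The byte values above can be checked outside the image as well. Here is a minimal sketch in Python (used only as a neutral illustration, it is not Squeak code). It also shows what encoding an already-encoded string a second time does to the bytes, which is one plausible way to end up with C3 and A4 being treated as separate characters. The variable names are made up for this example.

```python
# UTF-8 encoding of 'äs', and the effect of encoding twice.
text = "äs"

encoded_once = text.encode("utf-8")
print(encoded_once.hex())            # c3a473

# Treat the UTF-8 bytes as if each one were a Latin-1 character
# (roughly what a byte-oriented string holding UTF-8 looks like),
# then encode again: every byte >= 0x80 is expanded a second time.
misread = encoded_once.decode("latin-1")
encoded_twice = misread.encode("utf-8")
print(encoded_twice.hex())           # c383c2a473
```

If the JSON writer sees the two characters 16rC3 and 16rA4 instead of the single character 16rE4, the input was most likely encoded before it reached the writer.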

> How is this conversion supposed to be done?

Basically yes, but only if your input isn't already encoded. Whether
that's the case I don't know, you might check with the JSON library author.

Cheers,
   - Andreas

Re: [ANN] WebClient and WebServer 1.0 for Squeak

Levente Uzonyi-2
In reply to this post by Hannes Hirzel
On Mon, 10 May 2010, Hannes Hirzel wrote:

> On 5/10/10, Levente Uzonyi <[hidden email]> wrote:
>> On Mon, 10 May 2010, Hannes Hirzel wrote:
>>
>>> Unfortunately UTF8TextConverter cannot deal with non-Latin1
>>> characters. So it's usefulness is limited.
>>
>> UTF8TextConverter can deal with non-latin1 characters. I
>> think you're trying to pass a WideString to #encodeByteString: which
>> obviously doesn't work.
>>
>>
>> Levente
>>
>
> Yes I am passing aWideString to
>    #encodeByteString:
>
> as this is the only conversion method UTF8TextConverter.
>
> And you're right I should pass a ByteString.
>
> However as the case
>   ('ä', 8220 asCharacter asString) asByteString   "A"
> shows in comparison to
>  ('ä', 65 asCharacter asString) asByteString      "B"
>
> I get only in case "B" a ByteString, in case "A" it remains a WideString.
>
> So the question is: How do I convert a WideString to UTF8 as
> UTF8TextConverter is limited to code points from 0...255 and I want
> the full Unicode range?
There are various possibilities:
'äbc' squeakToUtf8.
'äbc' convertToEncoding: 'utf-8'.
'äbc' convertToWithConverter: UTF8TextConverter new.
UTF8TextConverter new encodeString: 'äbc'.
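
As a cross-check that UTF-8 itself is not limited to code points 0...255, here is a small sketch in Python (an illustration only, not one of the Squeak selectors listed above). It encodes the same kind of strings as cases "A" and "B" and prints the resulting bytes.

```python
# UTF-8 is defined for all Unicode code points, not just 0..255.
samples = ["ä", chr(8220), "äbc"]   # chr(8220) is U+201C, a left double quotation mark

for s in samples:
    data = s.encode("utf-8")
    print(repr(s), [hex(b) for b in data])

# A code point above 255 simply takes more bytes (three in this case).
wide = chr(8220).encode("utf-8")
print(len(wide))                    # 3
```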


Levente

>
> Or put the question otherwise: Is there a textconverter which
> implements the following algorithm
> http://dsc.sun.com/dev/gadc/technicalpublications/articles/utf8.html
>
> -Hannes
>
>

Re: UTF8 in JSON (was: Re: [ANN] WebClient and WebServer 1.0 for Squeak)

radoslav hodnicak
In reply to this post by Hannes Hirzel

Which JSON package/version are you using? I fixed a bug in the one
distributed with SCouchDB a few weeks ago, where it didn't encode utf8
characters properly. The correct escaped form is \uNNNN, always padded
to 4 hex digits. That's why you get that error: yours has only 2-3.
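
To make the padding rule concrete, here is a minimal sketch in Python (an illustration, not the SCouchDB or squeaksource JSON code). The helper `escape_non_ascii` is hypothetical and only handles non-ASCII characters; a real JSON string writer also has to escape quotes, backslashes and control characters.

```python
import json

def escape_non_ascii(text):
    """Escape every non-ASCII character as \\uNNNN (hypothetical helper)."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            out.append(ch)
        elif code <= 0xFFFF:
            out.append("\\u%04x" % code)          # always four hex digits
        else:
            # Characters beyond the BMP need a UTF-16 surrogate pair.
            code -= 0x10000
            out.append("\\u%04x\\u%04x" % (0xD800 + (code >> 10),
                                           0xDC00 + (code & 0x3FF)))
    return "".join(out)

escaped = escape_non_ascii("äs")
print(escaped)                                    # \u00e4s

# A JSON parser accepts the padded form and gives back the original text.
print(json.loads('"' + escaped + '"') == "äs")    # True

# The unpadded form from the test case is rejected.
try:
    json.loads('{"content":"\\uC3\\uA4s"}')
    rejected = False
except json.JSONDecodeError as err:
    rejected = True
    print("rejected:", err.msg)
```

Note that the escape is applied to the character (code point 16rE4), not to its UTF-8 bytes.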

rado

On Mon, 10 May 2010, Hannes Hirzel wrote:

> The test case made simpler
>
> WebClient httpPut: host, '/notes/test7' content:
> '{"content":"\uC3\uA4s"}' type: 'text/plain'.
>
> gives back as answer: '{"error":"bad_request","reason":"invalid UTF-8 JSON"}
> '
>
> whereas
>
> WebClient httpPut: host, '/notes/test8' content: '{"content":"abc"}'
> type: 'text/plain'.
>
> gives back
> '{"ok":true,"id":"test8","rev":"1-f40e52919735ae6775af3d388361b3da"}
> '
>
> --Hannes

Re: [ANN] WebClient and WebServer 1.0 for Squeak

laza
In reply to this post by Andreas.Raab
2010/5/10 Andreas Raab <[hidden email]>:
> Thanks for testing. The issue should be fixed in the repository  in
> WebClient-Core-ar.18.

Thanks for looking into it. The issue seems to be fixed with
WebClient-Core-ar.18

I'm sure this is all due to some mistake on my part, but with an updated
plain 4.1 image I have no trouble using tinyproxy (on my Linux
machine) or squid (corporate) as a proxy server. With WebClient
installed, only tinyproxy seems to work, whereas squid gives me a bad
request error. I can't look into the squid log files, but I can compare
the tinyproxy log for a 4.1 image and for an image that contains
WebClient:

With 4.1 image:
CONNECT   May 11 11:15:39 [1167]: Request (file descriptor 6): GET
http://source.squeak.org/trunk/?C=M;O=D HTTP/1.0

With image + WebClient:
CONNECT   May 11 11:16:25 [1171]: Request (file descriptor 6): GET
/trunk/?C=M;O=D HTTP/1.1
INFO      May 11 11:16:25 [1171]: process_request: trans Host GET
http://source.squeak.org:80/trunk/?C=M;O=D for 6

So maybe tinyproxy is doing the right thing and the squid proxy is
misconfigured/broken. But I think it would be great if WebClient could
cope with such environments as well as the inferior HTTPSocket does.

Alex

Re: [ANN] WebClient and WebServer 1.0 for Squeak

laza
> So maybe tinyproxy is doing the right thing and the squid proxy is mis
> configured/broken. But I think it would be great if WebClient could
> cope with such environments as well as the inferior HTTPSocket does.

I guess the difference comes from tinyproxy's ability to behave like a
transparent proxy, while squid plays by the rules, if I read section 5.1.2
of RFC 2616 on HTTP/1.1 correctly:

"The absoluteURI form is REQUIRED when the request is being made to a proxy."
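
The difference between the two request lines in the tinyproxy log can be sketched as follows. This is Python for illustration only and not WebClient's implementation; `request_line` is a hypothetical helper.

```python
from urllib.parse import urlsplit

def request_line(method, url, via_proxy):
    """Build an HTTP/1.1 request line (hypothetical helper for illustration)."""
    parts = urlsplit(url)
    if via_proxy:
        target = url                      # absolute URI, as RFC 2616 5.1.2 requires
    else:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query   # path and query only
    return "%s %s HTTP/1.1" % (method, target)

url = "http://source.squeak.org/trunk/?C=M;O=D"
print(request_line("GET", url, via_proxy=True))
# GET http://source.squeak.org/trunk/?C=M;O=D HTTP/1.1
print(request_line("GET", url, via_proxy=False))
# GET /trunk/?C=M;O=D HTTP/1.1
```

The second form relies on the Host header to identify the server, which a strict proxy is not obliged to accept.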

Alex

Re: UTF8 in JSON (was: Re: [ANN] WebClient and WebServer 1.0 for Squeak)

Hannes Hirzel
In reply to this post by radoslav hodnicak
On 5/10/10, radoslav hodnicak <[hidden email]> wrote:
>
> Which JSON package/version are you using? I fixed a bug in the one
> distributed with SCouchDB few weeks ago, where it didn't encode utf8
> characters properly - the correct escaped form is \uNNNN - always padded
> to 4 Ns. that's why you get that warning, yours is only 2-3
>
> rado

I have been using
http://www.squeaksource.com/JSON (over 7000 downloads)
in combination with WebClient.

Thank you Rado, I found
http://www.squeaksource.com/SCouchDB/SCouchDB-Core-rh.8.mcz
and will have a look at it.
(Your comment: added handling of utf8 encoded input data - this is
necessary for couchdb-lucene which sends results directly in utf8 and
not \uNNNN encoded)

--Hannes


Re: [ANN] WebClient and WebServer 1.0 for Squeak

Igor Stasenko
In reply to this post by Andreas.Raab
Hi, Andreas,

just one feature request: support streaming in WebClient,
i.e. apart from being able to fetch the response content as a single blob,
also provide an API which would enable a WebClient user to read the content
step by step,
using the stream protocol, i.e. #next, #next: etc.

I see you have #streamFrom:to:size:progress:
in WebMessage, but it's a little bit too high level (it requires an output
stream and includes a progress block)
and its end consumer is the #content message.

It would be nice to have something like:

response := WebClient httpGet: 'http://foo.bar'.

content := response content. "read all at once".

contentStream := response contentStream.  "read content using a stream"
[contentStream atEnd] whileFalse: [
  c := contentStream next.  ....
].

There are some special uses of the HTTP protocol which establish a
permanent socket connection
and then send the content in small portions, piece by piece, over time.
Obviously, with this kind of connection, if you try to get all the content
at once (by using #content) you'll never reach the end, because
there's always more to read if you wait long enough.
But if you allow streaming, the user can read the content in portions
and handle them step by step, without having to wait until all the
content arrives.

--
Best regards,
Igor Stasenko AKA sig.

Re: [ANN] WebClient and WebServer 1.0 for Squeak

Igor Stasenko
On 11 May 2010 17:52, Igor Stasenko <[hidden email]> wrote:

> Hi, Andreas,
>
> just one feature request: support streaming in WebClient,
> i.e. apart from being able to fetch a response content in a single blob,
> also provide an API, which would enable WebClient user to read content
> step by step,
> by using stream protocol, i.e. #next, #next: etc.
>
> I see you having a #streamFrom:to:size:progress:
> in WebMessage, but its a little bit too high level (requires an output
> stream and includes a progress block)
> and its end consumer is a #content message.
>
> It would be nice to have something like:
>
> response := WebClient httpGet: 'http://foo.bar'.
>
> content := response content. "read all at once".
>
a little remark (to avoid confusion): the line above shows how to
fetch the content as a single blob,
while the following lines show how to fetch the content using a stream. These
two ways, obviously, should not be used simultaneously.

> contentStream := response contentStream.  "read content using a stream"
> [contentStream atEnd] whileFalse: [
>  c := contentStream next.  ....
> ].
>
> There are some special uses of HTTP protocol which establishing a
> permanent socket connection
> and then sending a content in a small portions piece by piece over a time.
> Obviously, with such kind of connection, if you try to get all content
> at once (by using #content) you'll never get it to the end, because
> there's always more to read, if you wait long enough.
> But if you allow a streaming, then user could read content by portions
> and handle portions step by step and don't have to wait till all
> content arrives.
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>



--
Best regards,
Igor Stasenko AKA sig.

Re: [ANN] WebClient and WebServer 1.0 for Squeak

Igor Stasenko
It would also be nice to have support for the other methods specified
in the HTTP 1.1 protocol:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html

Then I would be able to use WebClient as a backend in the SCouchDB project,
which currently uses its own subset of HTTP.


--
Best regards,
Igor Stasenko AKA sig.

Re: UTF8 in JSON (was: Re: [ANN] WebClient and WebServer 1.0 for Squeak)

Igor Stasenko
In reply to this post by Hannes Hirzel
On 11 May 2010 17:44, Hannes Hirzel <[hidden email]> wrote:

> On 5/10/10, radoslav hodnicak <[hidden email]> wrote:
>>
>> Which JSON package/version are you using? I fixed a bug in the one
>> distributed with SCouchDB few weeks ago, where it didn't encode utf8
>> characters properly - the correct escaped form is \uNNNN - always padded
>> to 4 Ns. that's why you get that warning, yours is only 2-3
>>
>> rado
>
> I have been using
> http://www.squeaksource.com/JSON (over 7000 downloads)
> in combination with WebClient.
>
> Thank you Rado, I found
> http://www.squeaksource.com/SCouchDB/SCouchDB-Core-rh.8.mcz
> and will have a look at it.
> (Your comment: added handling of utf8 encoded input data - this is
> necessary for couchdb-lucene which sends results directly in utf8 and
> not \uNNNN encoded)
>
SCouchDB uses a forked version of the JSON package, which you can find in
the SCouchDB repository
http://www.squeaksource.com/SCouchDB/JSON-Igor.Stasenko.34.mcz

If you are looking for that method, it can be found in Json>>unescapeUnicode





--
Best regards,
Igor Stasenko AKA sig.

Re: [ANN] WebClient and WebServer 1.0 for Squeak

Andreas.Raab
In reply to this post by laza
On 5/11/2010 3:04 AM, Alexander Lazarević wrote:
>> So maybe tinyproxy is doing the right thing and the squid proxy is mis
>> configured/broken. But I think it would be great if WebClient could
>> cope with such environments as well as the inferior HTTPSocket does.
>
> I guess the difference comes from tinyproxys ability to behave like a
> transparent proxy and squid playing by the rules, when I do read 5.1.2
> of RFC2616 on HTTP/1.1 correctly:
>
> "The absoluteURI form is REQUIRED when the request is being made to a proxy."

Oh, wow. There's never a shortage of HTTP surprises. I always thought
that the Host header is sufficient (and it probably is :-) but never
mind that. In any case, can you test the latest update to WebClient?

BTW, thank you *so much* for testing this with Squid. Proxies are weird, and
finding people who can give this all a good workout is really helpful!

Cheers,
   - Andreas

Re: [ANN] WebClient and WebServer 1.0 for Squeak

Andreas.Raab
In reply to this post by Igor Stasenko
On 5/11/2010 7:52 AM, Igor Stasenko wrote:

> just one feature request: support streaming in WebClient,
> i.e. apart from being able to fetch a response content in a single blob,
> also provide an API, which would enable WebClient user to read content
> step by step, by using stream protocol, i.e. #next, #next: etc.
>
> I see you having a #streamFrom:to:size:progress:
> in WebMessage, but its a little bit too high level (requires an output
> stream and includes a progress block)
> and its end consumer is a #content message.
>
> It would be nice to have something like:
>
> response := WebClient httpGet: 'http://foo.bar'.
>
> content := response content. "read all at once".

That works out of the box.

> contentStream := response contentStream.  "read content using a stream"
> [contentStream atEnd] whileFalse: [
>    c := contentStream next.  ....
> ].

And that works out of the box, too :-) With a small modification. The
convenience APIs on the class side close the client and prefetch the
response. In other words, in order to stream you need to write, e.g.,

   client := WebClient new.
   [resp := client httpGet:'http://www.squeak.org'.
   length := resp contentLength.
   stream := resp contentStream.
   [length > 0] whileTrue:[
     count := length min: 100.
     stream next: count.          "read the next piece"
     length := length - count.    "otherwise the loop never ends"
   ]] ensure:[stream close].

A couple of things to keep in mind though: The above doesn't deal with
the server returning an HTTP/1.0 response (no content-length) or other
specialties like chunked-encoding content transfer. That's why the
high-level functions are advantageous because they deal with all of that.
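
The two cases mentioned here (known content-length versus an HTTP/1.0 style body that ends when the connection closes) can be sketched like this. It is Python for illustration only, not WebClient code; `read_body` is a hypothetical helper and an in-memory stream stands in for the socket. Chunked transfer encoding is deliberately not handled.

```python
import io

def read_body(stream, content_length=None, chunk_size=100):
    """Yield the body in pieces (hypothetical helper).

    With a known length, stop after that many bytes; without one
    (HTTP/1.0 style), read until the stream reports end of data.
    """
    remaining = content_length
    while remaining is None or remaining > 0:
        want = chunk_size if remaining is None else min(remaining, chunk_size)
        piece = stream.read(want)
        if not piece:              # connection closed or no more data
            break
        if remaining is not None:
            remaining -= len(piece)
        yield piece

# Known length: stop after 250 bytes even though more data follows.
body = io.BytesIO(b"x" * 250 + b"next response")
pieces = list(read_body(body, content_length=250))
print([len(p) for p in pieces])    # [100, 100, 50]

# No length: read until end of stream.
pieces_no_length = list(read_body(io.BytesIO(b"y" * 120)))
print([len(p) for p in pieces_no_length])    # [100, 20]
```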

Cheers,
   - Andreas

Re: [ANN] WebClient and WebServer 1.0 for Squeak

Andreas.Raab
In reply to this post by Igor Stasenko
On 5/11/2010 8:09 AM, Igor Stasenko wrote:
> It would be also nice to have support of other methods, as specified
> in HTTP 1.1 protocol:
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
>
> Then i would be able to use WebClient as a backend in SCouchDB project,
> which currently using own subset of HTTP.

What other methods do you need? There should be no problem adding any; I
just had no need for them initially.

Cheers,
   - Andreas

Re: [ANN] WebClient and WebServer 1.0 for Squeak

Igor Stasenko
In reply to this post by Andreas.Raab
On 11 May 2010 19:21, Andreas Raab <[hidden email]> wrote:

> On 5/11/2010 7:52 AM, Igor Stasenko wrote:
>>
>> just one feature request: support streaming in WebClient,
>> i.e. apart from being able to fetch a response content in a single blob,
>> also provide an API, which would enable WebClient user to read content
>> step by step, by using stream protocol, i.e. #next, #next: etc.
>>
>> I see you having a #streamFrom:to:size:progress:
>> in WebMessage, but its a little bit too high level (requires an output
>> stream and includes a progress block)
>> and its end consumer is a #content message.
>>
>> It would be nice to have something like:
>>
>> response := WebClient httpGet: 'http://foo.bar'.
>>
>> content := response content. "read all at once".
>
> That works out of the box.
>
>> contentStream := response contentStream.  "read content using a stream"
>> [contentStream atEnd] whileFalse: [
>>   c := contentStream next.  ....
>> ].
>
> And that works out of the box, too :-) With a small modification. The
> convenience APIs on the class side close the client and prefetch the
> response. In other words, in order to stream you need to write, e.g,
>
>  client := WebClient new.
>  [resp := client httpGet:'http://www.squeak.org'.
>  length := resp contentLength.
>  stream := resp contentStream.
>  [length > 0] whileTrue:[
>    stream next: (length min: 100).
>  ]] ensure:[stream close].
>
Good. It's quite similar to what I did in SCouchDB.

But I'd like to note that content length is not a mandatory field, and
your example
won't work for content of undetermined length.

> A couple of things to keep in mind though: The above doesn't deal with the
> server returning an HTTP/1.0 response (no content-length) or other
> specialties like chunked-encoding content transfer. That's why the
> high-level functions are advantageous because they deal with all of that.
>

In SCouchDB I deal with chunked-encoding content by wrapping the
original stream in a chunked stream
which handles this transparently, so the end user doesn't have to
deal with it himself.
Then, similarly, for utf-8 content, I just wrap the source
stream in my own utf8-decoder stream,
so the content consumer doesn't have to care about it either;
all it needs to know is how to use the stream protocol.

So, stream wrappers are a powerful abstraction. Alas, the current
Stream package lacks good support for this kind of 'wrapper stream'
abstraction. :(
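
The wrapper idea can be sketched as follows. This is Python for illustration, not the SCouchDB classes: `ChunkedReader` is a hypothetical class that undoes chunked transfer encoding, and the standard `io.TextIOWrapper` plays the role of the utf8-decoder stream. Trailers and malformed input are ignored in this sketch.

```python
import io

class ChunkedReader(io.RawIOBase):
    """Decode HTTP chunked transfer encoding from an underlying byte stream."""

    def __init__(self, raw):
        self._raw = raw
        self._left = 0          # bytes left in the current chunk
        self._done = False

    def readable(self):
        return True

    def readinto(self, buffer):
        if self._done:
            return 0
        if self._left == 0:
            size_line = self._raw.readline()             # e.g. b"5\r\n"
            self._left = int(size_line.split(b";")[0].strip(), 16)
            if self._left == 0:                          # last chunk
                self._done = True
                return 0
        data = self._raw.read(min(self._left, len(buffer)))
        self._left -= len(data)
        if self._left == 0:
            self._raw.readline()                         # CRLF after chunk data
        buffer[:len(data)] = data
        return len(data)

# 'äs' in UTF-8 is c3 a4 73; the two bytes of 'ä' are split across chunks.
wire = io.BytesIO(b"1\r\n\xc3\r\n" b"2\r\n\xa4s\r\n" b"0\r\n\r\n")

# Stack the wrappers: socket bytes -> de-chunked bytes -> decoded text.
text_stream = io.TextIOWrapper(io.BufferedReader(ChunkedReader(wire)),
                               encoding="utf-8")
decoded = text_stream.read()
print(decoded)                                           # äs
```

The consumer only sees the text stream; neither the chunk boundaries nor the byte encoding leak through.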

> Cheers,
>  - Andreas
>
>

--
Best regards,
Igor Stasenko AKA sig.

Re: [ANN] WebClient and WebServer 1.0 for Squeak

Andreas.Raab
On 5/11/2010 10:10 AM, Igor Stasenko wrote:

> On 11 May 2010 19:21, Andreas Raab<[hidden email]>  wrote:
>> And that works out of the box, too :-) With a small modification. The
>> convenience APIs on the class side close the client and prefetch the
>> response. In other words, in order to stream you need to write, e.g,
>>
>>   client := WebClient new.
>>   [resp := client httpGet:'http://www.squeak.org'.
>>   length := resp contentLength.
>>   stream := resp contentStream.
>>   [length>  0] whileTrue:[
>>     stream next: (length min: 100).
>>   ]] ensure:[stream close].
>>
> good. Its quite similar to what i did in SCouchDB.
>
> But i'd like to note, that content length is not mandatory field, and
> your example won't work for a content with undetermined length.

Of course. It's an example after all. That's why I said that if you need
fully featured stream support you should be using the high-level
functions provided by WebClient which deal with these issues.

Cheers,
   - Andreas

Re: UTF8 in JSON (was: Re: [ANN] WebClient and WebServer 1.0 for Squeak)

Hannes Hirzel
In reply to this post by Igor Stasenko
1) UTF8 conversion
2) Change to JSON package of Tony Garnock-Jones
3) My updated Test case
4) Conclusion


1) UTF8 conversion

My question was:
    How do I convert a WideString to UTF8?


Levente answered:

There are various possibilities:
'äbc' squeakToUtf8.
'äbc' convertToEncoding: 'utf-8'.
'äbc' convertToWithConverter: UTF8TextConverter new.
UTF8TextConverter new encodeString: 'äbc'.



2) Change to JSON package of Tony Garnock-Jones

As CouchDB stores UTF8 values I did not want to escape them with
\uNNNN as the forked JSON package in SCouchDB does. Instead I
wanted to keep UTF8 in the db. As Rado pointed out, the UTF8 conversion
is not correct in the original JSON package.

So I did the following correction.

In the class
  String  - category *JSON-writing
  (from package http://www.squeaksource.com/JSON)
I replaced

  jsonWriteOn: aStream
        | replacement |
        aStream nextPut: $".
        self do: [ :ch |
                (replacement := Json escapeForCharacter: ch)    "***"
                        ifNil: [ aStream nextPut: ch ]
                        ifNotNil: [ aStream nextPutAll: replacement ] ].
        aStream nextPut: $".


WITH

  jsonWriteOn: aStream
        aStream nextPut: $".
        aStream nextPutAll:  (UTF8TextConverter new encodeString: self).
        aStream nextPut: $".


"*** NOTE: escapeForCharacter is incorrectly implemented in
http://www.squeaksource.com/JSON
and is corrected by Rado in the SCouchDB fork of the package JSON
http://www.squeaksource.com/SCouchDB/SCouchDB-Core-rh.8.mcz"



3) My updated Test case

myWideString := ('ä', 8220 asCharacter asString, Character cr, 'b').
d := Dictionary new. d at: 'title' put:   'aTitle'. d at: 'body' put:
myWideString.
r := WriteStream on: String new.
(JsonObject newFrom: d) jsonWriteOn: r.
WebClient httpPut: host, '/notes/test24' content: r contents type: 'text/plain'.

RESULT: OK.



4) Conclusion

With the change to the JSON package I am now fine using WebClient
for storing objects in CouchDB.

However I did not commit my change to
  http://www.squeaksource.com/JSON
as I do not (yet) understand the full impact of it.


Thank you Andreas Raab, Levente Uzonyi and Rado Hodnicak for your help

--Hannes

Re: UTF8 in JSON (was: Re: [ANN] WebClient and WebServer 1.0 for Squeak)

Hannes Hirzel
P.S. And the note of Igor pointing me to

http://www.squeaksource.com/SCouchDB/JSON-Igor.Stasenko.34.mcz

Json>>unescapeUnicode

was helpful as well. Maybe it would be an idea that people could
choose between \uNNNN escaping and UTF8 conversion.
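
For comparison, this is what such a choice looks like in a library that offers both. The sketch uses Python's standard `json` module purely as an illustration; it says nothing about how the Squeak JSON package would expose the option.

```python
import json

doc = {"content": "ä" + chr(8220) + "b"}

escaped = json.dumps(doc)                       # ensure_ascii=True is the default
raw = json.dumps(doc, ensure_ascii=False)       # keep the characters as they are

print(escaped)   # {"content": "\u00e4\u201cb"}
print(raw)       # {"content": "ä“b"}

# Both forms decode to the same document, whether parsed from text
# or from the UTF-8 bytes that would go over the wire.
same = json.loads(escaped) == json.loads(raw.encode("utf-8")) == doc
print(same)      # True
```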

Re: [ANN] WebClient and WebServer 1.0 for Squeak

Hannes Hirzel
In reply to this post by Levente Uzonyi-2
On 5/10/10, Levente Uzonyi <[hidden email]> wrote:

>> So the question is: How do I convert a WideString to UTF8
>
> There are various possibilities:
> 'äbc' squeakToUtf8.
> 'äbc' convertToEncoding: 'utf-8'.
> 'äbc' convertToWithConverter: UTF8TextConverter new.
> UTF8TextConverter new encodeString: 'äbc'.
>
>
> Levente
>

Thank you for your helpful answer. I summarized the results at
http://lists.squeakfoundation.org/pipermail/squeak-dev/2010-May/150486.html
under 'UTF8 in JSON'

--Hannes

Re: UTF8 in JSON (was: Re: [ANN] WebClient and WebServer 1.0 for Squeak)

Igor Stasenko
In reply to this post by Hannes Hirzel
On 12 May 2010 00:09, Hannes Hirzel <[hidden email]> wrote:

> 1) UFT8 conversion
> 2) Change to JSON package of Tony Garnock-Jones
> 3) My updated Test case
> 4) Conclusion
>
>
> 1) UFT8 conversion
>
> My question was:
>    How do I convert a WideString to UTF8?
>
>
> Levente answered:
>
> There are various possibilities:
> 'äbc' squeakToUtf8.
> 'äbc' convertToEncoding: 'utf-8'.
> 'äbc' convertToWithConverter: UTF8TextConverter new.
> UTF8TextConverter new encodeString: 'äbc'.
>
>
>
> 2) Change to JSON package of Tony Garnock-Jones
>
> As CouchDB stores UTF8 values I did not want to escape them with
> \uNNNN as the forked JSON package in SCouchDB does.

I know. But JSON could be used for something else, and \uNNNN escaping is
also part of the syntax,
so it should be supported there.

> But instead I
> wanted to keep UTF8 in the db. As Rado pointed out the UFT8 conversion
> is not correct in the original JSON package.
>
Yeah.. SCouchDB has no utf-8 support for output. Yet.

> So I did the following correction.
>
> In the class
>  String  - category *JSON-writing
>  (from package http://www.squeaksource.com/JSON)
> I replaced
>
>  jsonWriteOn: aStream
>        | replacement |
>        aStream nextPut: $".
>        self do: [ :ch |
>                (replacement := Json escapeForCharacter: ch)    "***"
>                        ifNil: [ aStream nextPut: ch ]
>                        ifNotNil: [ aStream nextPutAll: replacement ] ].
>        aStream nextPut: $".
>
>
> WITH
>
>  jsonWriteOn: aStream
>        aStream nextPut: $".
>        aStream nextPutAll:  (UTF8TextConverter new encodeString: self).
>        aStream nextPut: $".
>

No, this is WRONG!

JSON writer methods should output Unicode text and should not deal with
any encoding!
The layer responsible for transferring the data is then free to decide
how to encode the JSON output, using UTF-8 or any other appropriate UTF
encoding.

By putting UTF-8 conversions into the JSON library routines, you limit
the JSON library to being used only with UTF-8.

I repeat: the JSON library is the wrong place for dealing with encodings.
It should take Unicode text/stream as input and produce Unicode
text/stream as output. Any encoding should be left to the outer layers,
which are responsible for data transmission!
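
A minimal sketch of this layering, written in Python rather than Squeak purely for illustration. The function names `to_json_text` and `encode_for_transport` are hypothetical and are not part of any JSON package discussed in this thread. The serializer returns Unicode text, and only the transport step picks a byte encoding.

```python
import json

def to_json_text(obj):
    """Serialize to JSON as Unicode text; no byte encoding happens here."""
    # ensure_ascii=False keeps non-ASCII characters as they are
    # instead of turning them into \uNNNN escapes.
    return json.dumps(obj, ensure_ascii=False)

def encode_for_transport(text, encoding="utf-8"):
    """The outer (transport) layer decides which encoding goes on the wire."""
    return text.encode(encoding)

doc = {"content": "äs"}
text = to_json_text(doc)                               # Unicode text
utf8_body = encode_for_transport(text)                 # bytes for an HTTP body
utf16_body = encode_for_transport(text, "utf-16-le")   # same text, other encoding
```

Because the serializer never touches bytes, the same JSON text can be sent as UTF-8 to one peer and in another UTF encoding to a different one without changing the JSON code.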


>
> "*** NOTE: escapeForCharacter is incorrectly implemented in
> http://www.squeaksource.com/JSON
> and is corrected by Rado in the SCouchDB fork of the package JSON
> http://www.squeaksource.com/SCouchDB/SCouchDB-Core-rh.8.mcz"
>
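
For illustration only, here is a Python sketch (not the Squeak fix itself) of the escaping rule at issue: a non-ASCII character is escaped by its Unicode code point as \uNNNN, always padded to four hex digits, not by its individual UTF-8 bytes. The helper name `escape_non_ascii` is hypothetical, and the sketch deliberately ignores quotes, backslashes and control characters, which a real writer must also escape.

```python
import json

def escape_non_ascii(text):
    # Escape each non-ASCII character as \uNNNN with exactly four hex digits.
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            out.append(ch)                      # plain ASCII passes through
        elif code <= 0xFFFF:
            out.append("\\u%04X" % code)        # e.g. U+00E4 -> \u00E4
        else:
            # Characters beyond U+FFFF need a UTF-16 surrogate pair.
            code -= 0x10000
            high = 0xD800 + (code >> 10)
            low = 0xDC00 + (code & 0x3FF)
            out.append("\\u%04X\\u%04X" % (high, low))
    return "".join(out)

escaped = escape_non_ascii("äs")                # 'ä' is U+00E4
round_trip = json.loads('"' + escaped + '"')    # parses back to the original
```

By contrast, the rejected string '\uC3\uA4s' writes the two UTF-8 bytes of 'ä' as two-digit escapes, which is not valid JSON.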


>
>
> 3) My updated Test case
>
> myWideString := ('ä', 8220 asCharacter asString, Character cr, 'b').
> d := Dictionary new. d at: 'title' put:   'aTitle'. d at: 'body' put:
> myWideString.
> r := WriteStream on: String new.
> (JsonObject newFrom: d) jsonWriteOn: r.
> WebClient httpPut: host, '/notes/test24' content: r contents type: 'text/plain'.
>
> RESULT: OK.
>
>
>
> 4) Conclusion
>
> With the change to the JSON package I am now fine in using WebClient
> for storing objects in CouchDB.
>
> However I did not commit my change to
>  http://www.squeaksource.com/JSON
> as I do not (yet) understand the full impact of it.
>
>
> Thank you Andreas Raab, Levente Uzonyi and Rado Hodnicak for your help
>
> --Hannes
>
> On 5/11/10, Igor Stasenko <[hidden email]> wrote:
>> On 11 May 2010 17:44, Hannes Hirzel <[hidden email]> wrote:
>>> On 5/10/10, radoslav hodnicak <[hidden email]> wrote:
>>>>
>>>> Which JSON package/version are you using? I fixed a bug in the one
>>>> distributed with SCouchDB few weeks ago, where it didn't encode utf8
>>>> characters properly - the correct escaped form is \uNNNN - always padded
>>>> to 4 Ns. that's why you get that warning, yours is only 2-3
>>>>
>>>> rado
>>>
>>> I have been using
>>> http://www.squeaksource.com/JSON (over 7000 downloads)
>>> in combination with WebClient.
>>>
>>> Thank you Rado, I found
>>> http://www.squeaksource.com/SCouchDB/SCouchDB-Core-rh.8.mcz
>>> and will have a look at it.
>>> (Your comment: added handling of utf8 encoded input data - this is
>>> necessary for couchdb-lucene which sends results directly in utf8 and
>>> not \uNNNN encoded)
>>>
>> SCouchDB uses a forked version of the JSON package, which you can find
>> in the SCouchDB repository:
>> http://www.squeaksource.com/SCouchDB/JSON-Igor.Stasenko.34.mcz
>>
>> If you are looking for that method, it can be found in Json>>unescapeUnicode
>>
>>
>>> --Hannes
>>>
>>>
>>>> On Mon, 10 May 2010, Hannes Hirzel wrote:
>>>>
>>>>> The test case made simpler
>>>>>
>>>>> WebClient httpPut: host, '/notes/test7' content:
>>>>> '{"content":"\uC3\uA4s"}' type: 'text/plain'.
>>>>>
>>>>> gives back as answer: '{"error":"bad_request","reason":"invalid UTF-8
>>>>> JSON"}
>>>>> '
>>>>>
>>>>> whereas
>>>>>
>>>>> WebClient httpPut: host, '/notes/test8' content: '{"content":"abc"}'
>>>>> type: 'text/plain'.
>>>>>
>>>>> gives back
>>>>> '{"ok":true,"id":"test8","rev":"1-f40e52919735ae6775af3d388361b3da"}
>>>>> '
>>>>>
>>>>> --Hannes
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>>
>>
>
>



--
Best regards,
Igor Stasenko AKA sig.


Re: [ANN] WebClient and WebServer 1.0 for Squeak

Igor Stasenko
In reply to this post by Andreas.Raab
On 11 May 2010 19:22, Andreas Raab <[hidden email]> wrote:

> On 5/11/2010 8:09 AM, Igor Stasenko wrote:
>>
>> It would also be nice to have support for the other methods specified
>> in the HTTP 1.1 protocol:
>> http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
>>
>> Then I would be able to use WebClient as a backend in the SCouchDB
>> project, which currently uses its own subset of HTTP.
>
> What other methods do you need? There should be no problem adding any; I just
> had no need for them initially.
>
As far as I can tell, the CouchDB API uses the PUT, POST, GET and DELETE
methods.

(http://wiki.apache.org/couchdb/API_Cheatsheet)
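
As a rough illustration of how those four verbs map onto document operations, here is a Python sketch (not WebClient code) that only builds the request objects and sends nothing. The database name, document id and revision string are hypothetical placeholders, and the `application/json` content type is an assumption; the examples earlier in this thread used 'text/plain'.

```python
import json
from urllib.request import Request

BASE = "http://localhost:5984/notes"    # hypothetical local CouchDB database
body = json.dumps({"content": "äs"}, ensure_ascii=False).encode("utf-8")
headers = {"Content-Type": "application/json"}   # assumed content type

requests = [
    # PUT: create or update a document under an id chosen by the client
    Request(BASE + "/test5", data=body, headers=headers, method="PUT"),
    # POST: create a document and let the server assign the id
    Request(BASE, data=body, headers=headers, method="POST"),
    # GET: read the document back
    Request(BASE + "/test5", method="GET"),
    # DELETE: remove the document; the revision value is a placeholder
    Request(BASE + "/test5?rev=1-abc", method="DELETE"),
]

methods = [r.get_method() for r in requests]
```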

> Cheers,
>  - Andreas
>
>



--
Best regards,
Igor Stasenko AKA sig.
