Need help to implement pharo encodeForHTTP

NorbertHartl
I've spent over two hours trying to work out how to do the following: I'm trying to implement the Pharo version of encodeForHTTP in GemStone. But streams are a drag, and encoding just makes it worse.
The difference from the existing behaviour is that characters are encoded in UTF-8 first. That means each individual byte of a Unicode character's UTF-8 encoding is percent-encoded so that it is URL-safe.

For the German ä this means:

in GemStone it encodes to

%e4

and in Pharo to

%C3%A4

I would be glad if someone could help me.

thanks,

Norbert



Re: Need help to implement pharo encodeForHTTP

James Foster
Norbert,

I've found that when I have a GemStone DoubleByteString, I might need to pass it through UTF8Encoding class>>#'encode:' to get UTF-8 and when I have UTF-8 I pass it through UTF8Encoding class>>#'decode:' to get a DoubleByteString. I think Dale has moved some of this into primitives for performance. (I don't have an image open to check the methods.)

James

On Apr 16, 2010, at 11:07 AM, Norbert Hartl wrote:

> Now I've been looking over two hours how to accomplish the following: I'm trying to get the pharo version of encodeForHTTP implemented. But streams are a drag and encoding just makes it worse.
> The difference to the existing behaviour is that characters are encoded in utf-8. That means that each individual byte of a unicode character is encoded url safe.
>
> For the german ä this mean
>
> in gemstone it encodes to
>
> %e4
>
> and in pharo to
>
> %C3%A4
>
> I would be glad if someone could help me.
>
> thanks,
>
> Norbert
>
>


Re: Need help to implement pharo encodeForHTTP

Dale
In reply to this post by NorbertHartl
Norbert,

UTF-8 encoding is done in a primitive, but HTTP encoding is not. The basic algorithms look the same, so my guess is that the difference shows up because the conversion to UTF-8 is not being done in the GemStone case.
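
To make the difference concrete, here is a minimal sketch in Python (used only because it is easy to run; it is not the GemStone or Pharo code). It assumes the GemStone result comes from percent-encoding the single Latin-1 byte of ä, which is consistent with the %e4 output but is not confirmed in this thread. The hex digits in a percent-escape are case-insensitive, so %e4 and %E4 are equivalent.

```python
from urllib.parse import quote

ch = "ä"  # U+00E4

# Assumption: without a UTF-8 step, the character's single byte 0xE4
# is percent-encoded directly (same bytes as the reported GemStone result).
latin1_form = quote(ch, safe="", encoding="latin-1")

# With a UTF-8 step first, the two bytes 0xC3 0xA4 are percent-encoded
# (same bytes as the reported Pharo result).
utf8_form = quote(ch, safe="", encoding="utf-8")

print(latin1_form)  # %E4
print(utf8_form)    # %C3%A4
```
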

Doing this appears to get the right answer:

  'ä' encodeAsUTF8 encodeForHTTP    "=> '%c3%a4'"

I notice that the Pharo implementation of encodeForHTTP has changed since it was originally imported ...

The fix is to add #encodeForHTTP to MultiByteString in GemStone, implemented as:

encodeForHTTP

  ^self encodeAsUTF8 encodeForHTTP
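
The same two-step composition (UTF-8 encode, then percent-encode byte by byte) can be sketched in Python as follows. This is only an illustration, not the GemStone method: `encode_for_http` is a hypothetical helper name, and the set of characters left unescaped is assumed to be the RFC 3986 unreserved set, which may differ from what the Smalltalk implementations treat as safe.

```python
# Assumption: "safe" means the RFC 3986 unreserved characters;
# the actual Smalltalk implementations may use a different safe set.
UNRESERVED = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def encode_for_http(text: str) -> str:
    """Hypothetical helper: UTF-8 encode first, then escape each unsafe byte."""
    out = []
    for byte in text.encode("utf-8"):  # step 1: characters -> UTF-8 bytes
        if byte in UNRESERVED:
            out.append(chr(byte))      # safe ASCII byte passes through
        else:
            out.append(f"%{byte:02X}") # step 2: unsafe byte -> %XX
    return "".join(out)


print(encode_for_http("ä"))          # %C3%A4
print(encode_for_http("Grüße a+b"))  # Gr%C3%BC%C3%9Fe%20a%2Bb
```

Doing the UTF-8 conversion before the escaping step is what turns one character into several %XX escapes; the escaping loop itself never needs to know about characters outside the byte range.
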

I'll submit a bug and try to get this into the mythical 1.0-beta.8 release.

Dale