Differences between TextConverters and ZnEncoders


Guillermo Polito
Hi!

I'm struggling with the TextConverter's API in Pharo :).

I wanted to test the converters in Pharo, and I found the method #convertFromSystemString: that should (from its name) convert a Pharo String into an encoded version of it.

(UTF8TextConverter default convertFromSystemString: 'á').

Funny thing: it does not do that. It does the opposite: it tries to convert an encoded string into a Pharo string, and so it fails. Of course, its symmetrical version does what I want and behaves like the ZnUTF8Encoder :)

(UTF8TextConverter default convertToSystemString: 'á') asByteArray
 "#[195 161]"

ZnUTF8Encoder new encodeString: 'á' "#[195 161]"

Questions:
- Do the convertTo/From methods make sense to you, or is it just me who sees them inverted?
- Does it make sense to have two versions of the converters? Zn and the TextConverters do the same thing, in my opinion (or maybe I'm confused :) )

Guille

Re: Differences between TextConverters and ZnEncoders

Sven Van Caekenberghe-2


I am biased, but the Zn converters are better. Why? Because they work between String <-> Bytes, as a text encoder/decoder should. They also have a number of extra features (counting the number of encoded bytes, being able to move backwards on input, being able to be very strict), are faster and have a clearer implementation. Furthermore, they are more correct and implement more encodings.
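A minimal sketch of the String <-> Bytes round trip and the byte counting mentioned above, assuming a Pharo image with the Zinc character encoding package loaded. Only #encodeString: is quoted in this thread; #decodeBytes: and #encodedByteCountForString: are assumed from that package, so check them in your image before relying on this.

```smalltalk
| encoder bytes |
encoder := ZnUTF8Encoder new.

"String -> ByteArray, as in the example that opened the thread"
bytes := encoder encodeString: 'á'.          "#[195 161]"

"ByteArray -> String, the inverse direction (assumed selector)"
encoder decodeBytes: bytes.                   "'á'"

"Count the encoded bytes without building the ByteArray (assumed selector)"
encoder encodedByteCountForString: 'á'.      "2"
```

The point of the sketch is the type boundary: characters on one side, bytes on the other, with nothing pretending to be a String in between.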

There is also good documentation: http://stfx.eu/EnterprisePharo/Zinc-Encoding-Meta/

ZnCharacter[Read|Write]Stream are also very instructive, compare them to MultiByteStream.
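As a rough illustration of those two stream classes, here is a sketch that encodes while writing and decodes while reading, using in-memory binary streams instead of files. It assumes the #on:encoding: constructors and the 'utf8' encoding identifier from the Zinc character encoding package; none of these names are quoted in this thread.

```smalltalk
| bytes text |

"Encode while writing: characters go in, bytes come out"
bytes := ByteArray streamContents: [ :out |
	(ZnCharacterWriteStream on: out encoding: 'utf8')
		nextPutAll: 'á' ].

"Decode while reading: wrap a binary read stream in a character stream"
text := (ZnCharacterReadStream on: bytes readStream encoding: 'utf8') upToEnd.
```

The character streams only wrap a binary stream and delegate to an encoder, so the same wrapping should apply to any binary stream, not just in-memory ones.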

Should we throw the other one out? Probably. But it is the same problem as with Xtreams: the API is different, and adding a compatibility layer would defeat the purpose. So I don't know.

Sven




Re: Differences between TextConverters and ZnEncoders

NorbertHartl
If yours is better, there shouldn't be much of a reason to keep the bad one, right? I'd just suggest that Pharo should not depend on Zinc classes, but that the encoder moves into Pharo under a different name.

Norbert




Re: Differences between TextConverters and ZnEncoders

Guillermo Polito
I'd like to summarize a few things.

+1 to everything Sven said.

Taking a look at users of old text converters, there are only 77 of them:

(TextConverter withAllSubclasses gather: [ :each |
SystemNavigation new allReferencesTo: each binding ]) size
 " => 77"

That may be a good starting point, and it would remove the leading-char magic step by step.

The only point where I see a problem is the feature "magically convert cr's to some line ending convention", which is managed by a mixture of the text converter and the file stream. I would not like to push this "feature" into the Zinc encoders. Actually, I don't believe it belongs in file reading either.

Guille




Re: Differences between TextConverters and ZnEncoders

Sven Van Caekenberghe-2

> On 30 Apr 2015, at 11:59, Guillermo Polito <[hidden email]> wrote:
>
> The only point where I see a problem is in the feature "convert magically cr's to some line ending convention" that is managed by a mixture between the text converter and the file stream. I would not like to push this "feature" to the zinc encoders. Actually I don't believe it belongs to the file reading neither.

Those are the things I am talking about: the old code mixed things up way too much.

The main API problem is a key one: the old converters work String <-> String, the new ones String <-> ByteArray (or streams of course).

Also, does anyone dare to touch MultiByteFileStream or MultiByteBinaryOrTextStream (the names alone ;-)?




Re: Differences between TextConverters and ZnEncoders

Marcus Denker-4
In reply to this post by Guillermo Polito

On 30 Apr 2015, at 11:59, Guillermo Polito <[hidden email]> wrote:

(TextConverter withAllSubclasses gather: [ :each |
SystemNavigation new allReferencesTo: each binding ]) size

Thanks to first-class variables, this can be written as:

(TextConverter withAllSubclasses gather: [ :each | each binding usingMethods ]) size

#binding returns a GlobalVariable, which has an API... isn't that nice?

Marcus

Re: Differences between TextConverters and ZnEncoders

stepharo
In reply to this post by Sven Van Caekenberghe-2

I would integrate the good ones and deprecate the old ones!

Stef



Re: Differences between TextConverters and ZnEncoders

Guillermo Polito
They are already integrated! It's a matter of rewriting all usages of MultiByteScaryStream and friends...




Re: Differences between TextConverters and ZnEncoders

stepharo
In reply to this post by Sven Van Caekenberghe-2
So couldn't we spot the "cr" conversion and isolate its users/providers?

Stef




Re: Differences between TextConverters and ZnEncoders

stepharo
In reply to this post by Guillermo Polito


On 30/4/15 16:07, Guillermo Polito wrote:
> They already are integrated! it's a matter of rewrite all usages of MultiByteScaryStream and friends...

Can you give me just a simple example so that I can understand what has to be done?

