Ridiculous we are


Re: Unicode support in Pharo

LogiqueWerks
Must be a thing about European "guys" ...

<quote>

- ByteString is needed for most european/occidental people who don't care about internationalization and should stay because Pharo is an european/english based system , also not to break existing code
(again will be transparent to most users for same reason).

</unquote>

Of all reasons to retain ByteString, let's hope this one is not listed, lest we appear ridiculous in our threads.

r



On 26 September 2014 18:22, Alain Rastoul <[hidden email]> wrote:
On 26/09/2014 20:47, stepharo wrote:
I'm not an expert and I would like to know what people think.
But I think that we should consider

     - the impact of Spur's new object format. I would like to have
Unicode and clean up the leadChar

Stef


Just to start a new thread about that, because it deserves it
and also some people might appreciate :)
(was in: "Ridiculous we are").

I may not be the most qualified to comment on this, but here are my 2c
(this is a community, isn't it? Please do not flame...)

As Sven said, the situation is not so bad (quite good, in fact): Pharo has true built-in Unicode support with WideString and encoders (great job; IMHO this part of the Zn components should be in the base system
or in an internationalization package, not hidden in an HTTP components package).
- encoding support has to be in the system: ZnEncoders(?) for UTF-8, Unicode and other encoders for locale support, because of the existing system basis, the web (UTF-8) and internationalization.
- this will be transparent: Smalltalk is a dynamically typed language, so one does not have to specify the String type (ByteString or WideString)
- Spur's new character encoding is very interesting (in alpha stage now) and will be transparent for the same reason (though I'm still wondering about the 32-bit/64-bit encoding on a 64-bit VM).
- ByteString is needed for most european/occidental people who don't care about internationalization and should stay because Pharo is an european/english based system , also not to break existing code
(again will be transparent to most users for same reason).
- some parts of the system seem not to be "WideString/Unicode aware":
Pasting a Greek string in a workspace shows hieroglyphs (editors? morph?), but the GT Inspector display is OK.
The font plugin seems to require little work: it uses fopen instead of _wfopen (the UTF-16 version) on Windows and needs UTF-8 conversion on Unix.
Maybe a check of the File plugin (and other system-related plugins) is also needed?
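
To make the font plugin point concrete, here is a rough sketch in Python (not Pharo or the actual plugin code) of the two conversions involved. The helper names are hypothetical; the only assumptions are that _wfopen expects a NUL-terminated UTF-16 path on Windows and that Unix file APIs take a NUL-terminated byte string, conventionally UTF-8.

```python
def path_for_unix(name: str) -> bytes:
    """Hypothetical helper: UTF-8 bytes plus a NUL terminator, as a C fopen would expect."""
    return name.encode("utf-8") + b"\x00"


def path_for_windows_wide(name: str) -> bytes:
    """Hypothetical helper: UTF-16 little-endian code units plus a 2-byte NUL, as _wfopen would expect."""
    return name.encode("utf-16-le") + b"\x00\x00"


# A font file name starting with GREEK CAPITAL LETTER OMEGA (U+03A9).
font_name = "\u03a9.ttf"
unix_path = path_for_unix(font_name)
wide_path = path_for_windows_wide(font_name)
```

The same name yields different byte sequences on each platform, which is why a plugin that passes one fixed encoding straight to fopen can fail on non-ASCII file names.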

In short, a little additional work is needed, but this is a very good basis.
(but what is leadChar about?)

Regards,

Alain




Re: Unicode support in Pharo

Sven Van Caekenberghe-2

On 29 Sep 2014, at 19:10, Robert Shiplett <[hidden email]> wrote:

> Must be a thing about European "guys" ...
>
> <quote>
>
> - ByteString is needed for most european/occidental people who don't care about internationalization and should stay because Pharo is an european/english based system , also not to break existing code
> (again will be transparent to most users for same reason).
>
> </unquote>
>
> Of all reasons to retain ByteString, let's hope this one is not listed, lest we appear ridiculous in our threads.

Apart from being incorrect, it is indeed a totally wrong formulation which furthermore gives a bad impression.

ByteString is simply an optimisation covering Strings where all characters use the lower 256 Unicode code points.
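
As a rough illustration of that optimisation (in Python rather than Pharo, with hypothetical helper names, and not how the Pharo classes are actually implemented), a string can be kept in a one-byte-per-character store whenever every code point is below 256, and in a wider store otherwise:

```python
def fits_in_byte_string(text: str) -> bool:
    """True when every character is one of the lower 256 Unicode code points."""
    return all(ord(ch) < 256 for ch in text)


def compact_storage(text: str):
    """Hypothetical helper: pick a 1-byte or 4-byte-per-character backing store."""
    if fits_in_byte_string(text):
        # Latin-1 maps code points 0..255 to the identical byte values.
        return "byte", text.encode("latin-1")
    # UTF-32 (big-endian, no BOM) uses a fixed 4 bytes per code point.
    return "wide", text.encode("utf-32-be")


# French word with U+00E9: all code points are below 256.
kind_fr, data_fr = compact_storage("\u00e9crit")
# Greek word: code points are above 255, so the wide store is needed.
kind_el, data_el = compact_storage("\u03bb\u03cc\u03b3\u03bf\u03c2")
```

Both strings hold five characters; the first takes 5 bytes, the second 20. The choice depends only on the code points present, not on who is using the system.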




Re: Unicode support in Pharo

Alain Rastoul-2
On 29/09/2014 19:15, Sven Van Caekenberghe wrote:

>
> On 29 Sep 2014, at 19:10, Robert Shiplett <[hidden email]> wrote:
>
>> Must be a thing about European "guys" ...
>>
>> <quote>
>>
>> - ByteString is needed for most european/occidental people who don't care about internationalization and should stay because Pharo is an european/english based system , also not to break existing code
>> (again will be transparent to most users for same reason).
>>
>> </unquote>
>>
>> Of all reasons to retain ByteString, let's hope this one is not listed, lest we appear ridiculous in our threads.
>
> Apart from being incorrect, it is indeed a totally wrong formulation which furthermore gives a bad impression.
>
> ByteString is simply an optimisation covering Strings where all characters use the lower 256 Unicode code points.
>

Indeed, there is something about "Europe": the ISO 8859 alphabet soup
http://czyborra.com/charsets/iso8859.html
which has been a pain for years (thank you, Unicode).
But you are right, this formulation is not good; yours is far better.
:)


