Ridiculous we are


LogiqueWerks

Re: Unicode support in Pharo

Must be a thing about European "guys" ...

<quote>

- ByteString is needed for most European/occidental people who don't care about internationalization and should stay because Pharo is a European/English-based system, and also so as not to break existing code
(again, this will be transparent to most users for the same reason).

</unquote>

Of all reasons to retain ByteString, let's hope this one is not listed, lest we appear ridiculous in our threads.

On 26 September 2014 18:22, Alain Rastoul <[hidden email]> wrote:

On 26/09/2014 20:47, stepharo wrote:

I'm not an expert and I would like to know what people think.
But I think that we should consider

- the impact of Spur's new object format. I would like to have
Unicode and clean up the leadChar

Stef

Just to start a new thread about that, because it deserves it
and also some people might appreciate :)
(was in: "Ridiculous we are").

I may not be the most qualified to comment on that, but here are my 2c
(this is a community, isn't it? Please do not flame...)

As Sven said, the situation is not so bad (quite good, in fact): Pharo has true built-in Unicode support with WideString and encoders (great job; IMHO this part of the Zn components should be in the base system
or in an internationalization package, not hidden in an HTTP components package).
- Encoding support has to be in the system: ZnEncoders(?) for UTF-8, Unicode and other encoders for locale support, because of the existing system basis, the web (UTF-8) and internationalization.
- This will be transparent: Smalltalk is a dynamically typed language, so one does not have to specify the String type (ByteString or WideString).
- Spur's new character encoding is very interesting (in alpha stage now) and will be transparent for the same reason (though I'm still wondering about the 32-bit/64-bit encoding on a 64-bit VM?).
- ByteString is needed for most European/occidental people who don't care about internationalization and should stay because Pharo is a European/English-based system, and also so as not to break existing code
(again, this will be transparent to most users for the same reason).
- Some parts of the system seem not to be "WideString/Unicode aware":
Pasting a Greek string into a workspace shows hieroglyphs (editors? morph?), but the GT Inspector display is OK.
The font plugin seems to require a little work: it uses fopen instead of _wfopen (the UTF-16 version) on Windows and needs UTF-8 conversion on Unix.
Maybe a check of the File plugin (and other system-related plugins) is also needed?
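
The fopen/_wfopen remark above comes down to which bytes a file name turns into before it reaches the C file API. The following is only a rough Python sketch of that idea, not the plugin's actual code: the function name `encode_path_for_platform` and the platform labels are hypothetical, and UTF-8 is assumed as the Unix file name encoding.

```python
def encode_path_for_platform(path: str, platform: str) -> bytes:
    """Encode a file path as the NUL-terminated bytes a C file API would receive.

    Hypothetical helper for illustration; `platform` is "windows" or anything else.
    """
    if platform == "windows":
        # _wfopen takes wchar_t*, which holds UTF-16 (little-endian) on Windows.
        return path.encode("utf-16-le") + b"\x00\x00"
    # On Unix, fopen takes char*; UTF-8 is assumed here as the file name encoding.
    return path.encode("utf-8") + b"\x00"


font_path = "\u03b1\u03b2.ttf"  # a Greek file name, "αβ.ttf"
print(encode_path_for_platform(font_path, "windows"))
print(encode_path_for_platform(font_path, "unix"))
```

Passing the same Greek name through a single-byte encoding instead would either fail or produce a different file name, which is the kind of mismatch described for the font plugin.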

In short, a little additional work, but a very good basis.
(but what is leadChar about?)

Regards,

Alain

Sven Van Caekenberghe-2

Re: Unicode support in Pharo

On 29 Sep 2014, at 19:10, Robert Shiplett <[hidden email]> wrote:

> Must be a thing about European "guys" ...
>
> <quote>
>
> - ByteString is needed for most European/occidental people who don't care about internationalization and should stay because Pharo is a European/English-based system, and also so as not to break existing code
> (again, this will be transparent to most users for the same reason).
>
> </unquote>
>
> Of all reasons to retain ByteString, let's hope this one is not listed, lest we appear ridiculous in our threads.

Apart from being incorrect, it is indeed a totally wrong formulation which furthermore gives a bad impression.

ByteString is simply an optimisation covering Strings where all characters use the lower 256 Unicode code points.
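
As an illustration of that point, here is a minimal Python sketch (not Pharo code) of choosing a storage class purely by code point range. The function names are hypothetical, and the 1-byte and 4-byte element widths are assumptions made for the sake of the example.

```python
def storage_class(text: str) -> str:
    """Name the string class that could hold `text`, chosen by code point range only."""
    if all(ord(ch) < 256 for ch in text):
        return "ByteString"  # every character fits in one byte
    return "WideString"      # at least one character needs a wider slot


def bytes_needed(text: str) -> int:
    """Storage for the characters alone, assuming 1-byte or 4-byte elements."""
    width = 1 if storage_class(text) == "ByteString" else 4
    return width * len(text)


for sample in ["cafe", "caf\u00e9", "\u20ac", "\u03bb\u03cc\u03b3\u03bf\u03c2"]:
    print(repr(sample), storage_class(sample), bytes_needed(sample))
```

Note that the euro sign (U+20AC) and Greek text already fall outside the lower 256 code points, so the split is about code point range, not about which languages are "European".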


Alain Rastoul-2

Re: Unicode support in Pharo

On 29/09/2014 19:15, Sven Van Caekenberghe wrote:

>
> On 29 Sep 2014, at 19:10, Robert Shiplett <[hidden email]> wrote:
>
>> Must be a thing about European "guys" ...
>>
>> <quote>
>>
>> - ByteString is needed for most European/occidental people who don't care about internationalization and should stay because Pharo is a European/English-based system, and also so as not to break existing code
>> (again, this will be transparent to most users for the same reason).
>>
>> </unquote>
>>
>> Of all reasons to retain ByteString, let's hope this one is not listed, lest we appear ridiculous in our threads.
>
> Apart from being incorrect, it is indeed a totally wrong formulation which furthermore gives a bad impression.
>
> ByteString is simply an optimisation covering Strings where all characters use the lower 256 Unicode code points.
>

Indeed, there is something about "Europe": the ISO alphabet soup
http://czyborra.com/charsets/iso8859.html
which has been a pain for years (thank you, Unicode).
But you are right, this formulation is not good; yours is far better.
:)
