Issue 3554 in pharo: CP1252TextConverter new does not work

pharo
Status: Accepted
Owner: [hidden email]
Labels: Milestone-1.2

New issue 3554 by [hidden email]: CP1252TextConverter new does not  
work
http://code.google.com/p/pharo/issues/detail?id=3554

When I execute
        CP1252TextConverter new
in 1.1, it works. In 1.2 I get a DNU: charCode.


CP1252TextConverter>>nextPut: aCharacter toStream: aStream
        aStream isBinary ifTrue: [^ aCharacter storeBinaryOn: aStream].
        aCharacter charCode < 128
                ifTrue: [aStream basicNextPut: aCharacter]
                ifFalse: [aStream basicNextPut: ((Character value: (self fromSqueak: aCharacter) charCode))].

aCharacter = $^
(self fromSqueak: aCharacter) = 128


Re: Issue 3554 in pharo: CP1252TextConverter new does not work

pharo
Updates:
        Status: Closed

Comment #1 on issue 3554 by [hidden email]: CP1252TextConverter new  
does not work
http://code.google.com/p/pharo/issues/detail?id=3554

Apparently this comes from Soup and not from Pharo.

I fixed it this way:


nextPut: aCharacter toStream: aStream
        aStream isBinary ifTrue: [^ aCharacter storeBinaryOn: aStream].
        aCharacter charCode < 128
                ifTrue: [aStream basicNextPut: aCharacter]
                ifFalse: [aStream basicNextPut: ((Character value: (self fromSqueak: aCharacter)))].

But I am not sure this is the right fix.


Re: Issue 3554 in pharo: CP1252TextConverter new does not work

pharo

Comment #2 on issue 3554 by [hidden email]: CP1252TextConverter new does  
not work
http://code.google.com/p/pharo/issues/detail?id=3554

That's not the "right" way to do it anymore, ever since the ByteTextConverter
superclass was introduced. (AFAICT that was before 1.0, but after the Soup
implementation.)

The only methods subclasses should implement are class-side methods:
byteToUnicodeSpec
encodingNames
(languageEnvironment)

I'd suggest updating the Soup package to include the implementation found  
in Core, which works in all versions of Pharo.
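
A minimal sketch of what such a subclass might look like, written as a hypothetical SketchCP1252TextConverter so it cannot clash with the class in Core. The selectors byteToUnicodeSpec and encodingNames are the ones named above. The shape of the spec (128 entries for bytes 16r80 to 16rFF, with nil for unassigned bytes), the 'cp-1252' name, and the category name are assumptions, so check them against the Core implementation:

```smalltalk
"Hypothetical subclass; the real CP1252TextConverter lives in Core."
ByteTextConverter subclass: #SketchCP1252TextConverter
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Sketch-Encodings'.

SketchCP1252TextConverter class >> encodingNames
	"Names under which this converter is looked up (assumed value)."
	^ #('cp-1252') copy

SketchCP1252TextConverter class >> byteToUnicodeSpec
	"Unicode code points for bytes 16r80-16rFF.
	 16r80-16r9F are the Windows-1252 additions (nil = unassigned, an assumption);
	 16rA0-16rFF are the same as Latin-1."
	^ #(	16r20AC nil 16r201A 16r0192 16r201E 16r2026 16r2020 16r2021
		16r02C6 16r2030 16r0160 16r2039 16r0152 nil 16r017D nil
		nil 16r2018 16r2019 16r201C 16r201D 16r2022 16r2013 16r2014
		16r02DC 16r2122 16r0161 16r203A 16r0153 nil 16r017E 16r0178 )
		, (16rA0 to: 16rFF) asArray
```

Depending on the image, the class may need to be initialized explicitly after it is defined so that its tables get built.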



Re: Issue 3554 in pharo: CP1252TextConverter new does not work

Levente Uzonyi-2
On Thu, 20 Jan 2011, [hidden email] wrote:

>
> Comment #2 on issue 3554 by [hidden email]: CP1252TextConverter new does
> not work
> http://code.google.com/p/pharo/issues/detail?id=3554
>
> That's not the "right" way to do it anymore, ever since ByteTextConverter
> superclass was introduced. (AFAICT that was before 1.0, but after the soup
> implementation)
>
> The only methods subclasses should implement are class-side methods:
> byteToUnicodeSpec
> encodingNames
> (languageEnvironment)
>
> I'd suggest updating the Soup package to include the implementation found in
> Core, which works in all versions of Pharo.

Will it work in Squeak? I'm asking because we're using it in one of our
projects and I'll be unhappy if it becomes incompatible with Squeak the
way SmaCC, OB, eCompletion, etc. did.


Levente

Re: Issue 3554 in pharo: CP1252TextConverter new does not work

pharo
In reply to this post by pharo

Comment #3 on issue 3554 by [hidden email]: CP1252TextConverter new  
does not work
http://code.google.com/p/pharo/issues/detail?id=3554

Thanks Henrik, this is what I saw.
Do you know if there is a difference between CP1250 and CP1252? Soup used CP1252.
I have now made the tests pass with CP1250, but the parser has problems on real
code, so the tests are not good enough. ;(



Re: Issue 3554 in pharo: CP1252TextConverter new does not work

pharo

Comment #4 on issue 3554 by [hidden email]: CP1252TextConverter new does  
not work
http://code.google.com/p/pharo/issues/detail?id=3554

There's a difference, otherwise they wouldn't have different names :)
http://aspell.net/charsets/codepages.html

The Unicode mappings are in the download links.

Basically, 1252 is a Latin-1 superset, while 1250 is (almost) a Latin-2
superset.
Both assign printable characters to the range (16r80 - 16r9F) that the
Latin sets reserve for C1 control characters.
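
To make the difference concrete, here is a small workspace sketch that compares a few sample bytes. The two dictionaries are hand-copied excerpts of the published code page tables, not the converters' own tables, and the variable names are only illustrative:

```smalltalk
| cp1252 cp1250 |
"Byte -> Unicode code point, for a handful of sample bytes only."
cp1252 := Dictionary new.
cp1252
	at: 16r80 put: 16r20AC;		"euro sign"
	at: 16r8C put: 16r0152;		"Latin capital ligature OE"
	at: 16rE6 put: 16r00E6;		"Latin small letter ae"
	at: 16rF8 put: 16r00F8.		"Latin small letter o with stroke"
cp1250 := Dictionary new.
cp1250
	at: 16r80 put: 16r20AC;		"euro sign"
	at: 16r8C put: 16r015A;		"Latin capital letter S with acute"
	at: 16rE6 put: 16r0107;		"Latin small letter c with acute"
	at: 16rF8 put: 16r0159.		"Latin small letter r with caron"
"Show, for each byte, the code point each code page assigns to it."
#(16r80 16r8C 16rE6 16rF8) do: [:byte |
	Transcript
		show: 'byte 16r', (byte printStringBase: 16);
		show: '  cp1252 -> U+', ((cp1252 at: byte) printStringBase: 16);
		show: '  cp1250 -> U+', ((cp1250 at: byte) printStringBase: 16);
		cr]
```

Byte 16r80 is the euro sign in both code pages, while 16r8C, 16rE6 and 16rF8 are Œ, æ and ø in 1252 but Ś, ć and ř in 1250.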


Re: Issue 3554 in pharo: CP1252TextConverter new does not work

Henrik Sperre Johansen
In reply to this post by Levente Uzonyi-2

On Jan 20, 2011, at 2:31:59 AM, Levente Uzonyi wrote:

> On Thu, 20 Jan 2011, [hidden email] wrote:
>
>>
>> Comment #2 on issue 3554 by [hidden email]: CP1252TextConverter new does not work
>> http://code.google.com/p/pharo/issues/detail?id=3554
>>
>> That's not the "right" way to do it anymore, ever since ByteTextConverter superclass was introduced. (AFAICT that was before 1.0, but after the soup implementation)
>>
>> The only methods subclasses should implement are class-side methods:
>> byteToUnicodeSpec
>> encodingNames
>> (languageEnvironment)
>>
>> I'd suggest updating the Soup package to include the implementation found in Core, which works in all versions of Pharo.
>
> Will it work in Squeak? I'm asking because we're using it in one of our projects and I'll be unhappy if it becomes incompatible with Squeak the way SmaCC, OB, eCompletion, etc. did.
>
>
> Levente

Almost...
In Squeak, the class variables hold all 256 entries instead of just the non-ASCII range, and the initialization logic was also changed a bit when ByteTextConverter was introduced there :/

It should work with an additional method:
CP1252TextConverter class >> initializeDecodeTable
        ^decodeTable := (0 to: 127) , self byteToUnicodeSpec

(decodeTable would be undefined in Pharo, but the method would never be called there)

and changing the initialize method, unless you want to implement initializeTables in ByteTextConverter as:

initializeTables
        self initializeDecodeTable; initializeEncodeTable; initializeLatin1MapAndEncodings

Looking at the initialize implementations in current Squeak subclasses, that might not be such a bad thing :)

Cheers,
Henry


PS. Is there a bug in CP1250 in Squeak, or is there some hack I don't know about?
AFAIK, € (U+20AC) should be at 128...

Re: Issue 3554 in pharo: CP1252TextConverter new does not work

Stéphane Ducasse
In reply to this post by Levente Uzonyi-2
I'm making a configuration so that I get a version that runs with green tests in 1.2.
You can still use the old one.

Stef

Re: Issue 3554 in pharo: CP1252TextConverter new does not work

Stéphane Ducasse
In reply to this post by Henrik Sperre Johansen
Do we need 1252 and 1250? What is the difference? I saw that 1253 is for Greek.

Right now I am changing Soup to work with 1250 instead of an old, bogus 1252.

Stef

Re: Issue 3554 in pharo: CP1252TextConverter new does not work

Henrik Sperre Johansen

On Jan 20, 2011, at 10:59:10 AM, Stéphane Ducasse wrote:

> Do we need 1252 and 1250?

Do we need converters for both? If we want to be able to read files saved in either character set, then yes.

> What is the difference? I saw 1253 is for greek

http://en.wikipedia.org/wiki/Cp1250
http://en.wikipedia.org/wiki/Cp1252

There are 5-line summaries at the top, and you can see the specific differences in the tables.

> Right now I change Soup to work with 1250 instead of an old bogus 1252
>
> Stef

I'm not sure what "work with" means in the context of Soup.
Encoding in 1250 "works" fine if the non-English text you're writing is mostly Polish, Hungarian, etc.
It's a bit less suited for, say, Norwegian or French, as it lacks characters like æ, ø, å and Œ.

Cheers,
Henry

Re: Issue 3554 in pharo: CP1252TextConverter new does not work

Stéphane Ducasse

On Jan 20, 2011, at 11:34 AM, Henrik Johansen wrote:

>
> On Jan 20, 2011, at 10:59 10AM, Stéphane Ducasse wrote:
>
>> Do we need 1252 and 1250?
>
> ? Do we need converters for both? If we want to be able to read files saved in either character set, then yes.

In fact I just checked: we already have them :)
The problem was just that Soup was loading an old version of the same class, which was not good.

>
>> What is the difference? I saw 1253 is for greek
>
> http://en.wikipedia.org/wiki/Cp1250
> http://en.wikipedia.org/wiki/Cp1252
>
> 5-line summaries at the top, you can see the specific differences in the tables.
>
>>
>> Right now I change Soup to work with 1250 instead of an old bogus 1252
>>
>> Stef
>
> I'm not sure what "work with" means in the context of Soup.

It means that you can parse the page.

> Encoding in 1250 "works" fine if the non-english text you're writing is mostly Polish/Hungarian etc.
> It's a bit less suited for say, norwegian or french, as it lacks characters like æ, ø, å and Œ,.

Thanks for the info.
