About Squeak 4.1 Language Support

ericgao

Hi all,

 

Squeak 4.1 now supports Chinese at the code level. I tried to read the attached file, which contains Chinese characters, but they showed up as ??.

Note: I ran
(MultiByteFileStream fileNamed: 'file.txt') contents
in the workspace.
Does anyone know what is going wrong?
 

Thanks and Regards

Eric Gao




Attachment: file.txt (16 bytes)

Re: About Squeak 4.1 Language Support

K K Subbu
On Thursday 27 Jan 2011 3:22:37 pm Eric wrote:

> Now Squeak 4.1 has supported Chinese from Code level, I have tried to read
> a file containing Chinese character from attached file, it showed up ??
> for chinese character.
>
> Note: run
>
> (MultiByteFileStream fileNamed: 'file.txt') contents
> in the workspace
>
> Any one know some about it?

LanguageEnvironment defaultSystemConverter

determines the text converter used for reading text files for your environment.
Inspect
    (MultiByteFileStream fileNamed: 'file.txt') converter
and make sure it is UTF8TextConverter. If not, you may need to override the
converter before reading its contents. e.g.
        (MultiByteFileStream fileNamed: 'file.txt')
                converter: (UTF8TextConverter new);
                contents.

If the characters are being decoded correctly then you may need specific
language support (fonts, rendering engine etc.) for your dialect. See language
support classes in Multilingual-Languages for more details.
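
To tell an encoding problem from a font problem, it can help to compare the raw bytes of the file with the decoded character codes. The following is only a workspace sketch, assuming the file is named file.txt and sits in the default directory; the variable names are made up for illustration. As a reference point, the character 中 (U+4E2D) is stored as the bytes E4 B8 AD in UTF-8 and as D6 D0 in GB2312.

```smalltalk
"Workspace sketch (assumes 'file.txt' is in the default directory)."
| raw bytes stream text |

"1. Raw bytes, with no text conversion at all."
raw := StandardFileStream readOnlyFileNamed: 'file.txt'.
bytes := [raw binary; contents] ensure: [raw close].
(bytes asArray collect: [:b | b printStringRadix: 16]) inspect.

"2. Character codes after decoding with an explicit converter."
stream := MultiByteFileStream readOnlyFileNamed: 'file.txt'.
text := [stream converter: UTF8TextConverter new; contents]
	ensure: [stream close].
(text asArray collect: [:c | c charCode printStringRadix: 16]) inspect.
```

If the first inspector shows byte sequences such as E4 B8 AD, the file is UTF-8 and UTF8TextConverter is the right choice; sequences such as D6 D0 would point to a GB2312 file instead. If the second inspector shows the expected code points but the workspace still displays ??, the remaining problem is most likely fonts rather than decoding.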

Subbu


Fwd: [squeak-dev] About Squeak 4.1 Language Support

ericgao
FYI

---------- Forwarded message ----------
From: Eric <[hidden email]>
Date: 2011/1/28
Subject: Re: [squeak-dev] About Squeak 4.1 Language Support
To: kksubbu.ml@gmail.com


Thanks Subbu,
 
Following your suggestion, I tried the CNGBTextConverter that exists in Squeak 4.1. I assume it switches the language environment to SimplifiedChineseEnvironment. Am I right?

I tried the following code:
 (MultiByteFileStream fileNamed: 'file.txt')
               converter: (CNGBTextConverter new);
               contents.

It still doesn't work. Does that mean I need to import some fonts, or is there another way?

Another question: in Squeak 4.1, how can I make sure that the language environment I use is Chinese (or some other language)?

My understanding is that in Squeak 4.1 the default encoding is UTF-8 or Unicode. Where can I switch it?

I am just a beginner with Squeak Smalltalk, so please correct me if I have misunderstood anything.
 
Eric

Re: About Squeak 4.1 Language Support

K K Subbu
On Friday 28 Jan 2011 10:14:43 am Eric wrote:
> And it still doesn't work, does it mean i need to import some fonts for it
> or other ways?
Were you able to inspect the decoded character values? If the decoding is
right then the issue is with fonts. See isFontAvailable and installFont
methods in LanguageEnvironment to see how to load language-specific fonts on
startup. GreekEnvironment and JapaneseEnvironment classes depend on such
external font files.

> Also another question is in Squeak4.1 what could i do to make sure the the
> language environment I use is chinese or some others?
What is your locale setting? Inspect:
    Locale current localeID

SimplifiedChineseEnvironment>>supportedLanguages contains 'zh'

> My understanding now in Squeak4.1 that the default language is UTF8 or
> Unicode, where could i do the switch?
Unfortunately not, because of the need for backward compatibility across
multiple OSes and platforms. To select the right converter, use class side
methods in SimplifiedChineseEnvironment. See JapaneseEnvironment for help.
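
The checks above can be tried one expression at a time in a workspace ("print it" on each line). This is a rough sketch, not a recipe: the first three expressions appear elsewhere in this thread, while Locale class>>switchToID: and LocaleID class>>isoLanguage: are assumptions about what your image provides, so browse the Locale and LocaleID classes before relying on them.

```smalltalk
"Which locale is the image currently using?"
Locale current localeID.

"Which language environment would a Chinese locale map to?"
(Locale isoLanguage: 'zh') languageEnvironment.

"Which converter is used by default for text files?"
LanguageEnvironment defaultSystemConverter.

"Assumption: these two selectors exist in your image."
Locale switchToID: (LocaleID isoLanguage: 'zh').

"Check the result of the switch."
Locale current languageEnvironment.
```

Switching the locale this way does not change the converter of streams that are already open, so an explicit converter: on the stream may still be needed.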

> I am just a beginner on Squeak Smalltalk, please correct me if I make any
> mistakes on understanding
Generally, requests to this mailing list receive brief responses. If you need
more detail, you could try the beginners mailing list. Squeak needs people
like you to increase support for Chinese locales.

Subbu


Re: About Squeak 4.1 Language Support

ericgao
 
>Were you able to inspect the decoded character values? If the decoding is
>right then the issue is with fonts. See isFontAvailable and installFont
>methods in LanguageEnvironment to see how to load language-specific fonts on
>startup. GreekEnvironment and JapaneseEnvironment classes depend on such
>external font files.

The decoding is correct. For example, for the Chinese character '中' the UTF-8 decoding gives 25185837, but the display in the workspace is ??.
I inspected
 (SimplifiedChineseEnvironment new) isFontAvailable
and it returned false.

I then ran the following to install the font:
 (Locale isoLanguage: 'zh') languageEnvironment fontDownload.
There seems to be no font available on the net. Any suggestions on how I could get a font for Chinese?

I have also read the Chinese support code developed by Yo. It works on Squeak 3.8 and 3.9 but not on 4.1. I don't understand what is different about language support in Squeak 4.1.
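
The value 25185837 is consistent with a correct decoding if one assumes that Squeak keeps the Unicode code point in the low 22 bits of a character value and a language tag (the "leading char") in the bits above them. The sketch below only does the integer arithmetic; the 22-bit split and the reading of tag 6 as Simplified Chinese are assumptions about the image, not something confirmed in this thread.

```smalltalk
"Split a Squeak character value into language tag and code point.
 Assumption: the low 22 bits hold the code point."
| value codePoint leadingChar |
value := 25185837.
codePoint := value bitAnd: 16r3FFFFF.    "20013, i.e. 16r4E2D, the code point of 中"
leadingChar := value bitShift: -22.      "6, the language tag"
codePoint printStringRadix: 16.
```

Under these assumptions the character is decoded correctly and tagged with a language, so the ?? would come from the lack of a font for that language tag, which matches isFontAvailable returning false.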


Re: About Squeak 4.1 Language Support

ericgao
I found another strange thing.

When I inspect "TextStyle defaultFont" in Squeak 3 (3.8, 3.9, 3.10), it is displayed as a StrikeFontSet, but in Squeak 4 it is displayed as a StrikeFont.

Also, the input interpreter below works in 3.8 and 3.9 but not in 3.10. Any suggestions on how to debug this kind of method?

The input interpreter:

'From Squeak3.10.2 of ''5 June 2008'' [latest update: #7179] on 30 January 2011 at 4:58:28 pm'!
KeyboardInputInterpreter subclass: #WinGB2312InputInterpreter
 instanceVariableNames: 'converter'
 classVariableNames: ''
 poolDictionaries: ''
 category: 'Multilingual-TextConversion'!
!WinGB2312InputInterpreter methodsFor: 'all' stamp: '1 1/30/2011 16:55'!
initialize
 converter := CNGBTextConverter new.
! !
!WinGB2312InputInterpreter methodsFor: 'all' stamp: 'yo 6/20/2006 11:11'!
nextCharFrom: sensor firstEvt: evtBuf
 | firstCharacter secondCharacter peekEvent char1Value keyValue pressType type stream multiCharacter |
 keyValue := evtBuf third.
 pressType := evtBuf fourth.
 pressType = EventKeyDown ifTrue: [type := #keyDown].
 pressType = EventKeyUp ifTrue: [type := #keyUp].
 pressType = EventKeyChar ifTrue: [type := #keystroke].
 char1Value := (Character value: keyValue) macToSqueak asciiValue.
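 "Only bytes in 16rA1..16rF7 can start a two-byte GB2312 character; anything else is returned as a single character."
 (char1Value > 16rA0 and: [char1Value < 16rF8]) ifFalse: [
  ^ keyValue asCharacter.
 ].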
 peekEvent := sensor peekEvent.
 peekEvent ifNotNil: [
  pressType := peekEvent fourth.
 ].
 "peekEvent printString displayAt: [hidden email]."
 (peekEvent notNil and: [(peekEvent at: 4) = EventKeyDown])
  ifTrue: [sensor nextEvent.
   peekEvent := sensor peekEvent.
   peekEvent ifNotNil: [
    pressType := peekEvent fourth.
   ].
  ].
 (type = #keystroke
   and: [peekEvent notNil
     and: [(peekEvent at: 1)
        = EventTypeKeyboard
       and: [(peekEvent at: 4)
         = EventKeyChar]]])
  ifTrue: [
   firstCharacter := char1Value asCharacter.
   secondCharacter := (peekEvent at: 3) asCharacter macToSqueak.
   stream := ReadStream on: (String with: firstCharacter with: secondCharacter).
   multiCharacter := converter nextFromStream: stream.
   multiCharacter isCharacter ifTrue: [
    multiCharacter isOctetCharacter ifFalse: [ sensor nextEvent ].
   ].
   ^ multiCharacter.
  ].
 ^ keyValue asCharacter.
! !

Re: About Squeak 4.1 Language Support

K K Subbu
On Sunday 30 Jan 2011 11:44:16 am Eric wrote:
> and i run following to installFont:
>  (Locale isoLanguage: 'zh') languageEnvironment fontDownload.
> seems not font support on the net, any suggestion on how could i got the
> font for chinese?
You need to leave a FontSimplifiedChineseEnvironment.sar in the default
directory for Squeak to load it automatically.

Have you seen http://code.google.com/p/chinesesqueak/ ? I can't read Chinese
and have no idea of what those forums contain :-( but I did see some posters
including the SAR file for Chinese support.

HTH .. Subbu


Re: About Squeak 4.1 Language Support

ericgao
Yes, I have downloaded the code with
svn checkout http://chinesesqueak.googlecode.com/svn/
The code handles Chinese characters in Squeak 3.8 and 3.9, but with the same code 3.10 has problems with typed Chinese characters.

Also, Squeak 4.1 seems to have changed a lot from the earlier code base. Am I right?



Re: About Squeak 4.1 Language Support

K K Subbu
On Sunday 30 Jan 2011 10:23:46 pm Eric wrote:
> yes, i have downloaded the codes from
> svn checkout http://chinesesqueak.googlecode.com/svn/
> regarding the chinese character, the code is working for Squeak 3.8, and
> 3.9 but using these codes, 3.10 has problem on typed in chinese
> character.....
This is because of input method conflict between supporting legacy codes and
modern UTF-8 input codes across multiple platforms.

See SQ-554 and its discussion in etoys-dev for the background. You can adapt
the patch to make it work for your case (Squeak 4 on Windows?).

Subbu


Re: About Squeak 4.1 Language Support

ericgao
Thanks Subbu,
 
> This is because of input method conflict between supporting legacy codes and
> modern UTF-8 input codes across multiple platforms.
I don't quite understand the conflict you mentioned. Could you give an example?
> See SQ-554 and its discussion in etoys-dev for the background. You can adapt
> the patch to make it work for your case (Squeak 4 on Windows?).
Also, what is the defect report URL for SQ-554? Do you mean the patch in etoys-dev?
 
Eric




Re: About Squeak 4.1 Language Support

K K Subbu
On Tuesday 01 Feb 2011 3:40:10 pm Eric wrote:
> Thanks Subbu,
>
> >This is because of input method conflict between supporting legacy codes
> > and modern UTF-8 input codes across multiple platforms.
>  Don't very understand about the conflict you mentioned, could you give
> some example?
See the threads
 http://lists.squeakland.org/pipermail/etoys-dev/2010-October/005933.html
 http://forum.world.st/New-Win32-VM-m17n-testers-needed-td63730.html

In a multilingual setting, the language of input codes can be different from
what is implied by the locale setting (system language). So one can encounter
Chinese input in a Latin-1 locale. The image has methods to switch locales
manually but not converters :-(. The logic for arriving at language+encoding
is different on different platforms. Part of this logic lies in the VM and part
of it in the image. VM ports on Windows, Mac and Linux use different logic to
deal with multi-byte input methods, so the code in the image has to dance around
with VM host, locale setting and language codes to collect input.

If you are not likely to read files encoded in pre-Unicode Chinese codes and
you are just interested in getting Squeak to deal with Chinese input on Wintel
VM, just apply a patch to override all this logic for now. Hopefully, better
code will emerge as we get more feedback from Chinese users.


> >See SQ-554 and its discussion in etoys-dev for the background. You can
> > adapt the patch to make it work for your case (Squeak 4 on Windows?).
> Also for SQ-554, which defect report URL for this item?  Do you mean the
> patch in etoys-dev?
Yes. In my patch, I tried to get UTF-8 encoding thru the VM into Squeak where
I could decode uniformly. It worked on Linux VM but not on Win32. I don't have
Win32 development machines around, so I couldn't take it further.

Subbu