UTF-8 in Squeak (Linux, X11)

Matej Kosik-2
Hello,

I have seen that others were able to enter UTF-8 text in
Squeak. Does this work on Linux? I am able to enter UTF-8 characters
now, but they do not appear correctly. Is there something special that
must be done (in the out-of-the-box image)? Some special command line
parameters? Should I load some special packages? Should I load some
special fonts? (I have tried FreeMono; it did not help.) I am using
Squeak image 3.8 and Squeak VM 3.9.7 with UTF-8 under X11, where I need to
deal with non-English characters.

For example, I entered this string:

        ťažisko

(in English, "centre of inertia") into a Workspace and I saw bad
characters. You can see the result in the attached figure. Do you do
something special so that you can enter and see non-English characters? I
would like to test it :). What is the official way?

Thank you.
--
Matej Kosik



[Attachments: Workspace.png (6K), signature.asc (264 bytes)]

Re: UTF-8 in Squeak (Linux, X11)

Xinyu Liu
Hi,

I don't know if this will work, but you can try
World menu -> help... -> set language...
and then click the language you want to use.

Regards.
Yours,
Liu.

Re: UTF-8 in Squeak (Linux, X11)

tgkuo
Hi Liu

On Tue, 8 Aug 2006 09:29:53 +0800, you wrote:
>Hi,
>
>I don't know if this will work, you can try:
>(World menu->help...->set language...)
>Then click the right language you want to use.
>
   Can you get the right language for Chinese? Is it available
somewhere?

Best regards.

Tsun

Re: UTF-8 in Squeak (Linux, X11)

Xinyu Liu
Hi,

The standard version available at http://www.squeak.org/ only supports "Deutsch, English, Espanol, French, Polish" in the set language list.

Some Asia-Pacific language versions are also available. Yoshiki has built one which supports Chinese input and display. We are now translating the menus, titles, labels, etc. into Chinese. (There are nearly 5000 items that need to be translated.) If you are interested in it, please have a look at:
http://liuxinyu95.googlepages.com/squeak.dev.chinese
(I am sorry that the page is in Chinese.)

Here is a screen shot:
http://liuxinyu95.googlepages.com/squeak_chn_shot.PNG

Cheers.
Yours,
Liu.


Re: UTF-8 in Squeak (Linux, X11) - Chinese

Brent Pinkney
ni hao,

When we were looking to support Chinese characters (as class names, not just labels), we found
we needed to implement this method in 3.8.1:

SimplifiedChineseEnvironment class >> leadingChar

        ^ 0
!

Could you comment?

Brent

Re: UTF-8 in Squeak (Linux, X11) - Chinese

Xinyu Liu
Hi Brent,

I am also a newbie to Squeak. I think leadingChar should return 6, not 0.
You can refer to:
http://www.is.titech.ac.jp/~ohshima/squeak/m17npaper/node10.html
It is written by Ohshima and Abe.

If it returns 0, a method such as the following will treat the character as Latin-1, not as Chinese:
Character >> asUnicode
    | table charset v |
    self leadingChar = 0 ifTrue: [^ value].        "<---- here It will be treated as Latin char"
    "..snip..."
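
A rough sketch of why the tag matters, written in Python rather than Squeak. It assumes, as my reading of the m17n design, that a character value carries the leadingChar tag in the bits above a 22-bit code; the helper names and the sample code value are hypothetical, and what happens for a non-zero tag is only what the snipped part of asUnicode would decide.

```python
# Illustration only, not Squeak source. Assumed layout: the language tag
# (leadingChar) sits above a 22-bit character code.
CODE_BITS = 22
CODE_MASK = (1 << CODE_BITS) - 1

def make_char(leading_char: int, code: int) -> int:
    """Pack a language tag and a character code into one integer."""
    return (leading_char << CODE_BITS) | (code & CODE_MASK)

def leading_char(value: int) -> int:
    """Extract the language tag from a packed value."""
    return value >> CODE_BITS

def char_code(value: int) -> int:
    """Extract the character code from a packed value."""
    return value & CODE_MASK

def returned_unchanged(value: int) -> bool:
    """Mirrors the first line of asUnicode: with tag 0 the value itself
    is returned, so no language-specific handling takes place."""
    return leading_char(value) == 0

tagged = make_char(6, 0x4E2D)     # 6: the tag suggested above for Simplified Chinese
untagged = make_char(0, 0x4E2D)   # 0: takes the early-return path

print(leading_char(tagged), hex(char_code(tagged)))
print(returned_unchanged(tagged), returned_unchanged(untagged))
```

With tag 0 the packed value is just the code itself, which is why such a character is indistinguishable from one on the Latin-1 path.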

I tried using Chinese characters in a class name, and it works. Please see the attached snapshot.

To Yoshiki-san:
When I tried Chinese class names and method names, I found it really great! It will remove
the last barrier for kids to understand the scripts.

Have a nice day.
Yours,
Liu.

[Attachment: Chinese_class.PNG (24K)]

Re: UTF-8 in Squeak (Linux, X11)

Yoshiki Ohshima
In reply to this post by Matej Kosik-2
  Matej,

> I have seen that others were able to enter UTF-8 text in
> Squeak. Does this work on Linux? I am able to enter UTF-8 characters
> now, but they do not appear correctly. Is there something special that
> must be done (in the out-of-the-box image)? Some special command line
> parameters? Should I load some special packages? Should I load some
> special fonts? (I have tried FreeMono; it did not help.) I am using
> Squeak image 3.8 and Squeak VM 3.9.7 with UTF-8 under X11, where I need to
> deal with non-English characters.

  The basic idea is to configure the VM so that it passes the input
characters with no information lost (but possibly with recoding), and
the image then re-interprets them and makes up characters with the right
code point.

  A suggested way is to configure the VM so that it doesn't do any
conversion, and then hook up a UTF-8 text converter in a way similar to
what WinShiftJISInputInterpreter does.
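
  To make the idea concrete, here is a small sketch in Python (not Squeak code; the function names are made up for illustration). It assumes the input layer delivers each UTF-8 byte as a separate Latin-1 character, which is one way the "bad characters" symptom can arise. It shows that nothing is lost as long as the bytes are passed through unchanged: re-decoding them as UTF-8 recovers the original text.

```python
# Illustration only: plain Python, not the Squeak VM or image code.
# Assumption: the input layer hands over raw UTF-8 bytes, one byte per
# "character", without converting them.

def bytes_seen_as_latin1(text: str) -> str:
    """Simulate input whose UTF-8 bytes are each read as a Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")

def reinterpret_as_utf8(garbled: str) -> str:
    """Recover the intended text by treating each character as one byte
    and decoding the resulting byte sequence as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")

original = "ťažisko"
garbled = bytes_seen_as_latin1(original)
recovered = reinterpret_as_utf8(garbled)

print(len(original), len(garbled))   # 7 9: two letters take two bytes each
print(recovered == original)         # True
```

  The round trip only works because no byte was dropped or altered on the way in, which is the reason for asking the VM not to convert anything itself.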

  Hope this helps...

-- Yoshiki

Re: UTF-8 in Squeak (Linux, X11)

Yoshiki Ohshima
In reply to this post by Xinyu Liu
  Xinyu,

> Some Asia-Pacific language versions are also available. Yoshiki has built one which supports Chinese input and display.
> We are now translating the menus, titles, labels, etc. into Chinese. (There are nearly 5000 items that need to be
> translated.) If you are interested in it, please have a look at:
> http://liuxinyu95.googlepages.com/squeak.dev.chinese
> (I am sorry that it is a page in Chinese)

  Looks good.  As I said, not all of these 5,000 are used.  One thing
we could do is something like:

  * Do 'r-unsed' (remove unused) in the language editor.
  * File in the fairly complete Japanese translation.
  * Then translate all 'untranslated' items.

The problem is that the untranslated items include 1) phrases used only
in external or non-existent packages and 2) phrases
that are "synthesized" at runtime.  The remove-unused feature removes
both 1 and 2, although the phrases in 2 are still needed.  The Japanese
translation would be a good source for supplying the needed ones.

  The above is just an idea and not proven to work, but I think it is
generally a good one.  It could cut the number of phrases by 1,000 or
so.  (On the other hand, if you don't think that removing 1,000 out
of 5,000 is significant, you can just go ahead and do them all.  It might
save some future work.)
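
  The three steps can be modelled with plain sets. This is a Python sketch of the bookkeeping only, not the language editor's code; the function name and every phrase in it are made up, and it assumes that filing in a reference translation simply adds that translation's phrases back to the list.

```python
# Illustration only: a set-based model of the three steps, not the
# language editor's actual behaviour. All phrases below are invented.

def phrases_to_translate(all_phrases, statically_used, reference_keys, translated):
    kept = set(all_phrases) & set(statically_used)   # step 1: remove unused
    kept |= set(reference_keys)                      # step 2: file in the reference translation
    return sorted(kept - set(translated))            # step 3: what is still untranslated

all_phrases = {"open", "save", "old plugin menu", "color is %1"}
statically_used = {"open", "save"}          # phrases found by searching the image
reference_keys = {"open", "color is %1"}    # phrases present in the reference translation
translated = {"open"}                       # phrases already translated

print(phrases_to_translate(all_phrases, statically_used, reference_keys, translated))
# ['color is %1', 'save']
```

  In this toy data "color is %1" stands for a phrase built at runtime: step 1 drops it, and step 2 brings it back because the reference translation contains it.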

-- Yoshiki

Re: UTF-8 in Squeak (Linux, X11)

Xinyu Liu
Hi, Yoshiki,

Thank you! Yes, as you said, I also found that the "r-unsed" items are nearly double the number of used items.
When I clicked the "where" button in the language editor, it showed an empty list. I also browsed the Japanese
version. (Comparing three languages sometimes gives me a deeper understanding of the real meaning :).) It covers
the most important parts. I think it is a good sample of a localized translation.

Have a nice day.
Yours,
Liu.
