[squeak-dev] Accented character input


[squeak-dev] Accented character input

Yoshiki Ohshima-2
  Hello,

  We finally got around to attacking the accented character input problem
(mainly thanks to Hiroshima-san).  At least on OLPC, the "dead-key"
style accented character input seems to work fine with the new VM.

  However, when I switched the Gnome settings on my Fedora laptop (it has
a keyboard with the Japanese layout) to use the Spanish keyboard layout, and
switched the language to Spanish, my keyboard went into a strange state
and I needed to try some random key combinations to get to a situation
where the new VM works fine.

  I do have a feeling that on a properly configured Spanish Linux
install with a real Spanish keyboard and everything, you should
be able to type accented characters (and ' and `, etc.) in the way you
like.  Please test it and let us know how it goes.  Right now, you
have to compile the VM yourself... as soon as we are more
confident, Bert can make an RPM, and Ian will add this to the
mainstream Squeak VM.  (Or, making an RPM is almost free... Probably by
the time I wake up tomorrow, Bert may have a precompiled VM^^;)

  Sorry for taking so long to address the problem.

-- Yoshiki


[squeak-dev] Re: Accented character input

Yoshiki Ohshima-2
>   We finally got around to attacking the accented character input problem
> (mainly thanks to Hiroshima-san).  At least on OLPC, the "dead-key"
> style accented character input seems to work fine with the new VM.

  Oh, I forgot to mention that you have to use the latest OLPC etoys
image.  To try it, please get:

        http://tinlizzie.org/olpc/etoys-dev-3.0.zip
and
        http://tinlizzie.org/olpc/EtoysV3.sources

and check out the VM source code with SVN from:

        http://squeakvm.org/svn/squeak/branches/olpc/

-- Yoshiki


Re: [squeak-dev] Re: Accented character input

José Luis Redrejo
Well, it almost works.
Using a non-OLPC image it does not work at all (tested with 3.7, 3.8, 3.9 & 3.10).

Using an OLPC image, dead keys work, but other non-English characters do not: Spanish ñ, German ß, the euro symbol € (the US dollar $ does work ;-), etc.

I've also seen in the SVN commit that you've moved the Squeak3D & FileCopyPlugin plugins from internal to external. Why did you do that?

Anyway, this is the VM output with debugging enabled when typing ñ and €. You can see that for ñ the string returned looks wrong, as it contains two symbols, while for € it's correct. On the other hand, the ucs4 codes returned (241 & 8364) are correct.


X KeyPress      state 0x10 keycode 47
KeyPress window=stWindow
X mod 0 -> Sq mod 0 (default)
keycode 47
lookupKeys: 'ñ�'
x2sqKey XLookupBoth count 2
x2sqKey string 'ñ�' count 2
x2sqKey symbol 0x000000f1 => 0x000000f1
  2 pending key 16=0xc3
signalInputEvent
EVENT: key down  ` ' (195 = 0xc3) ucs4 0
signalInputEvent
EVENT: key char  ` ' (195 = 0xc3) ucs4 0
  1 pending key 15=0xb1
signalInputEvent
EVENT: key down  ` ' (177 = 0xb1) ucs4 0
signalInputEvent
EVENT: key char  ` ' (177 = 0xb1) ucs4 0
keyCode, ucs4: -1, 241
pressed, buffer: 0, 0
multi_key reset
keyCode, ucs4, multi_key_buffer: -1, 241, 0
signalInputEvent
EVENT: key down  ` ' (-1 = 0xffffffff) ucs4 241
signalInputEvent
EVENT: key char  ` ' (-1 = 0xffffffff) ucs4 241

X KeyRelease    state 0x10 keycode 47
KeyRelease window=stWindow
X mod 0 -> Sq mod 0 (default)
X mod 0 -> Sq mod 0 (default)

X KeyPress      state 0x10 keycode 113
KeyPress window=stWindow
X mod 0 -> Sq mod 0 (default)
keycode 113
lookupKeys: ''
x2sqKey XLookupKeySym
SYM fe03 -> -1
keyCode, ucs4: -1, 0
pressed, buffer: 0, 0
multi_key reset
keyCode, ucs4, multi_key_buffer: -1, 0, 0

X KeyPress      state 0x90 keycode 26
KeyPress window=stWindow
X mod 0 -> Sq mod 0 (default)
keycode 26
lookupKeys: '€'
x2sqKey XLookupBoth count 3
x2sqKey string '€' count 3
x2sqKey symbol 0x000020ac => 0x000020ac
  3 pending key 16=0xe2
signalInputEvent
EVENT: key down  ` ' (226 = 0xe2) ucs4 0
signalInputEvent
EVENT: key char  ` ' (226 = 0xe2) ucs4 0
  2 pending key 15=0x82
signalInputEvent
EVENT: key down  ` ' (130 = 0x82) ucs4 0
signalInputEvent
EVENT: key char  ` ' (130 = 0x82) ucs4 0
  1 pending key 14=0xac
signalInputEvent
EVENT: key down  ` ' (172 = 0xac) ucs4 0
signalInputEvent
EVENT: key char  ` ' (172 = 0xac) ucs4 0
keyCode, ucs4: -1, 8364
pressed, buffer: 0, 0
multi_key reset
keyCode, ucs4, multi_key_buffer: -1, 8364, 0
signalInputEvent
EVENT: key down  ` ' (-1 = 0xffffffff) ucs4 8364
signalInputEvent
EVENT: key char  ` ' (-1 = 0xffffffff) ucs4 8364

X KeyRelease    state 0x90 keycode 26
KeyRelease window=stWindow
X mod 0 -> Sq mod 0 (default)

X KeyRelease    state 0x90 keycode 113
KeyRelease window=stWindow
X mod 0 -> Sq mod 0 (default)
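
For reference, the "pending key" bytes in the log above can be compared against the standard UTF-8 encodings of the two characters, and the ucs4 values against their Unicode code points. The small check below uses Python purely for illustration; it is not part of the VM or the image, and the helper name is made up.

```python
# Compare the values seen in the debug log with standard UTF-8 / Unicode.
def utf8_bytes(ch):
    """Return the UTF-8 encoding of a single character as a list of ints."""
    return list(ch.encode("utf-8"))

for ch in ("\u00f1", "\u20ac"):  # ñ and €
    encoded = " ".join(hex(b) for b in utf8_bytes(ch))
    print(f"{ch!r}: code point {ord(ch)}, UTF-8 bytes {encoded}")
```

The bytes come out as 0xc3 0xb1 for ñ and 0xe2 0x82 0xac for €, which suggests that the two-byte string reported for ñ (count 2) is simply its UTF-8 form, just as the three-byte string is for €.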



Regards.
José L.








[squeak-dev] Re: Accented character input

Bert Freudenberg
In reply to this post by Yoshiki Ohshima-2
On Mar 20, 2008, at 6:26 , Yoshiki Ohshima wrote:

>  Hello,
>
>  We finally got around to attacking the accented character input problem
> (mainly thanks to Hiroshima-san).  At least on OLPC, the "dead-key"
> style accented character input seems to work fine with the new VM.
>
>  However, when I switched the Gnome settings on my Fedora laptop (it has
> a keyboard with the Japanese layout) to use the Spanish keyboard layout, and
> switched the language to Spanish, my keyboard went into a strange state
> and I needed to try some random key combinations to get to a situation
> where the new VM works fine.
>
>  I do have a feeling that on a properly configured Spanish Linux
> install with a real Spanish keyboard and everything, you should
> be able to type accented characters (and ' and `, etc.) in the way you
> like.  Please test it and let us know how it goes.  Right now, you
> have to compile the VM yourself... as soon as we are more
> confident, Bert can make an RPM, and Ian will add this to the
> mainstream Squeak VM.  (Or, making an RPM is almost free... Probably by
> the time I wake up tomorrow, Bert may have a precompiled VM^^;)


Hehe. Not until next week I guess. We need to switch the image rpm to  
3.0, too, because of the new DBus plugin. For that we need to rebuild  
the guides etc. And uploading that takes ages when I'm out of office  
(I'm almost off to a long Easter weekend).

- Bert -




Re: [squeak-dev] Re: Accented character input

José Luis Redrejo







A stupid check:
after adding the class UTF32InputInterpreter to the image (if it's not an OLPC image, which already has this class) and making sure that Latin1Environment>>inputInterpreterClass returns UTF32InputInterpreter, everything works with the current mainstream squeak-vm. None of the latest changes to the OLPC image are needed (at least for Spanish).

It's a very simple fix; I cannot understand why this method still returns MacRomanInputInterpreter nowadays.
Is there any reason to justify it?
Even in the OLPC image there is a long conditional to ensure MacRomanInputInterpreter is returned when Sugar is not present: am I missing something, or is there no justification for this behaviour?
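
To illustrate why the choice of interpreter class matters, here is a rough sketch in Python (not Smalltalk). It assumes, without having checked the image source, that a MacRoman-style interpreter treats the incoming key value as a MacRoman byte, while a UTF-32-style interpreter treats it as a Unicode code point. Both function names are hypothetical.

```python
def interpret_as_macroman(value):
    """Treat a key value (0-255) as a byte in the MacRoman encoding."""
    return bytes([value]).decode("mac_roman")

def interpret_as_utf32(value):
    """Treat a key value as a Unicode (UCS-4) code point."""
    return chr(value)

key_value = 241  # the ucs4 value the VM reports for ñ
print(interpret_as_utf32(key_value))     # ñ
print(interpret_as_macroman(key_value))  # Ò
```

Under that assumption the same value 241 comes out as ñ in one case and as Ò in the other, which would fit the symptom of non-English characters showing up wrong in images that still use the MacRoman interpreter.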

Regards.
José L.



[squeak-dev] Re: [Etoys] Accented character input

Walter Bender-3
In reply to this post by Bert Freudenberg
The OLPC Spanish keyboard layout uses dead_keys. The US International
keyboard uses Unicode combining characters. I am seriously thinking of
switching to dead_keys everywhere I can, as I think it makes it
easier for everyone. But I am now troubled by Yoshiki's report.

-walter




--
Walter Bender
One Laptop per Child
http://laptop.org


[squeak-dev] Re: [Etoys] Accented character input

Yoshiki Ohshima-2
  Walter,

> The OLPC Spanish keyboard layout uses dead_keys. The US International
> keyboard uses Unicode combining characters. I am seriously thinking of
> switching to dead_keys everywhere I can, as I think it makes it
> easier for everyone. But I am now troubled by Yoshiki's report.

  *Now* that Etoys supports both keyboards, more or less, it is OK for OLPC
(and for us) to switch to dead_keys everywhere.

  There are some other obscure input methods that GTK tends to support
but Etoys still doesn't.  But as long as you switch to dead_key or
multi_key or Unicode combining characters, we can do it without a
(big) problem.

-- Yoshiki


Re: [squeak-dev] Re: Accented character input

Yoshiki Ohshima-2
In reply to this post by José Luis Redrejo
  José,

  Thank you for checking!

> A stupid check:
> after adding the class UTF32InputInterpreter to the image (if it's not an OLPC image, which already has this class) and
> making sure that Latin1Environment>>inputInterpreterClass returns UTF32InputInterpreter, everything works with the current
> mainstream squeak-vm. None of the latest changes to the OLPC image are
> needed (at least for Spanish).

  Everything, including accented character input with "dead_keys"?
Hmm.

> It's a very simple fix; I cannot understand why this method still returns MacRomanInputInterpreter nowadays.
> Is there any reason to justify it?
> Even in the OLPC image there is a long conditional to ensure MacRomanInputInterpreter is returned when Sugar is not
> present: am I missing something, or is there no justification for
> this behaviour?

  With update 1925 to the etoys 3.0 stream, I ditched the "long"
conditional and now it simply returns UTF32InputInterpreter.

-- Yoshiki

Actually, we should take care of the case when somebody tries to run
the new image on an old VM (UTF32InputInterpreter should look at the
third entry in the event array when the sixth is zero...)
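
A minimal sketch of that fallback, written in Python rather than Smalltalk and with a hypothetical function name. It assumes a keyboard event array in which the third entry carries the legacy character code and the sixth carries the UCS-4 value (counting from 1, as in Smalltalk), and that an old VM leaves the sixth entry at zero. The other slots below are placeholders.

```python
def char_code_from_event(evt):
    """Pick the character value from a keyboard event array.

    evt is a sequence whose 3rd entry is the legacy key code and whose
    6th entry is the UCS-4 code point (0 when an old VM did not fill it).
    """
    ucs4 = evt[5]    # sixth entry (Python indexes from 0)
    legacy = evt[2]  # third entry
    return ucs4 if ucs4 != 0 else legacy

# Hypothetical events; only slots 3 and 6 are meaningful here.
new_vm_event = [2, 0, 0, 0, 0, 241, 0, 0]
old_vm_event = [2, 0, 241, 0, 0, 0, 0, 0]
print(char_code_from_event(new_vm_event), char_code_from_event(old_vm_event))
```

Whether the value found in the third entry on an old VM can be used as-is, or needs converting from another encoding first, is a separate question that this sketch does not address.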


Re: [squeak-dev] Re: Accented character input

José Luis Redrejo


2008/3/20, Yoshiki Ohshima <[hidden email]>:
>   José,
>
>   Thank you for checking!
>
>> A stupid check:
>> after adding the class UTF32InputInterpreter to the image (if it's not an OLPC image, which already has this class) and
>> making sure that Latin1Environment>>inputInterpreterClass returns UTF32InputInterpreter, everything works with the current
>> mainstream squeak-vm. None of the latest changes to the OLPC image are
>> needed (at least for Spanish).
>
>   Everything, including accented character input with "dead_keys"?
> Hmm.

Yes, but there is one problem: pressing the Alt or arrow keys produces a blank square.

>> It's a very simple fix; I cannot understand why this method still returns MacRomanInputInterpreter nowadays.
>> Is there any reason to justify it?
>> Even in the OLPC image there is a long conditional to ensure MacRomanInputInterpreter is returned when Sugar is not
>> present: am I missing something, or is there no justification for
>> this behaviour?
>
>   With update 1925 to the etoys 3.0 stream, I ditched the "long"
> conditional and now it simply returns UTF32InputInterpreter.
>
> -- Yoshiki
>
> Actually, we should take care of the case when somebody tries to run
> the new image on an old VM (UTF32InputInterpreter should look at the
> third entry in the event array when the sixth is zero...)

Yes, but the alternative is worse, so I think that's the best solution.

Regards.
José L.



Re: [squeak-dev] Re: Accented character input

Yoshiki Ohshima-2
  José,

>       Everything, including accented character input with "dead_keys"?
>     Hmm.
>
> Yes, but there is one problem: pressing the Alt or arrow keys
> produces a blank square.

  The blank square problem is fixed in the OLPC version and should be
merged to the trunk version.  But I'm still puzzled to hear that dead
keys work for you on that VM.  My experimental Spanish setting on my
Fedora might be wrong, but here is my reasoning:

  With a dead key ("'" for example), the first time you press it, it
doesn't give you the character, right?  Then, you can press "e" to get
e with an accent, or you can press "'" again to get a single "'".  This
means that there should be some state kept in the VM to remember the
previous character, but before the patch, there was no such thing.
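
The reasoning above can be sketched as a tiny state machine. This is only an illustration in Python with made-up names, not the VM patch itself: it keeps one "pending" dead key between key presses and uses Unicode NFC normalization to combine the accent with the next letter.

```python
import unicodedata

# Hypothetical mapping from a dead key to its Unicode combining mark.
DEAD_KEYS = {"'": "\u0301", "`": "\u0300"}  # acute, grave

class DeadKeyComposer:
    """Remembers a pending dead key between key presses."""

    def __init__(self):
        self.pending = None

    def press(self, key):
        """Return the text produced by this key press ('' if none yet)."""
        if self.pending is None:
            if key in DEAD_KEYS:
                self.pending = key  # remember it, emit nothing yet
                return ""
            return key
        dead, self.pending = self.pending, None
        if key == dead:
            return dead             # dead key twice -> the plain character
        composed = unicodedata.normalize("NFC", key + DEAD_KEYS[dead])
        if len(composed) == 1:
            return composed         # e.g. ' then e -> é
        return dead + key           # no precomposed form: emit both keys

composer = DeadKeyComposer()
typed = "".join(composer.press(k) for k in ["'", "e", "'", "'", "x"])
print(typed)  # é'x
```

The point is only that something has to hold `pending` between two key events; whether that something is the VM, the image, or an input method sitting in front of the VM is exactly the open question here.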

  Maybe there is some input method in front of the VM that processes
the keyboard input first (as Japanese has)?

>     Actually, we should take care of the case when somebody tries to run
>     the new image on an old VM (UTF32InputInterpreter should look at the
>     third entry in the event array when the sixth is zero...)
>
> Yes, but the alternative is worse, so I think that's the best solution.

  Sorry but what was "the alternative"?

  Thank you!

-- Yoshiki


Re: [squeak-dev] Re: Accented character input

José Luis Redrejo


2008/3/20, Yoshiki Ohshima <[hidden email]>:
>   José,
>
>>       Everything, including accented character input with "dead_keys"?
>>     Hmm.
>>
>> Yes, but there is one problem: pressing the Alt or arrow keys
>> produces a blank square.
>
>   The blank square problem is fixed in the OLPC version and should be
> merged to the trunk version.

Yes, but the OLPC version loses the non-English, non-dead-key characters such as ñ, ß, etc., whereas with the image patch they work.

>   But I'm still puzzled to hear that dead
> keys work for you on that VM.  My experimental Spanish setting on my
> Fedora might be wrong, but here is my reasoning:
>
>   With a dead key ("'" for example), the first time you press it, it
> doesn't give you the character, right?  Then, you can press "e" to get
> e with an accent, or you can press "'" again to get a single "'".  This
> means that there should be some state kept in the VM to remember the
> previous character, but before the patch, there was no such thing.
>
>   Maybe there is some input method in front of the VM that processes
> the keyboard input first (as Japanese has)?

Hmm, if you mean in the X libraries called by the VM, I think that's the case. In fact, it works that way whenever x2sqKeyInput is used instead of x2sqKeyPlain in vm-display-X11/sqUnixX11.c. That is, if you start the VM with -nointl, or your locale is not UTF-8, you get x2sqKey= x2sqKeyPlain instead of x2sqKey= x2sqKeyInput, and then the accent and the vowel arrive separately. With x2sqKeyInput you get only one keycode. It has always worked that way; the problem was that the image interpreted the keycode wrongly, because it was using MacRomanInputInterpreter instead of the UTF32InputInterpreter class.
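
As a rough model of the selection rule described above (in Python, not the actual C from sqUnixX11.c): the function name, its parameters, and the way the locale string is tested are all assumptions made for illustration. Only the two handler names and the -nointl / UTF-8 condition come from the description.

```python
def choose_key_handler(nointl, locale_name):
    """Return the name of the key handler the rule above would select.

    nointl      -- True if the VM was started with -nointl
    locale_name -- e.g. "es_ES.UTF-8"; how the real VM inspects the
                   locale is not shown in this thread, so this is a guess
    """
    is_utf8 = locale_name.upper().replace("-", "").endswith("UTF8")
    if nointl or not is_utf8:
        return "x2sqKeyPlain"  # accent and vowel arrive separately
    return "x2sqKeyInput"      # one composed keycode

print(choose_key_handler(False, "es_ES.UTF-8"))
print(choose_key_handler(True, "es_ES.UTF-8"))
```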

>>     Actually, we should take care of the case when somebody tries to run
>>     the new image on an old VM (UTF32InputInterpreter should look at the
>>     third entry in the event array when the sixth is zero...)
>>
>> Yes, but the alternative is worse, so I think that's the best solution.
>
>   Sorry but what was "the alternative"?

Having a new VM that does not handle typing non-English text correctly with either old or new images. With this change it will at least work for typing with new images.

Regards.
José L.



Re: [squeak-dev] Re: Accented character input

Yoshiki Ohshima-2
  José,

> Hmm, if you mean in the X libraries called by the VM, I think that's the case. In fact, it works that way whenever
> x2sqKeyInput is used instead of x2sqKeyPlain in vm-display-X11/sqUnixX11.c. That is, if you start the VM with -nointl, or
> your locale is not UTF-8, you get x2sqKey= x2sqKeyPlain instead of x2sqKey= x2sqKeyInput, and then the
> accent and the vowel arrive separately. With x2sqKeyInput you get only one keycode. It has always worked that way; the
> problem was that the image interpreted the keycode wrongly, because it was using MacRomanInputInterpreter
> instead of the UTF32InputInterpreter class.

  Wow, did I waste several hours solving the wrong problem?  It is
*still* a mystery to me, though, that Etoys on the Spanish OLPC has been running in UTF-8
and without -nointl, and yet it had not been working.  More to come...

  Thank you!

-- Yoshiki


Re: [squeak-dev] Re: Accented character input

Karl-19
Yoshiki Ohshima wrote:

>   José,
>
>  
>> Hmm, if you mean in the X libraries called by the VM, I think that's the case. In fact, it works that way whenever
>> x2sqKeyInput is used instead of x2sqKeyPlain in vm-display-X11/sqUnixX11.c. That is, if you start the VM with -nointl, or
>> your locale is not UTF-8, you get x2sqKey= x2sqKeyPlain instead of x2sqKey= x2sqKeyInput, and then the
>> accent and the vowel arrive separately. With x2sqKeyInput you get only one keycode. It has always worked that way; the
>> problem was that the image interpreted the keycode wrongly, because it was using MacRomanInputInterpreter
>> instead of the UTF32InputInterpreter class.
>>
>
>   Wow, did I waste several hours solving the wrong problem?
Been there, done that ;-)

Karl

>  It is
> *still* a mystery to me, though, that Etoys on the Spanish OLPC has been running in UTF-8
> and without -nointl, and yet it had not been working.  More to come...
>
>   Thank you!
>
> -- Yoshiki
>
>
>  



Re: [squeak-dev] Re: Accented character input

José Luis Redrejo
In reply to this post by Yoshiki Ohshima-2


2008/3/21, Yoshiki Ohshima <[hidden email]>:
>   José,
>
>> Hmm, if you mean in the X libraries called by the VM, I think that's the case. In fact, it works that way whenever
>> x2sqKeyInput is used instead of x2sqKeyPlain in vm-display-X11/sqUnixX11.c. That is, if you start the VM with -nointl, or
>> your locale is not UTF-8, you get x2sqKey= x2sqKeyPlain instead of x2sqKey= x2sqKeyInput, and then the
>> accent and the vowel arrive separately. With x2sqKeyInput you get only one keycode. It has always worked that way; the
>> problem was that the image interpreted the keycode wrongly, because it was using MacRomanInputInterpreter
>> instead of the UTF32InputInterpreter class.
>
>   Wow, did I waste several hours solving the wrong problem?  It is
> *still* a mystery to me, though, that Etoys on the Spanish OLPC has been running in UTF-8
> and without -nointl, and yet it had not been working.  More to come...


Well, I have a borrowed OLPC on my desktop. I have run these tests:
- With the en_US locale, there is no way to type Spanish accents or characters.
- After changing .i18n to es_US.UTF-8, Squeak starts in Spanish but the keyboard is still English; I still cannot type any accents.
- After changing /etc/sysconfig/keyboard to use the Spanish keyboard, I can type ñ, but typing an accent gives a square followed by the vowel, instead of the accented vowel.
- The About dialog on this system reports: etoys2.2, latest update #1796

But on my laptop, with the Spanish locale and the change in Latin1Environment>>inputInterpreterClass, it works perfectly with etoys2.2, update #1916. So I copied this image to the OLPC, and there it fails again: I can still type ñ, but accents give two characters.

So it seems the olpc-vm has something different from the mainstream VM that makes it fail in this case. But I'm afraid that by applying the changes you mentioned at the beginning of this thread you lose the ñ, €, ß, etc. characters, so I think it's better to investigate which differences between the two VMs make it fail on the olpc-vm.
If you need more details about the VM I'm using, it is exactly this one:
http://packages.debian.org/sid/squeak-vm
on a 64-bit machine.
I.e., it's a mainstream SVN VM with some patches to fix 64-bit issues and similar things, but nothing related to input handling.

Hope this helps.
José L.