Hello, all.
Many modern Linux distros now use a UTF-8 locale as the default setting, and the stock unix VM has never seemed to handle Unicode keyboard input under this locale. Are there any plans to do it properly, or are Linux users supposed to fix the VM themselves (by beating their heads against this thread, for example: http://www.nabble.com/Unix-UTF8-input-td11050488.html )? Maybe someone already has a 'proper' VM - it is hard to believe that such a drawback was not fixed long ago. cheers, Danil |
I second Danil's question. Is there anyone willing to dig into that problem? I am willing to help as much as I can, but I don't know much about VM or Linux internals. I jumped over C directly to Smalltalk, you know ... :) Janko -- Janko Mivšek AIDA/Web Smalltalk Web Application Server http://www.aidaweb.si |
Heh, the silence (I hope I don't look too impatient - this topic has been recurring for years now). Janko, let's face the fact that we are losers with the wrong VM-platform/language combination :)
I suspect that the OLPC team has somehow addressed the issue (it is Linux-based, most probably with a UTF-8 locale). I also see the UnixUTF8JPInputInterpreter class in the image, so the Japanese users have a solution too. I'm sure that we can do it as well, at least by adopting the solution of Martin Kuball mentioned before (although I would prefer the approach taken in recent Mac and Windows VMs: adding a Unicode code point as the sixth field of the event buffer). But I have another concern now. What will happen to the patch? Will it find its way into the core VM? Is the unix VM being maintained? cheers, Danil |
Hi folks!
I am quite new to the Squeak and Smalltalk community, so my solution is just of the "It works for me" sort. The attached patch fixes UTF-8 keyboard input, while clipboard copy/paste is still broken. -- Best regards Alexander Serkov squeak-unicode-input.patch (3K) |
Hi Alexander, I've applied your patch and it doesn't work. The image no longer raises the error it did in the past (the VM returned code -31 instead of 135 for 'á' when not using a UTF-8 locale, or just ignored dead keys when using one), but I only see strange characters in the image when trying to type á, é, etc.
I've applied it to the current svn olpc branch and to trunk; maybe you're using another revision/version. In that case, please tell me which one you used. This is the output of my locale settings:
LANG=es_ES.UTF-8
LC_CTYPE="es_ES.UTF-8"
LC_NUMERIC="es_ES.UTF-8"
LC_TIME="es_ES.UTF-8"
LC_COLLATE="es_ES.UTF-8"
LC_MONETARY="es_ES.UTF-8"
LC_MESSAGES="es_ES.UTF-8"
LC_PAPER="es_ES.UTF-8"
LC_NAME="es_ES.UTF-8"
LC_ADDRESS="es_ES.UTF-8"
LC_TELEPHONE="es_ES.UTF-8"
LC_MEASUREMENT="es_ES.UTF-8"
LC_IDENTIFICATION="es_ES.UTF-8"
LC_ALL=es_ES.UTF-8
Best Regards. José L. |
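As an illustration of what may be going on with the strange characters (an assumption, not a diagnosis of the patch): if the two UTF-8 bytes of 'á' are handed on as two separate one-byte characters, two unrelated glyphs appear instead of one. The two codes quoted above also fit a simple pattern: 135 is the MacRoman code of 'á', and -31 is what the Latin-1 code 225 looks like when read as a signed byte. The sketch uses Python only because it is convenient for poking at byte values; the function names are made up for this example.

```python
# Illustration only: byte-level view of the character a-acute (U+00E1).
# Function names are hypothetical; nothing here is taken from the VM sources.

A_ACUTE = "\u00e1"

def utf8_bytes(text):
    """Return the UTF-8 encoding of text as a list of byte values."""
    return list(text.encode("utf-8"))

def misread_as_latin1(text):
    """Encode text as UTF-8, then wrongly treat every byte as one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")

def as_signed_byte(value):
    """Reinterpret an unsigned byte value (0..255) as a signed 8-bit integer."""
    return value - 256 if value > 127 else value

def mac_roman_code(ch):
    """Return the MacRoman byte value of a single character."""
    return ch.encode("mac_roman")[0]

print(utf8_bytes(A_ACUTE))                 # [195, 161]: two bytes for one character
print(ascii(misread_as_latin1(A_ACUTE)))   # '\xc3\xa1': A-tilde, then inverted '!'
print(as_signed_byte(ord(A_ACUTE)))        # -31: Latin-1 code 225 as a signed char
print(mac_roman_code(A_ACUTE))             # 135: the code a MacRoman interpreter expects
```

If this reading is right, a patch can stop the error and still show wrong glyphs whenever the bytes reach the image undecoded or are decoded with the wrong table.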
In reply to this post by Alexander Serkov
Hi,
Well, we are three already, so the chance that UTF-8 is finally and completely adopted in Squeak is a bit bigger :) And I propose that a patch should be done (complete and well tested) regardless of adoption, which will come sooner or later. Such a patch will probably break some existing code, but because UTF-8 solves the problem once and for all, I think the existing code should be adapted to UTF-8 by its authors. Janko |
2008/1/17, Janko Mivšek <[hidden email]>:
Hi, I'm sure we are many more: there are also people from French-speaking countries on this list using Linux, and they won't be very happy when they upgrade their VMs and see that à or î cannot be typed anymore. And if you think of some of the countries where the OLPC is being used, such as Brazil... how many users will get annoyed?
> so a chance that UTF-8 is finally ...
Right, but who does it?
> Such patch will probably break some existing code, but because UTF-8 ...
Andreas Raab already did it for the Windows VM and the world is still going round...
Regards. |
Hooray - Ian Piumarta (praise him) has already done the major part (if not all) of the work. Commits from 3-4 months ago in the unix branch actually implement the sixth field in the event buffer - the ucs4 field. One only has to use it in the corresponding InputInterpreter (and it will even be portable across platforms in simple cases, as long as one doesn't care about older images/VMs). I just needed to check the svn first (and yes, the unix VM is maintained - I take back all my rambling) :) :) :)
|
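A rough sketch of the idea of reading the sixth field, written in Python rather than Smalltalk so it can be run anywhere. Only "the sixth field carries the UCS-4 code point" comes from the thread; the position of the legacy character code, the convention that an older VM leaves the sixth field at 0, and the name `next_char` are assumptions made for illustration.

```python
# Hypothetical model of a keyboard event buffer as a plain list.
# From the thread: the sixth field carries the UCS-4 code point.
# Assumed for this sketch: the third field holds the legacy character code,
# and a VM that does not fill the sixth field leaves it at 0.

LEGACY_CHAR = 2  # third field (0-based index), assumed position
UCS4 = 5         # sixth field (0-based index)

def next_char(evt_buf):
    """Return the typed character, preferring the UCS-4 field when it is set."""
    code_point = evt_buf[UCS4]
    if code_point == 0:
        # Fall back to the legacy code so older VMs keep working.
        code_point = evt_buf[LEGACY_CHAR]
    return chr(code_point)

new_vm_event = [2, 0, 0x3F, 0, 0, 0x0436, 0, 1]  # Cyrillic letter U+0436
old_vm_event = [2, 0, 0x61, 0, 0, 0, 0, 1]       # plain 'a', sixth field unused

print(ascii(next_char(new_vm_event)))  # '\u0436'
print(next_char(old_vm_event))         # a
```

The fallback branch is one possible way to cope with older VMs; an interpreter that does not care about them could read the sixth field unconditionally.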
Hi Danil,
How are the results of your check? I can hardly wait for good news :) And José's question should be answered too: when a patch is ready, tested and proven, who will integrate it into mainstream Squeak? Well, let us provide a patch first, put it into Mantis at http://bugs.squeak.org, then do broader testing on our dev images, then on Damien's squeak-dev ones, and if all goes well, I think the community will be persuaded enough to accept the patch into mainstream Squeak. Janko |
Janko hello,
I guess we don't need a patch because, as I said, the VM in the unix trunk already has the needed functionality. I'm at work at the moment and using Windows, but yesterday evening I compiled the trunk VM and it works exactly the way I expected (the good news you are asking for :) ) on Kubuntu with a UTF-8 locale. I actually entered Russian text and it was shown in panes. I had no time to try copy-pasting and file name listing, but the keyboard input is working for sure.
To do this one needs to get fonts (the excellent work of Andrew Tween and others makes it trivial on all platforms). Then you need to create a LanguageEnvironment - there are examples in the image. The LanguageEnvironment provides the keyboard InputInterpreter, which in turn implements #nextCharFrom:firstEvt: (see the attached picture with an example which now works for all main VMs - unix, Mac and Windows :)). After switching to the configured language environment (Locale switchToID: (LocaleID isoLanguage: 'ru')), the corresponding character handling is installed.
There are places where the current Squeak environment is not ready - Damien's shiny dev images may occasionally present a walkback to you (usually 'out of bounds' errors, easily fixable in the debugger by changing #at: to the obvious #at:ifAbsent: implementation).
Also, Ian (the unix VM maintainer) turned out to be alive and active - I even noticed a coding style recommendation for contributors added to the doc section. So it is not a problem. But currently I'm not sure if other changes are needed.
cheers, Danil
interpreter.jpeg (64K) |
2008/1/18, danil osipchuk <[hidden email]>:
> Janko hello,
Do you mean that the freefont packages and plugin must be installed, or could the Bitstream fonts available in the current image be used?
> Then you need to create a LanguageEnvironment - there are examples in image. LanguageEnvironment provides keyboard InputInterpreter which in turn implements #nextCharFrom:firstEvt: (see the attached picture with the example which works now for all main VMs - unix, mac and windows :))
The Spanish LanguageEnvironment is already created in the image, and I can see letters such as ñ, ó, etc. if I open images that already contain those characters or if I open a file containing them, but I cannot type those characters using the keyboard: trying to type 'á', all I get is '?a'.
So, in brief, I've been testing this every time an svn change has happened since last September, without any success. So please, could you explain your steps in more detail, especially:
- the image you used
- the fonts you used
- whether you used it without the freefont package and freefont plugin, and whether you can confirm that characters with dead keys (accents) work on your keyboard?
Thanks for your info. |
Jose hi
(sorry, I don't have accents on my keyboard to spell your name correctly :)) I'm no expert on the topic and I'm about to go home, but I hope I can help. First of all, I assume you have a unix VM compiled from the current trunk.
My guess is that the stock Bitstream fonts are perfectly OK for Spanish (and other Latin-script languages). For languages with glyphs outside the first 256-symbol table, you have to make an effort.
OK, looking at the 'es' environment in KnownEnvironments of LanguageEnvironment, I see that Latin1Environment corresponds to it. One of the tasks of a LanguageEnvironment subclass is to provide a method to interpret keyboard events. For Latin1Environment it returns MacRomanInputInterpreter unconditionally. Typically a LanguageEnvironment tries to guess the most fitting InputInterpreter (look at the Japanese one, for example), but I would not recommend bothering with that for now. So I suspect that if, instead of the default MacRomanInputInterpreter, you use something along the lines I have suggested, keyboard input will work. The fastest, but dirty and cruel, hack is just to copy-paste the following snippet into:
    MacRomanInputInterpreter>>#nextCharFrom: sensor firstEvt: evtBuf
        ^ evtBuf sixth asCharacter
and see what happens. Or you may apply the changeset in the attachment and switch locales back and forth:
    Locale switchToID: (LocaleID isoLanguage: 'en').
    Locale switchToID: (LocaleID isoLanguage: 'es')
The idea is that the LanguageEnvironment should detect the VM and other conditions and set up the best methods for character conversion. (The change set doesn't do this.)
Images used: 3.9 and 3.10 images - both stock and Damien's 'dev'.
Fonts used: Andrew's freetype package (because of Russian - you don't have to, I think).
Without the package and plugin: there are other ways, but again I guess you don't need them.
Dead keys: I don't have them (Russian keyboard), but I may try to enable different layouts at home when I have Linux at hand.
Hope this helps. TestKeysUTF8Latin1.1.cs (792 bytes) |
I reached home - dead keys (if I get it right) seem to be split into the modifier and the key that follows it. I'm sure this can be handled in the InputInterpreter, but I'm not sure whether that is the intended behaviour. |
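One way such a split could be recombined, sketched in Python as a thought experiment rather than as the way the InputInterpreter should do it. The sketch assumes that the dead key arrives as a spacing accent character (for example U+00B4) followed by the base letter; the thread does not say which codes actually arrive, and the mapping table and the class name `DeadKeyComposer` are hypothetical.

```python
import unicodedata

# Assumed mapping from spacing accents (as a dead key might be reported)
# to Unicode combining marks. Hypothetical table for illustration only.
DEAD_KEYS = {
    "\u00b4": "\u0301",  # acute
    "`": "\u0300",       # grave
    "^": "\u0302",       # circumflex
    "~": "\u0303",       # tilde
    "\u00a8": "\u0308",  # diaeresis
}

class DeadKeyComposer:
    """Hold a dead key until the next key arrives, then try to combine them."""

    def __init__(self):
        self.pending = None

    def feed(self, ch):
        """Return the text to emit for this key, or '' while a dead key is held."""
        if self.pending is None:
            if ch in DEAD_KEYS:
                self.pending = ch
                return ""
            return ch
        accent, self.pending = self.pending, None
        composed = unicodedata.normalize("NFC", ch + DEAD_KEYS[accent])
        if len(composed) == 1:
            return composed
        # No precomposed form exists: emit both keys unchanged.
        return accent + ch

composer = DeadKeyComposer()
typed = "".join(composer.feed(key) for key in ["\u00b4", "a", "b"])
print(ascii(typed))  # '\xe1b': a-acute followed by 'b'
```

Leaning on Unicode normalization keeps the table small, at the cost of only handling accents that have a precomposed form; whether the VM or the image is the right place for this step is left open here.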