New Win32 VM [m17n testers needed]


Andreas.Raab
Hi Folks -

Thanks to some dedicated OLPC-related work done in Greece by Chris
Petsos[1], we now have a Windows VM with Unicode support enabled. This
VM both generates UTF input from characters and supports UTF-8 for the
clipboard and for file and directory names. The VM is available here:

http://www.squeakvm.org/win32/release/SqueakVM-Win32-3.10.1-bin.zip
http://www.squeakvm.org/win32/release/SqueakVM-Win32-3.10.1-src.zip

You are invited to test the new work, but be advised that this may
require some manual adjustments; to understand what needs to be done,
please see [2].

I'm interested in reports, both good and bad, about whether the
clipboard, file, directory and input support behaves as expected.

[1]http://lists.squeakfoundation.org/pipermail/vm-dev/2007-May/001194.html
[2]http://lists.squeakfoundation.org/pipermail/vm-dev/2007-June/001306.html

Cheers,
   - Andreas

Re: New Win32 VM [m17n testers needed]

Alexander Lazarevic'
I just did a quick test out of curiosity. I did an "update code from
server" on a rather old image (Squeak3.9gamma [latest update: #7066])
and it seems there is an error while creating a cache directory/file.
The primitive "primOpen: fileName writable: writableFlag" fails for the
file name [1] with writableFlag set to true.
This works with a 3.7.1 VM.

Alex

[1] 'C:\Dokumente und Einstellungen\laza\Desktop\SqueakVM-Win32-3.10.1-bin\package-cache\ScriptLoader-sd.324.mcz'

Re: New Win32 VM [m17n testers needed]

"Martin v. Löwis"
In reply to this post by Andreas.Raab
> I'm interested in reports, both good and bad about whether the
> clipboard, file, directory and input support behaves as expected.

After installing the various fixes and picking fonts, I was able to
enter Cyrillic text through the keyboard, using a Greek environment,
on a German Windows XP.

What I couldn't get working is the listing of file names in the
file browser. To reproduce, create file names using Greek, Cyrillic,
and Chinese letters, and then do "open/file list". With the wrong font,
I get question marks. When I select a font that ought to be able to
represent it correctly, I still get a mix of Latin letters and
square boxes.

What I don't understand is: Why do I have to set the language
environment (*) to make it work? It's Unicode, so Squeak
shouldn't care what the language is. If it needs to know, it
should get the language from the system.

Regards,
Martin

(*) As instructed, I did
Locale currentPlatform: (Locale localeID: (LocaleID isoString: 'el')).

Re: New Win32 VM [m17n testers needed]

Andreas.Raab
In reply to this post by Alexander Lazarevic'
Yes, I just fixed that. A leftover call to CreateDirectoryA() made
directory creation fail, and later file creation attempts along with it.
This will be fixed in the next version.

Cheers,
   - Andreas

Re: New Win32 VM [m17n testers needed]

Andreas.Raab
In reply to this post by "Martin v. Löwis"
Martin v. Löwis wrote:
> What I couldn't get working is the listing of file names in the
> file browser. To reproduce, create file names using Greek, Cyrillic,
> and Chinese letters, and then do "open/file list". With the wrong font,
> I get question marks. When I select a font that ought to be able to
> represent it correctly, I still get a mix of Latin letters and
> square boxes.

Hm ... there isn't any easy way to test this I guess? The code hasn't
changed that much so I would expect this to be working (in particular
considering that it seems to work fine for ascii file names).

> What I don't understand is: Why do I have to set the language
> environment (*) to make it work? It's Unicode, so Squeak
> shouldn't care what the language is. If it needs to know, it
> should get the language from the system.

I don't know, particularly since we now have the locale plugin, which
can detect these settings easily.

Cheers,
   - Andreas


Re: New Win32 VM [m17n testers needed]

Andreas.Raab
In reply to this post by "Martin v. Löwis"
Martin v. Löwis wrote:
> What I couldn't get working is the listing of file names in the
> file browser. To reproduce, create file names using Greek, Cyrillic,
> and Chinese letters, and then do "open/file list". With the wrong font,
> I get question marks. When I select a font that ought to be able to
> represent it correctly, I still get a mix of Latin letters and
> square boxes.

Digging into the code, it seems that the conversion of file names is
broken (or at least it looks that way). I can't find the place where a
UTF8TextConverter would ever be used (which of course is a requirement
for this to work). It seems that the code still assumes that the VMs
present file names encoded in the corresponding code pages (which also
explains why you'd need to set the language environment etc.). The thing
to try would be to go into LanguageEnvironment class and change
defaultFileNameConverter to include:

   "Windows VMs always use UTF8-encoded file names now"
   Smalltalk platformName = 'Win32'
     ifTrue:[^UTF8TextConverter new].

Cheers,
   - Andreas
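
The symptom described above (Latin letters mixed with boxes) is what one would expect if UTF-8 bytes were decoded with a single-byte code page. The following is a minimal Python sketch of that effect, not Squeak code; the file name and the choice of cp1253 (the Windows Greek code page) are assumptions made for illustration.

```python
# Sketch: decoding UTF-8 file-name bytes with a legacy code page vs. with UTF-8.
# The file name and the cp1253 code page are illustrative assumptions.

def decode_file_name(raw: bytes, encoding: str) -> str:
    """Decode file-name bytes handed over by the VM using the given encoding."""
    # errors="replace" substitutes U+FFFD for bytes the code page does not define
    return raw.decode(encoding, errors="replace")

original = "Привет.txt"            # a Cyrillic file name
raw = original.encode("utf-8")     # what a UTF-8 VM would hand to the image

wrong = decode_file_name(raw, "cp1253")  # image still assumes a code page
right = decode_file_name(raw, "utf-8")   # image uses a UTF-8 converter

print(repr(wrong))  # one garbage character per UTF-8 byte
print(repr(right))  # the original name
```

Each two-byte UTF-8 sequence turns into two unrelated characters under the code page, which is why only the ASCII part of the name survives intact.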

Re: New Win32 VM [m17n testers needed]

Andreas.Raab
In reply to this post by Andreas.Raab
I have fixed the problem with directory creation and updated the VM to
3.10.2 which is up in the usual places:

http://www.squeakvm.org/win32/release/SqueakVM-Win32-3.10.2-bin.zip
http://www.squeakvm.org/win32/release/SqueakVM-Win32-3.10.2-src.zip

Cheers,
   - Andreas

Re: New Win32 VM [m17n testers needed]

Yoshiki Ohshima
> http://www.squeakvm.org/win32/release/SqueakVM-Win32-3.10.2-bin.zip
> http://www.squeakvm.org/win32/release/SqueakVM-Win32-3.10.2-src.zip

  The file path interpretation works on Japanese Windows (see the file
name pane in the attached picture) by changing the
fileNameConverterClass of JapaneseEnvironment. However, I can't quite
figure out how to make keyboard input work... It doesn't seem that I get
meaningful values when the input is done via an IME.

  I'll take a look at it later...

-- Yoshiki



[Attachment: filelist.png (16K)]
Re: New Win32 VM [m17n testers needed]

Yoshiki Ohshima
In reply to this post by "Martin v. Löwis"
  Martin,

> What I couldn't get working is the listing of file names in the
> file browser. To reproduce, create file names using Greek, Cyrillic,
> and Chinese letters, and then do "open/file list". With the wrong font,
> I get question marks. When I select a font that ought to be able to
> represent it correctly, I still get a mix of Latin letters and
> square boxes.

  As Andreas wrote, defaultFileNameConverter has to be modified (and
the class var in the LanguageEnvironment has to be cleared.)

> What I don't understand is: Why do I have to set the language
> environment (*) to make it work? It's Unicode, so Squeak
> shouldn't care what the language is. If it needs to know, it
> should get the language from the system.

  Read the Unicode standard.

  Because it is Unicode, a mechanism out of scope of Unicode has to
supply language information to do sensible stuff.

-- Yoshiki

Re: [Vm-dev] Re: New Win32 VM [m17n testers needed]

Chris Petsos
In reply to this post by Yoshiki Ohshima
>  The file path interpretation works on Japanese Windows (see the file
> name pane in the attached picture) by changing the
> fileNameConverterClass of JapaneseEnvironment.  However, I can't quite
> figure out to make keyboard input work...  It doesn't seem that I get
> meaningful values when the input is done via an IME.


Yoshiki,

Don't forget that the VM is sending Unicode chars as the sixth data member
of the event buffer.
So have your input interpreter read

    evtbuf sixth.

instead of

    evtbuf third.

I also tried to copy/paste some Japanese text with the new VM and I couldn't
do it...
A Japanese-locale-specific question: does a Unicode Japanese character fit
inside a WCHAR?

I ask because the code uses Windows functions that convert WCHAR strings:

    MultiByteToWideChar
    WideCharToMultiByte

Could the problem be there?

Christos.


Re: [Vm-dev] Re: New Win32 VM [m17n testers needed]

Yoshiki Ohshima
  Chris,

> Don't forget that the VM is sending unicode chars as the sixth data member
> of the event buffer.

  Sure. I looked at those values but still didn't see it.

  (If I remember correctly, it was changed from third to sixth. Take a
look at MacUnicodeInputInterpreter>>initialize; it tells you the
history ^^;)

> Japanese locale specific question... does a unicode Japanese character fit
> inside a WCHAR struct?

  It does. UTF-16LE without surrogate pairs is almost OK for daily
use.

> Cause, there have been used Windows functions that convert WCHAR streams
>
>     MultiByteToWideChar
>     WideCharToMultiByte
>
> Could the problem be there?

  I can't quite trace the details (or my memory), but when the UNICODE
macro is defined, shouldn't just the latter be used?

-- Yoshiki
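
As a small illustration of the "without surrogate pairs" remark, here is a Python sketch (not VM code) that counts 16-bit UTF-16 code units. The sample strings are assumptions chosen for the example: everyday Japanese text lies in the Basic Multilingual Plane and needs one unit per character, while a rarer ideograph from CJK Extension B needs two.

```python
# Sketch: how many 16-bit units (WCHAR-sized) a string needs in UTF-16.
# The sample strings are illustrative assumptions.

def utf16_units(text: str) -> int:
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2

bmp_text = "日本語"       # three characters, all in the Basic Multilingual Plane
rare = "\U00020BB7"       # U+20BB7, an ideograph in CJK Extension B

print(utf16_units(bmp_text))  # 3: one unit per character
print(utf16_units(rare))      # 2: a surrogate pair
```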


Re: [Vm-dev] Re: New Win32 VM [m17n testers needed]

Andreas.Raab
Yoshiki Ohshima wrote:
>> Cause, there have been used Windows functions that convert WCHAR streams
>>
>>     MultiByteToWideChar
>>     WideCharToMultiByte
>>
>> Could the problem be there?
>
>   I can't quite trace the detail (and my memory), but when the macro
> UNICODE is defined, the latter should be just used?

The VM is not generally compiled with -DUNICODE; the places where we
utilize WCHAR are explicit and we use the explicit *W variants of the
Windows functions.

Cheers,
   - Andreas

Re: [Vm-dev] Re: New Win32 VM [m17n testers needed]

Chris Petsos
In reply to this post by Yoshiki Ohshima
>> Japanese locale specific question... does a unicode Japanese character
>> fit inside a WCHAR struct?
>
>  It does.  UTF-16LE without surrogated pairs is almost ok for daily
> use.
>

Right, but take a look at the codepage parameter of the functions

    http://msdn2.microsoft.com/en-us/library/ms776413.aspx
    http://msdn2.microsoft.com/en-us/library/ms776420.aspx

It only goes up to UTF-8... so can it work for Japanese chars?

>> Cause, there have been used Windows functions that convert WCHAR streams
>>
>>     MultiByteToWideChar
>>     WideCharToMultiByte
>>

Christos.
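
On the code page question: UTF-8 can encode every Unicode code point, so a conversion that goes through UTF-8 should not lose Japanese characters. The Python sketch below is only a portable stand-in for the two Windows functions named above (it does not call them); the helper names and the sample file name are hypothetical.

```python
# Sketch: UTF-8 <-> UTF-16LE round trip, a portable stand-in for what
# MultiByteToWideChar / WideCharToMultiByte do when the code page is UTF-8.
# Helper names and the sample file name are hypothetical.

def utf8_to_wide(raw: bytes) -> bytes:
    """UTF-8 bytes -> UTF-16LE bytes (the WCHAR representation)."""
    return raw.decode("utf-8").encode("utf-16-le")

def wide_to_utf8(wide: bytes) -> bytes:
    """UTF-16LE bytes -> UTF-8 bytes."""
    return wide.decode("utf-16-le").encode("utf-8")

name = "日本語のファイル.txt"
utf8 = name.encode("utf-8")
wide = utf8_to_wide(utf8)

print(wide_to_utf8(wide) == utf8)  # the round trip is lossless
```

If copy/paste of Japanese text fails, this suggests looking at which code page is actually passed to the conversion rather than at UTF-8 itself.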


Re: New Win32 VM [m17n testers needed]

"Martin v. Löwis"
In reply to this post by Andreas.Raab
>> What I couldn't get working is the listing of file names in the
>> file browser. To reproduce, create file names using Greek, Cyrillic,
>> and Chinese letters, and then do "open/file list". With the wrong font,
>> I get question marks. When I select a font that ought to be able to
>> represent it correctly, I still get a mix of Latin letters and
>> square boxes.
>
> Hm ... there isn't any easy way to test this I guess?

It's actually very easy.

1. Create a text file.
2. rename it to some Cyrillic (or Japanese, or whatever) name
3. Open it in the listing

To rename it, the easiest way is to start up charmap.exe,
select a few "funny" characters, copy them to the clipboard,
and paste them in explorer into the file name.

> The code hasn't
> changed that much so I would expect this to be working (in particular
> considering that it seems to work fine for ascii file names).

That might be the problem. If the code is still using the *A functions
(FindFirstFileA), then this cannot work - but I would expect to see
question marks in that case. If it was changed to use the *W functions,
then the question is how these strings are communicated to the VM.

Regards,
Martin
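
The question-mark expectation can be sketched in Python as follows. This is only a rough model of an ANSI (*A) API, assuming cp1252 as the system code page (as on a German Windows); real Windows may also apply best-fit mappings before falling back to a default character. The function name is hypothetical.

```python
# Sketch: squeezing a Unicode file name through a single-byte ANSI code page.
# cp1252 and the function name are assumptions for illustration.

def to_ansi(name: str, code_page: str = "cp1252") -> bytes:
    """Encode a name in an ANSI code page, replacing unrepresentable characters."""
    return name.encode(code_page, errors="replace")

print(to_ansi("Привет.txt"))  # b'??????.txt'
print(to_ansi("readme.txt"))  # b'readme.txt'
```

Question marks therefore point at a lossy ANSI path, whereas boxes and stray Latin letters point at bytes that arrived intact but were decoded with the wrong converter.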

Re: New Win32 VM [m17n testers needed]

"Martin v. Löwis"
In reply to this post by Yoshiki Ohshima

>> What I don't understand is: Why do I have to set the language
>> environment (*) to make it work? It's Unicode, so Squeak
>> shouldn't care what the language is. If it needs to know, it
>> should get the language from the system.
>
>   Read the Unicode standard.

I did. What section are you specifically referring to?

>   Because it is Unicode, a mechanism out of scope of Unicode has to
> supply language information to do sensible stuff.

What is the sensible stuff it needs to do?

Regards,
Martin

Re: New Win32 VM [m17n testers needed]

Andreas.Raab
In reply to this post by "Martin v. Löwis"
Martin v. Löwis wrote:

>> Hm ... there isn't any easy way to test this I guess?
>
> It's actually very easy.
>
> 1. Create a text file.
> 2. rename it to some Cyrillic (or Japanese, or whatever) name
> 3. Open it in the listing
>
> To rename it, the easiest way is to start up charmap.exe,
> select a few "funny" characters, copy them to the clipboard,
> and past them in explorer into the file name.

Ah, great. Indeed, there is a series of steps you need to take to make it
work:
1) You need to fix LanguageEnvironment class defaultFileNameConverter
   (make it return UTF8TextConverter new)
2) You need to load a TTF font with the glyphs. For this you need to:
    * Load the TTF loading fixes that Christos posted
    * Drag and drop a TTF font with the right glyphs on Squeak (Arial
      works fine)
3) Make this font the default font for text and lists.

Once you have done all of this, the file list shows the correct names.

Cheers,
   - Andreas



[Attachment: Filelist.gif (39K)]
Re: New Win32 VM [m17n testers needed]

Yoshiki Ohshima
In reply to this post by "Martin v. Löwis"
  Martin,

> >> What I don't understand is: Why do I have to set the language
> >> environment (*) to make it work? It's Unicode, so Squeak
> >> shouldn't care what the language is. If it needs to know, it
> >> should get the language from the system.
> >
> >   Read the Unicode standard.
>
> I did. What section are you specifically referring to?

  For example, take a look at this FAQ entry:

http://www.unicode.org/faq/han_cjk.html#3

(and the entries before and after it).

> >   Because it is Unicode, a mechanism out of scope of Unicode has to
> > supply language information to do sensible stuff.
>
> What is the sensible stuff it needs to do?

  To display strings in an ok way.

http://www.unicode.org/faq/han_cjk.html#2

says that you should select a proper font based on the language you
would like to treat the character in.

  Although the current Squeak implementation is not there yet, you
would also want sorting and uppercase/lowercase conversions to differ
based on the language (even within Latin-1 regions). A segment of
text generally should carry more information than the bare Unicode
code points.

-- Yoshiki
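
One commonly cited case of language-dependent case conversion is Turkish, where lowercase "i" uppercases to dotted "İ" (U+0130). The Python sketch below only illustrates that code points alone do not determine the result; the override table and function name are hypothetical, and a real system would rely on full locale data rather than a hand-written table.

```python
# Sketch: uppercase conversion that depends on a language tag.
# SPECIAL_UPPER and upper_for are hypothetical; real systems use full locale data.

SPECIAL_UPPER = {
    "tr": {"i": "\u0130"},  # Turkish: i -> İ (capital I with dot above)
}

def upper_for(text: str, language: str) -> str:
    """Uppercase text, applying per-language overrides before the default mapping."""
    overrides = SPECIAL_UPPER.get(language, {})
    return "".join(overrides.get(ch, ch.upper()) for ch in text)

print(upper_for("istanbul", "en"))  # ISTANBUL
print(upper_for("istanbul", "tr"))  # İSTANBUL
```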

Re: New Win32 VM [m17n testers needed]

K. K. Subramaniam
In reply to this post by Andreas.Raab
On Monday 04 June 2007 12:08 am, Andreas Raab wrote:
>    "Windows VMs always use UTF8-encoded file names now"
>    Smalltalk platformName = 'Win32'
>      ifTrue:[^UTF8TextConverter new].
Andreas,

The conditional is incorrect and unnecessary. Filename encoding depends on
the filesystem, not the VM platform. For instance, I could have a UTF-8 file
on a USB flash drive and use it across different VMs.

AFAIK, FAT fs does not support UTF-8. NTFS, HFS+ (Mac) and all current Linux
filesystems support UTF-8 in filenames.

Regards .. Subbu

Re: New Win32 VM [m17n testers needed]

Bert Freudenberg

On Jun 4, 2007, at 12:49, subbukk wrote:

> On Monday 04 June 2007 12:08 am, Andreas Raab wrote:
>>    "Windows VMs always use UTF8-encoded file names now"
>>    Smalltalk platformName = 'Win32'
>>      ifTrue:[^UTF8TextConverter new].
> Andreas,
>
> The conditional is incorrect and unnecessary. filename encoding depends on
> filesystem, not the VM platform. For instance, I could have a UTF-8 file on a
> USB flash and use it across different VMs.
>
> AFAIK, FAT fs does not support UTF-8. NTFS, HPFS (Mac) and all current Linux
> filesystems support UTF-8 in filenames.

Wrong. This solely defines what encoding is used to communicate  
between the image and the VM. The VM then translates this to whatever  
encoding the file system uses.

- Bert -



Re: New Win32 VM [m17n testers needed]

K. K. Subramaniam
On Monday 04 June 2007 4:28 pm, Bert Freudenberg wrote:
> > .. filename encoding depends on
> > filesystem, not the VM platform. For instance, I could have a UTF-8 file on a
> > USB flash and use it across different VMs...
> Wrong. This solely defines what encoding is used to communicate
> between the image and the VM. The VM then translates this to whatever
> encoding the file system uses.
I presume this double conversion is transparent to code in the image. The
code assumes that if the VM is win32, then UTF-8 is supported in filenames,
instead of querying the VM for a UTF-8 capability. This breaks encapsulation.
In Tim's words:

On Tuesday 15 May 2007 12:46 am, tim Rowledge wrote:
> .. Allowing #fileNamed: to
> attempt to parse mangled platform related strings was a serious
> error. Platform fiddle-faddle for filenames is just horrific.

Regards .. Subbu
