In VW 7.6, WinProcess>>cshOne: returned a MSCP1252String with understandable text (attachment).
But in 7.7, it returns a TwoByteString with #'UTF-16' encoding (attachment). This looks vaguely like kanji, which I cannot read at all. Can anybody explain this change to me? Is there a way to convert it to something understandable? I don't use the output for anything in the application I'm converting, but I may need it in the future.

TIA,
Alan

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

Attachments: cshOne Output (7.6).jpg (119K), cshOne Output (7.7).jpg (113K)

This is the joy of Windows encodings. Basically, if a
program runs under Windows and sends output to standard output, it will
probably give you output in one of two encodings. One is the OEM encoding
of the particular Windows locale (that is, the old DOS-type code pages,
something like code page 437 in North America, rather than the 1252 used
for most Windows purposes in North America). The other is UTF-16. There's
no mechanism to identify which it's going to use. For shell commands, and
some other programs that honor the setting, there's a command-line
argument to the shell that tells it what to use: if you give it /U,
those programs will return UTF-16. But other programs will return
whatever they feel like, which might even be raw binary data. As the
invoker of the program, you're expected to know what it's going to
return.
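
To make the effect of a wrong guess concrete, here is a small sketch in Python rather than Smalltalk. It only illustrates the decoding arithmetic; the sample strings are hypothetical, and Python's codecs stand in for whatever decoding the caller applies to the program's output. It is not how VisualWorks implements anything.

```python
# Illustration only: sample strings are made up, and Python's codecs
# stand in for the decoding step done by whoever invoked the program.

def decode_output(raw: bytes, encoding: str) -> str:
    """Decode a program's raw output bytes with the encoding the caller assumes."""
    return raw.decode(encoding, errors="replace")

text = "café"                            # hypothetical program output
oem_bytes = text.encode("cp437")         # bytes an OEM (code page 437) program would write
utf16_bytes = text.encode("utf-16-le")   # bytes a UTF-16 program would write

# Correct guesses round-trip the text.
oem_ok = decode_output(oem_bytes, "cp437")
utf16_ok = decode_output(utf16_bytes, "utf-16-le")

# OEM bytes read as code page 1252: the ASCII part survives, the accented letter does not.
oem_as_1252 = decode_output(oem_bytes, "cp1252")

# Plain single-byte ASCII read as UTF-16: each pair of bytes becomes one
# 16-bit code unit, and most of those land in the CJK ranges.
ascii_as_utf16 = decode_output(b"Directory of", "utf-16-le")

print(repr(oem_ok), repr(utf16_ok), repr(oem_as_1252), repr(ascii_as_utf16))
```

The last case is one way single-byte output can end up looking like East Asian text when the reader assumes UTF-16; whether that is what happened in the attached screenshot cannot be determined from this sketch.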

So, in 7.6, it was returning an MS437 string, which Smalltalk interpreted as an MS1252 string, which is wrong. But since the two encodings overlap on basic ASCII, you probably never noticed the difference.

Since we're providing a generic mechanism there, we don't know what the programs are going to return. We could leave it using the OEM encoding, but then there are commands that would outright fail or give wrong results (e.g. 'dir' if you had characters out of range of code page 437, or which were represented differently in code page 1252). Or we could default to Unicode, which is what we do. So when using cshOne:, we explicitly pass the /U option to the Windows shell and expect UTF-16 back. If the program doesn't respect that flag, or otherwise doesn't return UTF-16, then you're probably better off using fork:arguments: directly.

Note that the implementation of shOne: is essentially

So you could just remove the encoding call and the /U argument and do it yourself. But if you just want to run the program, there's probably not much reason to go through a command line interpreter anyway. Just say

    self fork: 'myprogram.exe' arguments: #('one' 'two' 'three').

You may want to set the encoding to be what you expect the program to produce, because it might not actually be 437 either.

At 06:57 PM 2010-02-07, Alan Darlington wrote:
> In VW 7.6, WinProcess>>cshOne: returned a MSCP1252String with understandable text (attachment).

--
Alan Knight, Engineering Manager, Cincom Smalltalk