[bug] ExternalProcess stream encoding

[bug] ExternalProcess stream encoding

Mike Hales
On 7.7, the Windows implementation of ExternalProcess (WinProcess>>shOne:) has changed to invoke the shell in Unicode mode (using the /u flag) and to set the stream encoding to UTF-16. This breaks badly on many commands. The shell itself seems to work in UTF-16, but the output of many of the commands does not. For example, on my Windows 7 box:

ExternalProcess shOne: 'dir'

returns the correct directory contents, but:

ExternalProcess shOne: 'ipconfig'

returns a bunch of Chinese characters, or on Windows XP the little black squares. I also get exceptions on some commands, because calling next on the error stream returns nil, which breaks things too.

Mike

Mike Hales
Engineering Manager
KnowledgeScape
www.kscape.com

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

Re: [bug] ExternalProcess stream encoding

Mike Hales
I guess it is really in WinProcess>>executeSingleCommand: where this changed, affecting #shOne:, #cshOne:, #cshBullet and #directoryListing on the class side.

Mike

On Mon, Feb 22, 2010 at 12:02 PM, Mike Hales <[hidden email]> wrote:
On 7.7, the Windows implementation of ExternalProcess (WinProcess>>shOne:) has changed to invoke the shell in Unicode mode (using the /u flag) and to set the stream encoding to UTF-16. This breaks badly on many commands. The shell itself seems to work in UTF-16, but the output of many of the commands does not. For example, on my Windows 7 box:

ExternalProcess shOne: 'dir'

returns the correct directory contents, but:

ExternalProcess shOne: 'ipconfig'

returns a bunch of Chinese characters, or on Windows XP the little black squares. I also get exceptions on some commands, because calling next on the error stream returns nil, which breaks things too.

Mike

Re: [bug] ExternalProcess stream encoding

Boris Popov, DeepCove Labs (SNN)
In reply to this post by Mike Hales
Ya, there was a lot of discussion recently about that. The solution, I believe, was to use fork:arguments: without the command interpreter and control the encoding yourself, but I haven't yet gone through our port to find the best hands-on approach.

-Boris (via BlackBerry)


From: [hidden email] <[hidden email]>
To: [hidden email] <[hidden email]>
Sent: Mon Feb 22 11:19:57 2010
Subject: Re: [vwnc] [bug] ExternalProcess stream encoding

I guess it is really in WinProcess>>executeSingleCommand: where this changed, affecting #shOne:, #cshOne:, #cshBullet and #directoryListing on the class side.

Mike

Re: [bug] ExternalProcess stream encoding

Mike Hales
Ah yes, I guess I didn't read the list very well, as those were just from a week or two ago. I did the same thing, just used fork:arguments: instead.

Mike

On Mon, Feb 22, 2010 at 12:34 PM, Boris Popov, DeepCove Labs (YVR) <[hidden email]> wrote:
Ya, there was a lot of discussion recently about that. The solution, I believe, was to use fork:arguments: without the command interpreter and control the encoding yourself, but I haven't yet gone through our port to find the best hands-on approach.

-Boris (via BlackBerry)


Re: [bug] ExternalProcess stream encoding

Alan Darlington
In reply to this post by Mike Hales
Mike, I recently brought up this issue in "[vwnc] [VW 7.7 NC] How to interpret WinProcess>>cshOne: output".  At least one other person has also reported this change.

From Alan Knight's reply:


This is the joy of Windows encodings. Basically, if a program runs under Windows and sends output to standard output, it will probably give you output in one of two encodings. One is the OEM encoding of the particular Windows locale (that is, the old DOS-type code pages, something like code page 437 in North America, rather than the 1252 used for most Windows purposes in North America). The other is UTF-16. There's no mechanism to identify which it's going to use. For shell commands, and some other programs that honor the setting, there's a command-line argument to the shell that can tell it what to use. If you give it /U, then those programs will return UTF-16. But other programs will return whatever they feel like, which might even be raw binary data. As the invoker of the program, you're expected to know what it's going to return.

So, in 7.6, it was returning an MS437 string, which Smalltalk interpreted as an MS1252 string, which is wrong. But since the two encodings overlap on basic ASCII, you probably never noticed the difference.
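
To make the byte-level effect concrete, here is a small sketch in Python (chosen only because its codecs make this easy to show; it is not VisualWorks code). It assumes a program wrote plain code page 437 bytes and the reader decoded them as little-endian UTF-16, the form the shell emits under /U. The sample string is an illustrative stand-in, not captured ipconfig output.

```python
# Bytes that an OEM-encoded (code page 437) program might write to standard output.
# The text itself is a made-up sample.
oem_bytes = "Windows IP Configuration".encode("cp437")

# Decoding those bytes as UTF-16 pairs up adjacent ASCII bytes into single
# 16-bit code units, many of which fall in the CJK ranges.
as_utf16 = oem_bytes.decode("utf-16-le")

# Pure ASCII text decodes identically under code pages 437 and 1252,
# which is why the 7.6 mix-up usually went unnoticed.
ascii_same = oem_bytes.decode("cp437") == oem_bytes.decode("cp1252")

# A non-ASCII character does not survive the same mix-up.
misread = "é".encode("cp437").decode("cp1252")

print(as_utf16)
print(ascii_same, repr(misread))
```

With 24 ASCII bytes in, the UTF-16 reading yields 12 characters, none of them the original letters, which matches the "Chinese characters" symptom reported for ipconfig.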

Since we're providing a generic mechanism there, we don't know what they're going to return. So we could leave it using the OEM encoding, but then there are commands that would outright fail or give wrong results (e.g. 'dir' if you had characters out of range of code page 437, or which were represented differently in code page 1252). Or we could default to Unicode, which is what we do. So when using cshOne:, we do explicitly use the /U option to the Windows shell, and expect UTF-16 back. If the program doesn't respect that flag, or otherwise return UTF-16, then you're probably better off to use fork:arguments: directly. Note that the implementation of shOne: is essentially

         self encoding: #'UTF-16'.
         ^self fork: self getCommandLineInterpreter arguments: (Array with: '/u' with: '/c' with: aString).

So you could just remove the encoding call and the /u argument and do it yourself. But if you just want to run the program, there's probably not much reason to go through a command line interpreter anyway. Just say self fork: 'myprogram.exe' arguments: #('one' 'two' 'three'). You may want to set the encoding to be what you expect the program to produce, because it might not actually be 437 either.
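
The same idea, running the program directly and decoding its output with the encoding you expect, can be sketched outside VisualWorks. The Python below is only an analogy to fork:arguments:, not the Smalltalk API; the helper name run_and_decode is hypothetical, and the child process is a stand-in that deliberately writes code page 437 bytes.

```python
import subprocess
import sys


def run_and_decode(program, arguments, encoding):
    """Run a program directly (no shell) and decode its standard output
    using the encoding the caller expects the program to produce."""
    completed = subprocess.run(
        [program, *arguments],
        stdout=subprocess.PIPE,
        check=True,
    )
    return completed.stdout.decode(encoding)


# Stand-in child: a Python one-liner that writes raw code page 437 bytes.
child_code = "import sys; sys.stdout.buffer.write('caf\\u00e9'.encode('cp437'))"

text = run_and_decode(sys.executable, ["-c", child_code], "cp437")
print(text)
```

The caller, not the runtime, picks the encoding here, which mirrors the advice above: the invoker is expected to know what the program returns.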

