[7.7] Interpreting output of #executeSingleCommand:?


Boris Popov, DeepCove Labs (SNN)

Quick question,

 

7.6,

 ExternalProcess defaultClass new executeSingleCommand: 'htpasswd -bmn "marty" "apple"'
 'marty:$apr1$nVxObyRc$4iHEt8UD6bpqFDbAXHP6/.'

 

7.7,

 ExternalProcess defaultClass new executeSingleCommand: 'htpasswd -bmn "marty" "apple"'
 '慭瑲㩹愤牰␱䱡敏瑭杓䘤捥㉊䩷䩉噲㥯稱牊㍲搯റഊ਍'

 

This leads me to believe that I should go through all uses of external process in our code and review whether or not default UTF16 encoding applies and replace call sites where it causes issues with,

 

ExternalProcess defaultClass fork: 'htpasswd -bmn "marty" "apple"' arguments: #()

 

Or can I assume that #fork:arguments: is functionally equivalent to the old #executeSingleCommand:? I haven't done much digging myself yet.

 

CompositeLocale platformLocaleAndEncoding #('en_IE' 'windows-1252')

 

OS Name:                   Microsoft Windows 7 Ultimate

OS Version:                6.1.7600 N/A Build 7600

 

Regards,

 

-Boris

 



_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

Re: [7.7] Interpreting output of #executeSingleCommand:?

Alan Darlington
Boris, I recently brought up this issue in "[vwnc] [VW 7.7 NC] How to interpret WinProcess>>cshOne: output".

From Alan Knight's reply:


This is the joy of Windows encodings. Basically, if a program runs under Windows and sends output to standard output, it will probably give you output in one of two encodings. One is the OEM encoding of the particular Windows locale (that is, the old DOS-type code pages, something like code page 437 in North America, rather than the 1252 used for most Windows purposes in North America). The other is UTF-16. There's no mechanism to identify which one it's going to use. For shell commands, and some other programs that honor the setting, there's a command-line argument to the shell that tells it what to use: if you give it /U, those programs will return UTF-16. But other programs will return whatever they feel like, which might even be raw binary data. As the invoker of the program, you're expected to know what it's going to return.

So, in 7.6, it was returning an MS437 string, which Smalltalk interpreted as an MS1252 string, which is wrong. But since the two encodings overlap on basic ASCII, you probably never noticed the difference.
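To make the overlap concrete, here is a small sketch in Python (used only as a neutral illustration; it is not VisualWorks code). The sample bytes are my own choice, not output from any real program. It shows that pure ASCII decodes the same way under code pages 437 and 1252, while a byte above 0x7F does not.

```python
# Bytes as a console program might emit them: plain ASCII plus one high byte.
# (Sample data chosen for illustration only.)
ascii_bytes = b"marty:$apr1$"
high_byte = b"\x82"

# ASCII decodes identically under both code pages...
as_437 = ascii_bytes.decode("cp437")
as_1252 = ascii_bytes.decode("cp1252")

# ...but a byte above 0x7F does not: 0x82 is U+00E9 (e with acute) in
# code page 437 and U+201A (a low quotation mark) in Windows-1252.
high_437 = high_byte.decode("cp437")
high_1252 = high_byte.decode("cp1252")

print(as_437 == as_1252)
print(ascii(high_437), ascii(high_1252))
```

So a 437/1252 mix-up stays invisible as long as the program only emits ASCII, which matches the htpasswd case in this thread.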

Since we're providing a generic mechanism there, we don't know what the programs are going to return. We could leave it using the OEM encoding, but then there are commands that would outright fail or give wrong results (e.g. 'dir' if you had characters out of range of code page 437, or which were represented differently in code page 1252). Or we could default to Unicode, which is what we do. So when using cshOne:, we explicitly pass the /U option to the Windows shell and expect UTF-16 back. If the program doesn't respect that flag and doesn't otherwise return UTF-16, then you're probably better off using fork:arguments: directly. Note that the implementation of shOne: is essentially

         self encoding: #'UTF-16'.
         ^self fork: self getCommandLineInterpreter arguments: (Array with: '/u' with: '/c' with: aString).

So you could just remove the encoding call and the /u argument and do it yourself. But if you just want to run the program, there's probably not much reason to go through a command line interpreter anyway. Just say self fork: 'myprogram.exe' arguments: #('one' 'two' 'three'). You may want to set the encoding to be what you expect the program to produce, because it might not actually be 437 either.
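The garbled 7.7 result in the original question looks like exactly this mismatch: single-byte ASCII output read as UTF-16. The following Python sketch (an illustration only, assuming little-endian UTF-16 as used on Windows) pairs up the ASCII bytes of the start of the 7.6 result and shows that they come out as the first characters of the 7.7 result, and that the process is reversible.

```python
# Start of the htpasswd output as shown in the 7.6 example.
plain = "marty:$apr1$"
raw = plain.encode("ascii")  # what the program actually wrote: one byte per character

# Reading those bytes as UTF-16 (little-endian assumed) fuses each pair of
# ASCII bytes into one 16-bit code unit, e.g. 'm' (0x6D) + 'a' (0x61) -> U+616D.
garbled = raw.decode("utf-16-le")

# Nothing is lost: re-encoding the garbled text gives back the original bytes.
recovered = garbled.encode("utf-16-le").decode("ascii")

print(ascii(garbled))
print(recovered)
```

The first code units, U+616D U+7472 U+3A79, are the characters the 7.7 output starts with. This suggests htpasswd ignores the /U setting and keeps writing single-byte text.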


Re: [7.7] Interpreting output of #executeSingleCommand:?

Boris Popov, DeepCove Labs (SNN)
Ah fair enough, I saw the original exchange but the details didn't register for me. I shall go through the code next week and get it straightened out. Thanks.

-Boris (via BlackBerry)


Re: [7.7] Interpreting output of #executeSingleCommand:?

Alan Knight-3
The biggest difference between the old executeSingleCommand: and fork:arguments: is that executeSingleCommand: under Windows would invoke a command shell and then run the command, whereas fork:arguments: runs the program directly. This is probably not a significant difference unless your arguments rely on things like shell expansion. The only example I can think of offhand is interpreting "~" to mean the user's home directory, and I don't even know if that applies under Windows.
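As a rough illustration of the "no shell, no expansion" point, here is a Python sketch (not VisualWorks code). The child program is a hypothetical stand-in that just echoes its first argument. When a program is launched directly, "~" arrives as a literal character, so any expansion you relied on has to be done by the caller.

```python
import os
import subprocess
import sys

# Hypothetical child program: echoes its first argument unchanged.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

# Direct launch: no shell is involved, so "~" reaches the child literally.
direct = subprocess.run(child + ["~"], capture_output=True, text=True)
print(direct.stdout.strip())

# If the expansion is wanted, the caller performs it before launching.
expanded = subprocess.run(
    child + [os.path.expanduser("~")], capture_output=True, text=True
)
print(expanded.stdout.strip())
```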

If using fork:arguments: you might want to experiment a bit, or look at the program, to figure out what encoding the program thinks it's returning. It's presumably not UTF-16, but I'd guess it's as likely to be 1252 as it is to be 437. But if you really only care about ASCII characters, the difference may not matter.
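One way to run that experiment, sketched in Python rather than Smalltalk: capture the program's output as raw bytes and decode it under each candidate encoding to see which result looks right. The child process here is a hypothetical stand-in that writes the bytes code page 437 would use for "café"; substitute the real program and your own list of candidates.

```python
import subprocess
import sys

# Stand-in for the real program: writes "caf" followed by byte 0x82,
# which is how code page 437 stores the word with an accented e.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'caf\\x82')"]

# Capture raw bytes; no decoding has happened yet.
raw = subprocess.run(child, capture_output=True).stdout

# Decode under each candidate and compare by eye.
for name in ("cp437", "cp1252", "utf-16-le"):
    try:
        print(name, ascii(raw.decode(name)))
    except UnicodeDecodeError as err:
        print(name, "failed:", err.reason)
```

Note that a wrong guess does not necessarily raise an error: all three candidates decode these four bytes successfully, and only one produces the intended text.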

--
Alan Knight [|], Engineering Manager, Cincom Smalltalk
