OSProcess command with german umlaut does not work

Sabine Manaa
Hi,

on command line, this works, my file is copied:
cp  /Library/WebServer/Documents/reports/bär.pdf /Library/WebServer/Documents/reports/test.pdf

In Pharo (4+5) this does not work (file not copied, no error message)
OSProcess command: 'cp  /Library/WebServer/Documents/reports/bär.pdf /Library/WebServer/Documents/reports/test-a.pdf'.
                       
This works (file is copied):
OSProcess command: 'cp  /Library/WebServer/Documents/reports/bar.pdf /Library/WebServer/Documents/reports/test-b.pdf'.

It seems that German umlauts don't work.

Is there something I can do? I don't want to replace the umlauts in filenames...

Regards
Sabine

Re: OSProcess command with german umlaut does not work

HilaireFernandes
Hello Sabine,

Just a suggestion.
If your string is not pure ASCII, you may need to convert it to UTF-8
first, as this is likely what your host OS expects.

Indeed, Pharo strings are not internally encoded as UTF-8.

Check the UTF8TextConverter class to do so.

Hilaire


--
Dr. Geo
http://drgeo.eu



Re: OSProcess command with german umlaut does not work

Sven Van Caekenberghe-2

> On 05 Jun 2016, at 21:07, Hilaire <[hidden email]> wrote:
>
> Hello Sabine,
>
> Just a suggestion.
> If your string is not pure ascii you may need to convert it to UTF8
> first as this is what likely expect your OS host.
>
> Indeed Pharo string are not internally encoded as utf8

If that is the case, that the OS expects a different encoding, then the OS(Sub)Process implementation should deal with it, not the user of the API.

> Check the UTF8TextConverter class to do so.

No, these converters are conceptually wrong, since they encode String to String, while it should be String to ByteArray.

The ZnCharacterEncoder hierarchy should be used instead.

String>>#utf8Encoded is a convenience method that can be used to do a quick conversion.
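
As a minimal sketch of the String to ByteArray conversion described above, assuming a Pharo image in which the Zinc character-encoding classes are loaded (the variable names are illustrative only), the following encodes a file name once with the convenience method and once through an encoder object:

```smalltalk
| name bytes |
name := 'bär.pdf'.

"Convenience method: a String goes in, a ByteArray comes out."
bytes := name utf8Encoded.

"The same conversion through the ZnCharacterEncoder hierarchy."
(ZnUTF8Encoder new encodeString: name) = bytes.

"Decoding the bytes gives the original String back."
bytes utf8Decoded = name.
```

Evaluated in a playground, both comparisons should answer true; the point is only that the encoded form is a ByteArray, not another String.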


Re: OSProcess command with german umlaut does not work

David T. Lewis

Re: OSProcess command with german umlaut does not work

Sabine Manaa
Hi Dave, 

I get the German ä with:

(Character value: 228) asString

Do you want me to dig into it and suggest a solution, or would you prefer to try a fix that I then test?

Thanks for helping!

Regards Sabine


Re: OSProcess command with german umlaut does not work

Sabine Manaa
Hi Sven,

why ByteArray?

does not work (Improper store into indexable object):
OSProcess   command: ('cp  /Library/WebServer/Documents/reports/bär.pdf /Library/WebServer/Documents/reports/test-a.pdf' utf8Encoded asString).
 
works:
OSProcess   command: ('cp  /Library/WebServer/Documents/reports/bär.pdf /Library/WebServer/Documents/reports/test-a.pdf' utf8Encoded)

Perhaps David can add this here:

command: aCommandString
    "Run a command in a shell process. Similar to the system(3) call in the standard C library,
    except that aCommandString runs asynchronously in a child process. The command is
    run by a ConnectedUnixProcess in order to facilitate command pipelines within Squeak."

    "UnixProcess thisOSProcess command: 'ls -l /etc'"

    | proc |
    pid isNil
        ifTrue: [self class noAccessorAvailable. ^nil]
        ifFalse: [
            proc := self
                forkJob: ExternalUnixOSProcess defaultShellPath
                arguments: (Array with: '-c' with: aCommandString utf8Encoded asString) "<<<=== proposed change"
                environment: nil
                descriptors: nil.
            proc ifNil: [self class noAccessorAvailable].
            ^ proc]


regards
Sabine
 
 


Re: OSProcess command with german umlaut does not work

David T. Lewis
Hi Sabine,

That's great that #utf8Encoded is working, thanks for confirming.

I'll look and see if I can add that to OSProcess (I'm traveling and cannot
look at it right now).

Mariano - this thread probably applies to OSSubProcess also.

Dave


Re: OSProcess command with german umlaut does not work

NorbertHartl
Dave,

> On 06.06.2016 at 18:13, David T. Lewis <[hidden email]> wrote:
>
> Hi Sabine,
>
> That's great that #utf8Encoded is working, thanks for confirming.
>
> I'll look and see if I can add that to OSProcess (I'm traveling and cannot
> look at it right now).
>
> Mariano - this thread probably applies to OSSubProcess also.
>
That would only work if the system locale is UTF-8, right? Wouldn't it be better to make that a setting?

Norbert


Re: OSProcess command with german umlaut does not work

David T. Lewis
Norbert,

You are probably right. I'm not sure of the best way to handle it.

Dave


Re: OSProcess command with german umlaut does not work

Mariano Martinez Peck
Hi Dave, Sabine, Norbert et al.,

A few weeks (months?) ago I was also reviewing this topic of encoding in OS(Sub)Process. After searching the web a bit, I found that the simplest and most accurate solution was indeed to set the correct locale and/or text encoding on the computer in question. Anyway, more answers below.

Now, what I don't understand about Sabine's report is that she said this one works:

OSProcess   command: ('cp  /Library/WebServer/Documents/reports/bär.pdf /Library/WebServer/Documents/reports/test-a.pdf' utf8Encoded)

But then my question is: does that work only because her computer's locale is UTF-8? Or does Unix* automatically decode it and know it is UTF-8?
If not, should I adapt the #utf8Encoded to the encoding defined by the terminal? Hmm.


In my OSX box I do have UTF8 set:

$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=





Re: OSProcess command with german umlaut does not work

Sabine Manaa
In reply to this post by Sabine Manaa
Sorry, I made a mistake: I had the two cases reversed.
asString is needed.

works:
OSProcess   command: ('cp  /Library/WebServer/Documents/reports/bär.pdf /Library/WebServer/Documents/reports/test-a.pdf' utf8Encoded asString).
 
does not work (Improper store into indexable object):
OSProcess   command: ('cp  /Library/WebServer/Documents/reports/bär.pdf /Library/WebServer/Documents/reports/test-a.pdf' utf8Encoded)

I can try on Windows tomorrow if you want.
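
The corrected workaround can be sketched as follows. This is only an illustration built from the calls already shown in this thread; the variable names are made up, and it assumes the host shell interprets file names as UTF-8:

```smalltalk
| source target command |
source := '/Library/WebServer/Documents/reports/bär.pdf'.
target := '/Library/WebServer/Documents/reports/test-a.pdf'.
command := 'cp ', source, ' ', target.

"Encode to UTF-8 bytes first, then wrap those bytes in a String,
 because OSProcess command: expects a String argument."
OSProcess command: command utf8Encoded asString.
```

Unary messages bind more tightly than keyword messages, so the whole expression `command utf8Encoded asString` is the argument of `command:` and no extra parentheses are needed.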



Re: OSProcess command with german umlaut does not work

Sven Van Caekenberghe-2
In reply to this post by Sabine Manaa

> On 06 Jun 2016, at 17:22, Sabine Manaa <[hidden email]> wrote:
>
> why ByteArray?

http://www.unicode.org/faq/utf_bom.html

A Unicode transformation format (UTF) is an algorithmic mapping from every Unicode code point (except surrogate code points) to a unique byte sequence.

https://en.wikipedia.org/wiki/UTF-8

UTF-8 encodes each of the 1,112,064 valid code points in the Unicode code space (1,114,112 code points minus 2,048 surrogate code points) using one to four 8-bit bytes (a group of 8 bits is known as an octet in the Unicode Standard).

In Pharo

https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html

Of course, given a ByteArray, whose values are all between 0 and 255 by definition, you can convert it to a ByteString. That String is not a correct (Pharo) String anymore, it is like converting a PNG or JPEG to String, you can do it, it is just wrong.

When talking to the outside world, be it over a network connection or via primitive calls, anything but pure ASCII strings needs an encoding. This has to be agreed upon by both parties. If the receiving party wants UTF-8 forced into a (kind of) String, that is (still) possible.

Your initial solution seems to indicate that this is expected. This (ugly) conversion should be done at as low a level as possible, IMHO.
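
To make the "kind of String" point concrete, here is a small sketch (illustrative variable names; it assumes a Pharo image where #utf8Encoded and #utf8Decoded are available) showing that the forced String no longer holds the original characters, only one character per UTF-8 byte:

```smalltalk
| original forced |
original := 'bär'.

"UTF-8 bytes forced back into a String, one Character per byte."
forced := original utf8Encoded asString.

original size.    "3 characters"
forced size.      "4 characters, because ä became two bytes"
forced = original.    "false: the text itself has changed"

"The original text is recoverable only by treating the String as bytes again."
forced asByteArray utf8Decoded = original.    "true"
```

Such a String is only useful as a container for handing bytes to an API that insists on a String; it should not be displayed or compared as text.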

Sven



Re: OSProcess command with german umlaut does not work

Sabine Manaa
Hi Sven,

Thank you very much for your explanation. I will read the Pharo book chapter again tomorrow morning.
Each time I have to deal with encoding, I have to start reading all over again... ;-(

I was not asking about the reason for encoding; I asked because OSProcess command: needs a String and not a ByteArray. But yes, sure: first encode it and then convert it back to a String.

Regards and a nice evening
Sabine


Re: OSProcess command with german umlaut does not work

David T. Lewis
In reply to this post by Sven Van Caekenberghe-2

Hi Sven,

Thanks for this concise summary. I think perhaps what is conceptually
a problem in my OSProcess implementation is that I allow command arguments
to be given in the form of Strings, then pass the byte array contents of
those Squeak/Pharo Strings to a Unix shell or to an exec() system call.
This is convenient from my point of view, because strings are very easy
to use, but it does not account for the differences in mapping from a
String to a byte array. It is the byte array that is actually used in
the calls to the operating system such as:

  UnixOSProcessAccessor>>primForkExec: executableFile
        stdIn: inputFileHandle
        stdOut: outputFileHandle
        stdErr: errorFileHandle
        argBuf: argVec
        argOffsets: argOffsets
        envBuf: envVec
        envOffsets: envOffsets
        workingDir: pathString

At this point, the argVec is composed of "strings" in the C sense of the
word, which really means that it contains byte array data from the Strings.
And of course, if the string encodings in the Squeak/Pharo strings do not
happen to match the string encodings of the operating system, then indeed
the byte arrays do not match and we get a "file not found" kind of problem.
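
The mismatch described here can be made visible with a short sketch. It assumes the file name is held in a ByteString (one byte per character, as for any Pharo String containing only Latin-1 characters) and that the operating system expects UTF-8; the variable names are illustrative:

```smalltalk
| name rawBytes utf8Bytes |
name := 'bär.pdf'.

"The bytes that reach the OS when the String contents are passed unchanged."
rawBytes := name asByteArray.

"The bytes a UTF-8 locale expects for the same file name."
utf8Bytes := name utf8Encoded.

rawBytes size.                   "7"
utf8Bytes size.                  "8"
rawBytes at: 2.                  "228, a single byte for ä"
utf8Bytes copyFrom: 2 to: 3.     "#[195 164], two bytes for ä"
```

Because the two byte sequences differ, a lookup using the raw bytes names a file that does not exist, which matches the silent failure reported at the start of this thread.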

My hope is that Mariano's assessment is correct, and that we can treat
this as the right way to handle the encoding match issues:

On Mon, Jun 06, 2016 at 01:59:21PM -0300, Mariano Martinez Peck wrote:
> Hi Dave, Sabine, Norbert et all,
>
> Few weeks (months?) ago I was also reviewing this topic of encoding a
> OS(Sub)Process. After surfing a bit the web, I found out the most simple
> and accurate answer/solution was indeed to set the correct locale and/or
> text encoding in the computer in question. Anyway...more answers below.

This certainly sounds like the Right Thing To Do if only it works :-)

Dave



Re: OSProcess command with german umlaut does not work

stepharo
In reply to this post by Sven Van Caekenberghe-2
Sven, could you update the class comments of those classes? We should
finish getting rid of them.

Your solution is much nicer.


Stef

