Re: [Pharo-users] CommandLine handler and UTF8 path


Sven Van Caekenberghe-2
Command line arguments enter the image level via VirtualMachine>>#getSystemAttribute:

At that point they are already Strings.

In your case, they must already be wrong at that point.

> On 20 Jan 2015, at 16:51, Hilaire <[hidden email]> wrote:
>
> Le 20/01/2015 16:34, Sven Van Caekenberghe a écrit :
>> No they are not - Strings and Characters in Pharo are using plain Unicode encoding internally.
>
> Thanks for the update, and the reference link.
>
> Hilaire
>
> --
> Dr. Geo - http://drgeo.eu
> iStoa - http://istoa.drgeo.eu
>
>



Eliot Miranda-2


On Tue, Jan 20, 2015 at 8:00 AM, Sven Van Caekenberghe <[hidden email]> wrote:
> Command line arguments enter the image level via VirtualMachine>>#getSystemAttribute:
>
> At that point they are already Strings.

ByteString, according to the primitive.  So if the shell supplies e.g. UTF-8 strings for command-line parameters, which the VM sees as bytes, then the ByteString instances answered by getSystemAttribute: would need decoding, right?
 
> In your case, they must already be wrong at that point.

Not necessarily.  The  getSystemAttribute: primitive doesn't do decoding.  Perhaps it should.
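
To make the point concrete, here is a minimal Pharo sketch of what an undecoded argument looks like inside the image and how it could be repaired there. It only simulates the primitive and does not call it; the temporary names are made up for illustration, and it assumes the shell supplied UTF-8, which is not guaranteed on every system.

```smalltalk
| original raw decoded |
original := 'élève-Français'.
"Simulate an undecoded answer: a ByteString holding one Character per UTF-8 byte."
raw := original utf8Encoded asString.
"Reinterpret those Characters as bytes, then decode the bytes as UTF-8."
decoded := raw asByteArray utf8Decoded.
"raw is longer than original because each accented letter takes two bytes in UTF-8."
{ raw size > original size. decoded = original }
```

If the bytes really are UTF-8, both elements of the resulting array should be true. If the shell used another encoding, utf8Decoded may signal an error or answer the wrong characters.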


--
best,
Eliot

Sven Van Caekenberghe-2
Hi Eliot,

> On 20 Jan 2015, at 20:38, Eliot Miranda <[hidden email]> wrote:
>
> Not necessarily.  The  getSystemAttribute: primitive doesn't do decoding.  Perhaps it should.

Yes, probably. I just tried on Mac OS X, Pharo 4:

$ export FOO=élève-Français

$ echo $FOO
élève-Français

$ ./pharo Pharo.image eval 'OSPlatform current environment at: #FOO'
'élève-Français'

$ ./pharo Pharo.image eval '(OSPlatform current environment at: #FOO) asByteArray utf8Decoded'
'élève-Français'

The question is: is this true on all platforms? On Windows?




Jan Vrany
On Tue, 2015-01-20 at 21:35 +0100, Sven Van Caekenberghe wrote:

> The question is, is this true for all platforms ? Windows ?

Certainly not :-) On UNIX systems the encoding is undefined, AFAIK.
You may want to consult the locale before deciding on an encoding;
this works quite well. nl_langinfo() is the function to ask.
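
As a rough image-side approximation of that idea, the sketch below guesses the codeset from the usual POSIX locale variables instead of calling nl_langinfo(CODESET) itself, which would need an FFI binding. It assumes that OSPlatform current environment understands at:ifAbsent: with String keys, and that the locale value has the common language_TERRITORY.codeset shape; the temporary names are illustrative only.

```smalltalk
| env locale codeset |
env := OSPlatform current environment.
"POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG."
locale := #('LC_ALL' 'LC_CTYPE' 'LANG')
    inject: ''
    into: [ :found :name |
        found isEmpty
            ifTrue: [ env at: name ifAbsent: [ '' ] ]
            ifFalse: [ found ] ].
"In a value such as fr_FR.UTF-8 the codeset follows the dot; drop any @modifier."
codeset := (locale includes: $.)
    ifTrue: [ (locale copyAfter: $.) copyUpTo: $@ ]
    ifFalse: [ 'unknown' ].
codeset
```

A locale without an explicit codeset (plain C or fr_FR, say) answers 'unknown' here, whereas nl_langinfo(CODESET) would still report the locale's default, so this is a stopgap rather than a replacement for asking the C library.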

Best, Jan





Eliot Miranda-2


On Tue, Jan 20, 2015 at 12:35 PM, Sven Van Caekenberghe <[hidden email]> wrote:
> The question is, is this true for all platforms ? Windows ?

I'm trying to test this in Pharo 3. I get:

KeyNotFound: key #FOO not found in PlatformIndependentEnvironment
PlatformIndependentEnvironment(OSEnvironment)>>at: in Block: [ KeyNotFound signalFor: aKey ]
UndefinedObject>>ifNil:
PlatformIndependentEnvironment(OSEnvironment)>>at:ifAbsent:
PlatformIndependentEnvironment(OSEnvironment)>>at:
UndefinedObject>>DoIt
OpalCompiler>>evaluate
OpalCompiler(AbstractCompiler)>>evaluate:
SmalltalkImage>>evaluate:
EvaluateCommandLineHandler>>evaluate: in Block: [ ...
BlockClosure>>on:do: 


Does the environment access depend on NativeBoost?








--
best,
Eliot

Sven Van Caekenberghe-2

> On 21 Jan 2015, at 23:25, Eliot Miranda <[hidden email]> wrote:
>
> Does the environment access depend on NativeBoost?

Yes, I believe so.




philippeback
Unix:

environAt: index
    | address |
    address := NBExternalAddress value: (self environ nbUInt32AtOffset: index-1 * 4).
    address isNull ifTrue: [ ^ nil ].
    ^ address readString

Windows:

getEnv: aVariableName
    | valueSize buffer |
    valueSize := self getEnvSize: aVariableName.
    valueSize = 0
        ifTrue: [ ^ nil ].
    buffer := String new: valueSize.
    (self getEnv: aVariableName buffer: buffer size: valueSize) = (valueSize - 1)
        ifFalse: [ ^ nil ].
    ^ buffer allButLast

getEnvSize: nameString
    <primitive: #primitiveNativeCall module: #NativeBoostPlugin error: errorCode >
    ^ self nbCall: #( int GetEnvironmentVariableA ( String nameString, nil, 0 ) ) module: #Kernel32

On Windows,

    OSPlatform current environment at: #PATH

works.

So both implementations do indeed depend on NativeBoost (NB).

Phil









--
Philippe Back