Re: Win32OSProcessPlugin question (was: Better management of encoding of environment variables)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Win32OSProcessPlugin question (was: Better management of encoding of environment variables)

Eliot Miranda-2
 
Hi David,

  why in the plugin and not the base vm?

_,,,^..^,,,_ (phone)

> On Jan 22, 2019, at 9:57 AM, David T. Lewis <[hidden email]> wrote:
>
>
> Hi Nicolas,
>
> Motivated by the discussion on Pharo list, I added new primitives to
> OSProcessPlugin to answer environment variables and path information as
> raw ByteArray, such that those byte arrays can be converted in the
> image to strings with any encoding.
>
> I did the changes in "trunk" OSPP, and am now implementing them in the
> oscog branch. I have tested the Unix changes in both branches.
>
> Unfortunately I do not have a Windows development at the moment, so I
> need to ask for help.
>
> In Win32OSProcessPlugin, I am adding two primitives:
>  #primitiveGetCurrentWorkingDirectoryAsBytes
>  #primitiveGetEnvironmentStringsAsBytes
>
> These are based on the two string primitives, which remain available:
>  #primitiveGetCurrentWorkingDirectory
>  #primitiveGetEnvironmentStrings
>
> The string primitives include your recent changes for UTF8 encoding,
> and the "xxxAsBytes" variants are based on my earlier logic without
> the UTF8 support.
>
> This means that OSProcess on Windows will use your UTF8 string support,
> and the additional primitives (based on the old logic, and answering
> raw bytes) will be available for people who want to do the encoding
> work in the image.
>
> Does this sound like the right approach?
>
> Thanks,
> Dave
>
>
> I have a question concerning the Win32 changes.
>
>
>> On Wed, Jan 16, 2019 at 10:24:51AM +0100, Nicolas Cellier wrote:
>> IMO, windows VM (and plugins) should do the UCS2 -> UTF8 conversion because
>> the purpose of a VM is to provide an OS independant fa??ade.
>> I made progress recently in this area, but we should finish the
>> job/test/consolidate.
>> If someone bypass the VM and use direct windows API thru FFI, then he takes
>> the responsibility, but uniformity doesn't hurt.
>>
>> Le mer. 16 janv. 2019 ?? 10:14, Guillermo Polito <[hidden email]>
>> a ??crit :
>>
>>> Hi Stephan,
>>>
>>> I'm sorry for the noise.
>>>
>>> At the time, both #at: and #getEnv: variants existed. The changes
>>> backported from the PharoLauncher were only using the getter versions of
>>> getEnv, but for Pharo I decided to implement also the setter versions. And
>>> after checking the code and its users in image, I've finally decided to go
>>> for an at:[[ifAbsent]put:] version. So I'd say that the leading
>>> **guideline** was at the end the one here in the mailing list, but also if
>>> you check the PR I've introduced a more complete and consistent API,
>>> following the one of dictionaries.
>>>
>>> https://github.com/pharo-project/pharo/pull/1980/files
>>>
>>> at:
>>> at:ifAbsent:
>>> at:ifPresent:
>>> at:ifPresent:ifAbsent:
>>> at:put:
>>> removeKey:
>>>
>>> Plus, in *nix, variants where an encoding can be specified.
>>>
>>> I'm sorry if I've introduced some confussion.
>>>
>>>
>>> On Wed, Jan 16, 2019 at 9:47 AM Stephan Eggermont <[hidden email]>
>>> wrote:
>>>>
>>>> Guillermo Polito <[hidden email]>
>>>> wrote:
>>>>> Hi all,
>>>>>
>>>>> following the meeting we had here @Inria headquarters, I'll be
>>> backporting
>>>>> some of the improvements we did in the launcher this last month
>>> regarding
>>>>> the encoding of environment variables.
>>>>>
>>>>> I've opened for this issue https://pharo.fogbugz.com/f/cases/22658/
>>>>>
>>>>> We have already studied possible alternatives with Pablo and
>>> Christophe and
>>>>> we have some conclusions and we propose some changes:
>>>>>
>>>>> API Proposal for OSEnvironment
>>>>> =========================
>>>>>
>>>>>
>>>>>   -
>>>>> *at: aVariableName *
>>>>>
>>>>> Gets the String value of an environment variable called `aVariableName`
>>>>> It is the system reponsibility to manage the encoding.
>>>>> Rationale: A common denominator for all platforms providing an already
>>>>> decoded string, because windows does not (compared to *nix systems)
>>> provide
>>>>> a encoded byte representation of the value. Windows has instead its own
>>>>> wide string representation.
>>>>>
>>>>>   - *[optionally] rawAt: anEncodedVariableName*
>>>>>
>>>>> Gets the Byte value of an environment variable called
>>>>> `anEncodedVariableName`.
>>>>> It is the user responsibility to encode and decode argument and return
>>>>> values in the encoding of this preference.
>>>>> Rationale: Some systems may want to have the liberty to use different
>>>>> encodings, or even to put binary data in the variables.
>>>>>
>>>>>   - *[optionally] at: aVariableName encoding: anEncoding*
>>>>>
>>>>> Gets the value of an environment variable called `aVariableName` using
>>>>> `anEncoding` to encode/decode arguments and return values.
>>>>> Rationale: *xes could potentially use different encodings for their
>>>>> environment variables or even use different encodings in different
>>> parts of
>>>>> their file system.
>>>>>
>>>>> Other Implementation details
>>>>> =========================
>>>>>
>>>>>   - VM primitives returning paths Strings should be carefuly managed
>>> to
>>>>>   decode them, since they are actually C strings (so byte arrays)
>>> disguised
>>>>>   as ByteStrings.
>>>>>   - Windows requires calling the right *Wide version of the functions
>>> from
>>>>>   C, plus the correct encoding routine. This could be implemented as
>>> an FFI
>>>>>   call or by modifying the VM to do it properly instead of calling
>>> the Ascii
>>>>>   version
>>>>>
>>>>>
>>>>
>>>> What is the conclusion from this and issue 22658? See PR 2238. #getEnv:
>>> is
>>>> public API
>>>>
>>>> Stephan
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>>
>>>
>>> Guille Polito
>>>
>>> Research Engineer
>>>
>>> Centre de Recherche en Informatique, Signal et Automatique de Lille
>>>
>>> CRIStAL - UMR 9189
>>>
>>> French National Center for Scientific Research - http://www.cnrs.fr
>>>
>>>
>>> Web: http://guillep.github.io
>>>
>>> Phone: +33 06 52 70 66 13
>>>
Reply | Threaded
Open this post in threaded view
|

Re: Win32OSProcessPlugin question (was: Better management of encoding of environment variables)

David T. Lewis
 
On Tue, Jan 22, 2019 at 10:05:23AM -0800, Eliot Miranda wrote:
>  
> Hi David,
>
>   why in the plugin and not the base vm?

Because, as the author of OSPP and OSProcess, I have a personal
interest in tending that garden. Sven explained why it was important
to be able to do string encoding in the image, and I decided to see
how hard it would be to support that. It was easy, just some simple
refactoring in Smalltalk :-)

As an aside, I would not really recommend putting primitives like
these into the core VM. The underlying concepts are basically
Unix ideas that have taken root on Windows. They probably don't
make much sense on RiscOS, Android or SqueakJS runtime environments.

$0.02,

Dave

>
> _,,,^..^,,,_ (phone)
>
> > On Jan 22, 2019, at 9:57 AM, David T. Lewis <[hidden email]> wrote:
> >
> >
> > Hi Nicolas,
> >
> > Motivated by the discussion on Pharo list, I added new primitives to
> > OSProcessPlugin to answer environment variables and path information as
> > raw ByteArray, such that those byte arrays can be converted in the
> > image to strings with any encoding.
> >
> > I did the changes in "trunk" OSPP, and am now implementing them in the
> > oscog branch. I have tested the Unix changes in both branches.
> >
> > Unfortunately I do not have a Windows development at the moment, so I
> > need to ask for help.
> >
> > In Win32OSProcessPlugin, I am adding two primitives:
> >  #primitiveGetCurrentWorkingDirectoryAsBytes
> >  #primitiveGetEnvironmentStringsAsBytes
> >
> > These are based on the two string primitives, which remain available:
> >  #primitiveGetCurrentWorkingDirectory
> >  #primitiveGetEnvironmentStrings
> >
> > The string primitives include your recent changes for UTF8 encoding,
> > and the "xxxAsBytes" variants are based on my earlier logic without
> > the UTF8 support.
> >
> > This means that OSProcess on Windows will use your UTF8 string support,
> > and the additional primitives (based on the old logic, and answering
> > raw bytes) will be available for people who want to do the encoding
> > work in the image.
> >
> > Does this sound like the right approach?
> >
> > Thanks,
> > Dave
> >
> >
> > I have a question concerning the Win32 changes.
> >
> >
> >> On Wed, Jan 16, 2019 at 10:24:51AM +0100, Nicolas Cellier wrote:
> >> IMO, windows VM (and plugins) should do the UCS2 -> UTF8 conversion because
> >> the purpose of a VM is to provide an OS independant fa??ade.
> >> I made progress recently in this area, but we should finish the
> >> job/test/consolidate.
> >> If someone bypass the VM and use direct windows API thru FFI, then he takes
> >> the responsibility, but uniformity doesn't hurt.
> >>
> >> Le mer. 16 janv. 2019 ?? 10:14, Guillermo Polito <[hidden email]>
> >> a ??crit :
> >>
> >>> Hi Stephan,
> >>>
> >>> I'm sorry for the noise.
> >>>
> >>> At the time, both #at: and #getEnv: variants existed. The changes
> >>> backported from the PharoLauncher were only using the getter versions of
> >>> getEnv, but for Pharo I decided to implement also the setter versions. And
> >>> after checking the code and its users in image, I've finally decided to go
> >>> for an at:[[ifAbsent]put:] version. So I'd say that the leading
> >>> **guideline** was at the end the one here in the mailing list, but also if
> >>> you check the PR I've introduced a more complete and consistent API,
> >>> following the one of dictionaries.
> >>>
> >>> https://github.com/pharo-project/pharo/pull/1980/files
> >>>
> >>> at:
> >>> at:ifAbsent:
> >>> at:ifPresent:
> >>> at:ifPresent:ifAbsent:
> >>> at:put:
> >>> removeKey:
> >>>
> >>> Plus, in *nix, variants where an encoding can be specified.
> >>>
> >>> I'm sorry if I've introduced some confussion.
> >>>
> >>>
> >>> On Wed, Jan 16, 2019 at 9:47 AM Stephan Eggermont <[hidden email]>
> >>> wrote:
> >>>>
> >>>> Guillermo Polito <[hidden email]>
> >>>> wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> following the meeting we had here @Inria headquarters, I'll be
> >>> backporting
> >>>>> some of the improvements we did in the launcher this last month
> >>> regarding
> >>>>> the encoding of environment variables.
> >>>>>
> >>>>> I've opened for this issue https://pharo.fogbugz.com/f/cases/22658/
> >>>>>
> >>>>> We have already studied possible alternatives with Pablo and
> >>> Christophe and
> >>>>> we have some conclusions and we propose some changes:
> >>>>>
> >>>>> API Proposal for OSEnvironment
> >>>>> =========================
> >>>>>
> >>>>>
> >>>>>   -
> >>>>> *at: aVariableName *
> >>>>>
> >>>>> Gets the String value of an environment variable called `aVariableName`
> >>>>> It is the system reponsibility to manage the encoding.
> >>>>> Rationale: A common denominator for all platforms providing an already
> >>>>> decoded string, because windows does not (compared to *nix systems)
> >>> provide
> >>>>> a encoded byte representation of the value. Windows has instead its own
> >>>>> wide string representation.
> >>>>>
> >>>>>   - *[optionally] rawAt: anEncodedVariableName*
> >>>>>
> >>>>> Gets the Byte value of an environment variable called
> >>>>> `anEncodedVariableName`.
> >>>>> It is the user responsibility to encode and decode argument and return
> >>>>> values in the encoding of this preference.
> >>>>> Rationale: Some systems may want to have the liberty to use different
> >>>>> encodings, or even to put binary data in the variables.
> >>>>>
> >>>>>   - *[optionally] at: aVariableName encoding: anEncoding*
> >>>>>
> >>>>> Gets the value of an environment variable called `aVariableName` using
> >>>>> `anEncoding` to encode/decode arguments and return values.
> >>>>> Rationale: *xes could potentially use different encodings for their
> >>>>> environment variables or even use different encodings in different
> >>> parts of
> >>>>> their file system.
> >>>>>
> >>>>> Other Implementation details
> >>>>> =========================
> >>>>>
> >>>>>   - VM primitives returning paths Strings should be carefuly managed
> >>> to
> >>>>>   decode them, since they are actually C strings (so byte arrays)
> >>> disguised
> >>>>>   as ByteStrings.
> >>>>>   - Windows requires calling the right *Wide version of the functions
> >>> from
> >>>>>   C, plus the correct encoding routine. This could be implemented as
> >>> an FFI
> >>>>>   call or by modifying the VM to do it properly instead of calling
> >>> the Ascii
> >>>>>   version
> >>>>>
> >>>>>
> >>>>
> >>>> What is the conclusion from this and issue 22658? See PR 2238. #getEnv:
> >>> is
> >>>> public API
> >>>>
> >>>> Stephan
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>>
> >>>
> >>>
> >>> Guille Polito
> >>>
> >>> Research Engineer
> >>>
> >>> Centre de Recherche en Informatique, Signal et Automatique de Lille
> >>>
> >>> CRIStAL - UMR 9189
> >>>
> >>> French National Center for Scientific Research - http://www.cnrs.fr
> >>>
> >>>
> >>> Web: http://guillep.github.io
> >>>
> >>> Phone: +33 06 52 70 66 13
> >>>