Hi,
The dictionary

    OSPlatform current environment

contains a copy of the OS's environment variables (more correctly, those of the VM process) as key/value pairs.

These are obtained via the following system calls:

on macOS & *nix

    LIBC environ

on Windows

    KERNEL32 GetEnvironmentStrings

It is, however, a bit unclear how these are encoded. On macOS & *nix that seems to be UTF-8; on Windows there are some reports that it appears to be Latin-1. Both might be locale specific, I don't know either way.

Does anyone know for sure?

I furthermore think that OSEnvironment and its subclasses, which make this call, should be responsible for decoding the C strings into proper Pharo strings, and should not leave that responsibility to their users.

Fundamentally, in the following, the decoding is still not done correctly, and that is wrong/confusing IMHO.

$ FOO=benoît ./pharo Pharo.image eval 'OSEnvironment current associations'
{'TERM_PROGRAM'->'Apple_Terminal'.
'TERM'->'xterm-256color'.
'SHELL'->'/bin/bash'.
'TMPDIR'->'/var/folders/sy/sndrtj9j1tq06j0lfnshmrl80000gn/T/'.
'FOO'->'benoît'.
'Apple_PubSub_Socket_Render'->'/private/tmp/com.apple.launchd.uWk7pivcLT/Render'.
'TERM_PROGRAM_VERSION'->'404'.
'TERM_SESSION_ID'->'845BECCD-0AB0-4686-B7F9-3A0FF84BDCB7'.
'USER'->'sven'.
'SSH_AUTH_SOCK'->'/private/tmp/com.apple.launchd.y5oCwdUyaG/Listeners'.
'PATH'->'/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/texbin:/opt/X11/bin'.
'PWD'->'/tmp/benoiÌ‚t'.
'XPC_FLAGS'->'0x0'.
'XPC_SERVICE_NAME'->'0'.
'HOME'->'/Users/sven'.
'SHLVL'->'2'.
'LOGNAME'->'sven'.
'LC_CTYPE'->'UTF-8'.
'DISPLAY'->'/private/tmp/com.apple.launchd.lsgASYFiWW/org.macosforge.xquartz:0'.
'SECURITYSESSIONID'->'186a9'.
'OLDPWD'->'/tmp/benoiÌ‚t'.
'_'->'/tmp/benoiÌ‚t/pharo-vm/Pharo.app/Contents/MacOS/Pharo'.
'__CF_USER_TEXT_ENCODING'->'0x1F5:0x0:0x0'}

Of course, if we change this, we will need to fix callers.

Opinions?

Sven

PS: Furthermore, I note that there is a subtle difference in how $FOO and $PWD in the above are UTF-8 encoded. In the former, normalisation was done, in the latter not. Maybe that could lead to problems (when comparing/composing them). This is a difficult/complex subject (https://medium.com/concerning-pharo/an-implementation-of-unicode-normalization-7c6719068f43).
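To make the symptom in the output above concrete, here is a minimal sketch in Python (used purely as an illustration; it is not Pharo code and not what OSEnvironment does). It assumes the environment arrives as UTF-8 bytes, which is what Sven suspects for macOS. The function name `decode_environment` is hypothetical. Reading each byte as one Latin-1 character reproduces the garbled FOO value, while decoding once at the boundary gives the expected string.

```python
def decode_environment(raw_env, encoding="utf-8"):
    """Decode a mapping of raw byte keys/values into text strings.

    `raw_env` stands in for what the OS hands to the process; the
    encoding is an assumption that the caller has to supply.
    """
    return {
        key.decode(encoding): value.decode(encoding)
        for key, value in raw_env.items()
    }


# Bytes the process would see after `FOO=benoît` in a UTF-8 terminal (assumed).
raw_env = {b"FOO": "benoît".encode("utf-8")}

# Treating every byte as one character (Latin-1) gives the garbled value.
undecoded = raw_env[b"FOO"].decode("latin-1")

# Decoding once, at the boundary, gives the intended value.
decoded = decode_environment(raw_env)["FOO"]

print(undecoded)  # benoît
print(decoded)    # benoît
```

Doing the decoding in one place is the design Sven argues for: callers never see raw bytes dressed up as characters.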
I do remember clearly that, while debugging that problem, the %LOCALAPPDATA% environment variable at some point held that string encoded in Latin-1 (I'm on Windows 10, French version). Unfortunately, I have not been able to reproduce the exact sequence that led to that specific case.
----------------- Benoît St-Jean Yahoo! Messenger: bstjean Twitter: @BenLeChialeux Pinterest: benoitstjean Instagram: Chef_Benito IRC: lamneth Blogue: endormitoire.wordpress.com "A standpoint is an intellectual horizon of radius zero". (A. Einstein)
On Tuesday, April 17, 2018, 3:37:01 a.m. EDT, Sven Van Caekenberghe <[hidden email]> wrote:
In reply to this post by Sven Van Caekenberghe-2
It seems macOS normalizes UTF-8 differently from everyone else in file names (I think base character + combining mark instead of precomposed codepoint). That might affect PWD.

For environment variables, even if most sensible platforms should have adopted UTF-8 by now, I wouldn't be surprised if there's no official encoding whatsoever (i.e. they're just bytes with a 0 at the end…)
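If environment values really are just NUL-terminated bytes, any decoder has to cope with values that are not valid in the encoding it assumes. A small Python sketch of one way to handle that (illustration only; the function name `decode_env_value` and the fallback policy are assumptions, not something proposed in this thread): try strict UTF-8 first, then fall back to Latin-1, which maps every byte to a character and therefore never fails.

```python
def decode_env_value(raw, preferred="utf-8"):
    """Return (text, encoding_used) for one raw environment value."""
    try:
        return raw.decode(preferred, errors="strict"), preferred
    except UnicodeDecodeError:
        # Latin-1 maps each of the 256 byte values to one code point,
        # so this branch cannot raise; the text may still be wrong.
        return raw.decode("latin-1"), "latin-1"


print(decode_env_value("benoît".encode("utf-8")))    # ('benoît', 'utf-8')
print(decode_env_value("benoît".encode("latin-1")))  # ('benoît', 'latin-1')
```

The fallback is only a heuristic: a value in some other single-byte code page would decode without error but with the wrong characters.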
> On 17 Apr 2018, at 09:57, Damien Pollet <[hidden email]> wrote:
>
> It seems macOS normalizes UTF-8 differently from everyone else in file names (I think base character + combining mark instead of precomposed codepoint). That might affect PWD.
> For environment variables, even if most sensible platforms should have adopted UTF-8 by now, I wouldn't be surprised if there's no official encoding whatsoever (i.e. they're just bytes with a 0 at the end…)

;-)

We can decode everything, we have all the tools, but of course, we first have to know what encoding is being used. Hence my question.
2018-04-17 10:05 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
Interestingly, this is only for the dictionary operations (asDictionary, keysAndValuesDo...)
If you just access the variable with getEnv, it works:

OSPlatform current environment setEnv:'FOO' value:'benoît'.
OSPlatform current environment getEnv:'FOO'. "'benoît'"
OSPlatform current environment asDictionary at: 'FOO'. "'benoŒt'"
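One possible explanation for the 'benoŒt' result, offered as an assumption rather than anything confirmed in this thread: Microsoft's documentation notes that the ANSI version, GetEnvironmentStringsA, returns OEM characters, whereas a per-variable lookup uses the ANSI code page. On a Western European Windows the OEM code page is typically 850 and the ANSI code page 1252 (both assumed here). The Python sketch below (illustration only, not Pharo code) shows that î encoded in code page 850 and then read as code page 1252 comes out as Œ.

```python
original = "benoît"

# Bytes as they would look in the OEM code page (assumed: 850).
oem_bytes = original.encode("cp850")

# The same bytes interpreted in the ANSI code page (assumed: 1252).
misread = oem_bytes.decode("cp1252")

print(oem_bytes)  # b'beno\x8ct'
print(misread)    # benoŒt
```

If this is what happens, the dictionary path and the getEnv path would need different decoders unless both go through the UTF-16 variants.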
In reply to this post by Sven Van Caekenberghe-2
primitiveGetenv returns values in the current locale's code page on Windows; a value bound to € returns a string with the single char 128 on MS1252 (Western European) at least.

On Windows, there are three versions of each API call with string parameters/returns:

xxx (depending on UNICODE being defined, resolves to either *A or *W)
xxxA (ASCII or, more accurately, current code page)*
xxxW (UTF-16)

IIRC, the intention is that primitives receiving/sending char* to the image will expect/return UTF-8, so a conversion macro before passing it on to the syscall would be necessary on Windows; I believe one exists already and is used in at least the file plugin primitives.

Cheers,
Henry

* The Windows FFI fallback used if the primitive fails calls the *A version directly, and can be changed to call *W correctly, but there's a fair bit of wrapping fluff involved;
https://youtu.be/Um41DPPs5ZA?list=PL843D1D545F9F52B6&t=1591

--
Sent from: http://forum.world.st/Pharo-Smalltalk-Developers-f1294837.html
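The byte values behind this observation can be checked outside the VM. A Python sketch (illustration only; the real conversion would live in the VM's C code, which is not shown in this thread): € is the single byte 128 in Windows-1252, two bytes in UTF-16 as returned by the *W calls, and three bytes in UTF-8, which is what the image would expect from a primitive.

```python
euro = "€"

# What an *A call would yield under code page 1252.
ansi = euro.encode("cp1252")

# What a *W call would yield (UTF-16, little-endian on Windows).
wide = euro.encode("utf-16-le")

# Converting the wide string to UTF-8 before handing it to the image.
utf8 = wide.decode("utf-16-le").encode("utf-8")

print(list(ansi))  # [128]
print(list(wide))  # [172, 32]
print(list(utf8))  # [226, 130, 172]
```

Going through the *W variant keeps characters that the current code page cannot represent, which the *A variant would lose.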
In reply to this post by Nicolai Hess-3-2
> On 17 Apr 2018, at 10:40, Nicolai Hess <[hidden email]> wrote:
>
> Interestingly, this is only for the dictionary operations (asDictionary, keysAndValuesDo...)
> If you just access the variable with getEnv, it works:
>
> OSPlatform current environment setEnv:'FOO' value:'benoît'.
> OSPlatform current environment getEnv:'FOO'. "'benoît'"
> OSPlatform current environment asDictionary at: 'FOO'. "'benoŒt'"

Hmm, not for me (on macOS):

$ FOO=benoît ./pharo Pharo.image eval "OSPlatform current environment getEnv:'FOO'"
'benoît'

If you put it in yourself, are you not cheating then?
In reply to this post by Henrik Sperre Johansen
Yes, I also saw the *W variants. My initial reaction was that those would make more sense, but it is hard to see all the consequences.
In reply to this post by Damien Pollet
Damien Pollet wrote
> It seems macOS normalizes UTF-8 differently from everyone else in file
> names (I think base character + combining mark instead of precomposed
> codepoint). That might affect PWD.
> For environment variables, even if most sensible platforms should have
> adopted UTF-8 by now, I wouldn't be surprised if there's no official
> encoding whatsoever (i.e. they're just bytes with a 0 at the end…)

If by different you mean that it actually normalizes the file names, then yes.
All Mac filenames are in a well-defined form: NFD.
On Linux, they're just arrays of bytes, and anything goes.
That the bytes mostly happen to be valid UTF-8 strings in NFC is just a by-product of the fact that this is the format most programs use when calling the file primitives.

Cheers,
Henry
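The NFC/NFD difference is easy to see in a few lines. A Python sketch using the standard `unicodedata` module (illustration only, not Pharo code): the two forms look identical on screen but differ in length and compare unequal until they are normalized to the same form. Reading the NFD bytes as Windows-1252 also reproduces the 'benoiÌ‚t' pattern in the PWD value of Sven's output.

```python
import unicodedata

nfc = unicodedata.normalize("NFC", "benoît")  # precomposed î (U+00EE)
nfd = unicodedata.normalize("NFD", "benoît")  # i followed by combining U+0302

print(len(nfc), len(nfd))  # 6 7
print(nfc == nfd)          # False

# UTF-8 bytes of the NFD form, misread one byte per character.
print(nfd.encode("utf-8").decode("cp1252"))  # benoiÌ‚t

# Normalizing both sides to the same form makes the comparison succeed.
print(unicodedata.normalize("NFC", nfd) == nfc)  # True
```

This is why comparing $FOO with a path component taken from $PWD could fail even after both have been decoded correctly.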
Hi,

I think this problem is not exclusive to environment variables. It also affects file paths and others. So far Pharo does not detect the locale to perform the encoding, and it would be nice to do so.
> On 19 Apr 2018, at 10:21, Guillermo Polito <[hidden email]> wrote:
>
> Hi,
>
> I think this problem is not exclusive to environment variables. It also affects file paths and others. So far Pharo does not detect the locale to perform the encoding, and it would be nice to do so.

Sure, it would be nice/good/helpful to detect the locale (BTW, don't we have that already, more or less?).

But I would be surprised if an OS API delivered differently encoded data to a process depending on the locale, I mean in general. That would be setting things up for a huge disaster, IMHO. A modern OS should just deliver UTF-8 (full Unicode code points) and be done with it.