Encoding of path/file names in security plugin


Nicolas Cellier
 
Hi all,
while reviewing compiler warnings from MSVC for the WIN-64 SPUR VM, I saw that we incorrectly pass a TCHAR * (via the fromSqueak() function) to a function (isAccessiblePathName) expecting a WCHAR * (that is, a UTF-16 encoded wide-character string).

TCHAR is a facade: it might be char, or it might be WCHAR, depending on the -DUNICODE compiler flag or a #define UNICODE preprocessor macro. So a TCHAR * is either a char * or a WCHAR *.
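
To illustrate that facade, here is a minimal sketch that compiles on any platform. DEMO_TCHAR, DEMO_WCHAR and DEMO_UNICODE are hypothetical stand-ins for TCHAR, WCHAR and UNICODE; on Windows the real typedefs come from the SDK headers, not from this code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for WCHAR: one UTF-16 code unit. */
typedef uint16_t DEMO_WCHAR;

/* DEMO_UNICODE plays the role of UNICODE: it silently changes
   what DEMO_TCHAR means for every file compiled with it. */
#ifdef DEMO_UNICODE
typedef DEMO_WCHAR DEMO_TCHAR;
#else
typedef char DEMO_TCHAR;
#endif

/* Size in bytes of one DEMO_TCHAR: 1 without DEMO_UNICODE, 2 with it. */
size_t demo_tchar_size(void)
{
    return sizeof(DEMO_TCHAR);
}
```

Because the same source line means two different types, a mismatch such as passing a narrow string where a wide one is expected only shows up as a warning in one of the two build configurations.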

fromSqueak() just copies the image-side ByteString to a null-terminated string. With -DUNICODE it copies to a WCHAR *, but this only works for ASCII strings.
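
A minimal sketch of why such a copy breaks on non-ASCII input, assuming the image bytes are UTF-8. naive_widen is a hypothetical name used for illustration only; it is not the actual fromSqueak() code, and uint16_t stands in for WCHAR so the sketch builds outside Windows.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy each source byte into one 16-bit unit without any decoding.
   Writes at most dstLen - 1 units plus a terminating 0.
   Returns the number of units written, excluding the terminator. */
size_t naive_widen(const char *src, size_t srcLen,
                   uint16_t *dst, size_t dstLen)
{
    size_t i;

    if (dstLen == 0)
        return 0;
    for (i = 0; i < srcLen && i + 1 < dstLen; i++)
        dst[i] = (uint16_t)(unsigned char)src[i];
    dst[i] = 0;
    return i;
}
```

For the two UTF-8 bytes C3 A9 (the letter é) this yields the two units 0x00C3 0x00A9, which read as "Ã©" in UTF-16, instead of the single unit 0x00E9. Pure ASCII names pass through unchanged, which is why the problem goes unnoticed for so long.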

I suggest instead interpreting every path passed to the VM as UTF-8 encoded, and to that end I've committed a change to sqWin32Security.c.
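
A portable sketch of what "interpret as UTF-8" amounts to: decode the bytes into code points and re-encode them as UTF-16. The function name utf8_to_utf16 is hypothetical and this is not the code from the commit. On Windows the same conversion is normally done with MultiByteToWideChar and CP_UTF8; whether the committed change does exactly that is not stated in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode UTF-8 and write null-terminated UTF-16 code units.
   Returns the number of units written (terminator excluded), or
   (size_t)-1 on malformed input or when dst is too small. */
size_t utf8_to_utf16(const char *src, size_t srcLen,
                     uint16_t *dst, size_t dstLen)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t i = 0, n = 0;

    while (i < srcLen) {
        uint32_t cp;
        size_t extra, k;
        unsigned char b = s[i++];

        /* The lead byte tells how many continuation bytes follow. */
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1;      /* invalid lead byte */

        if (extra > srcLen - i)
            return (size_t)-1;       /* truncated sequence */
        for (k = 0; k < extra; k++) {
            unsigned char c = s[i++];
            if ((c & 0xC0) != 0x80)
                return (size_t)-1;   /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }

        /* Reject overlong forms, surrogates and out-of-range values. */
        if ((extra == 1 && cp < 0x80) || (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000) || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp < 0x10000) {
            if (n + 1 >= dstLen)     /* keep room for the terminator */
                return (size_t)-1;
            dst[n++] = (uint16_t)cp;
        } else {
            if (n + 2 >= dstLen)
                return (size_t)-1;
            cp -= 0x10000;           /* encode as a surrogate pair */
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    if (n >= dstLen)
        return (size_t)-1;
    dst[n] = 0;
    return n;
}
```

With this, the bytes C3 A9 become the single unit 0x00E9, and a character outside the Basic Multilingual Plane becomes a surrogate pair, so the wide string handed to the Windows API names the file the image meant.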

But I see that other platform flavours (unix at least) require the same kind of change. Any thoughts before I proceed?




Re: Encoding of path/file names in security plugin

alistairgrant
 
Hi Nicolas,

On Thu, 27 Dec 2018 at 17:05, Nicolas Cellier
<[hidden email]> wrote:

>
>
> Hi all,
> while reviewing compiler warnings from MSVC for the WIN-64 SPUR VM, I saw that we incorrectly pass a TCHAR * (via the fromSqueak() function) to a function (isAccessiblePathName) expecting a WCHAR * (that is, a UTF-16 encoded wide-character string).
>
> TCHAR is a facade: it might be char, or it might be WCHAR, depending on the -DUNICODE compiler flag or a #define UNICODE preprocessor macro. So a TCHAR * is either a char * or a WCHAR *.
>
> fromSqueak() just copies the image-side ByteString to a null-terminated string. With -DUNICODE it copies to a WCHAR *, but this only works for ASCII strings.
>
> I suggest instead interpreting every path passed to the VM as UTF-8 encoded, and to that end I've committed a change to sqWin32Security.c.
> https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/b52caab76f7f6b91c1f16d9037e0b0a43d968176

I actually thought this was what the VM always did, i.e. paths should
be UTF-8 encoded in the image before being passed to a plugin.


> But I see that other platform flavours (unix at least) require the same kind of change. Any thoughts before I proceed?

For Linux and Mac there is:

- ux2sqPath(char *from, int fromLen, char *to, int toLen, int term)
- sq2uxPath(char *from, int fromLen, char *to, int toLen, int term)

On Mac this takes care of using the Mac specific decomposed UTF-8
encoding required on that platform.  Both of these assume that the
image is passing down (and receiving) a UTF-8 encoded string.  See
platforms/unix/vm/sqUnixCharConv.h.
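
To show why the decomposed form matters, here is a small sketch with the two UTF-8 spellings of "é". The names demo_nfc, demo_nfd and same_path_bytes are hypothetical; this does not reproduce what sq2uxPath or ux2sqPath do internally, it only shows the byte-level difference they have to bridge.

```c
#include <stddef.h>
#include <string.h>

/* "é" precomposed: the single code point U+00E9. */
const char demo_nfc[] = "\xC3\xA9";

/* "é" decomposed: U+0065 'e' followed by U+0301 COMBINING ACUTE ACCENT. */
const char demo_nfd[] = "e\xCC\x81";

/* Byte-wise comparison, which is what a plain strcmp on paths does.
   Returns 1 when both strings have identical bytes, 0 otherwise. */
int same_path_bytes(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    return la == lb && memcmp(a, b, la) == 0;
}
```

Both strings render as the same character, yet same_path_bytes(demo_nfc, demo_nfd) returns 0. A name that comes back from the file system in one form will therefore not compare equal to the same name typed in the image in the other form unless one side is normalized first.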


Also, Tobias wrote a helper function for long file names on Windows:

https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/476f70605a0352dd7528d251f7403e9233716cdb/platforms/win32/plugins/FilePlugin/sqWin32File.h#L33


HTH,
Alistair

Re: Encoding of path/file names in security plugin

Nicolas Cellier
 


On Thu, 27 Dec 2018 at 17:40, Alistair Grant <[hidden email]> wrote:
 
> Hi Nicolas,
>
> On Thu, 27 Dec 2018 at 17:05, Nicolas Cellier
> <[hidden email]> wrote:
> >
> >
> > Hi all,
> > while reviewing compiler warnings from MSVC for the WIN-64 SPUR VM, I saw that we incorrectly pass a TCHAR * (via the fromSqueak() function) to a function (isAccessiblePathName) expecting a WCHAR * (that is, a UTF-16 encoded wide-character string).
> >
> > TCHAR is a facade: it might be char, or it might be WCHAR, depending on the -DUNICODE compiler flag or a #define UNICODE preprocessor macro. So a TCHAR * is either a char * or a WCHAR *.
> >
> > fromSqueak() just copies the image-side ByteString to a null-terminated string. With -DUNICODE it copies to a WCHAR *, but this only works for ASCII strings.
> >
> > I suggest instead interpreting every path passed to the VM as UTF-8 encoded, and to that end I've committed a change to sqWin32Security.c.
> > https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/b52caab76f7f6b91c1f16d9037e0b0a43d968176
>
> I actually thought this was what the VM always did, i.e. paths should
> be UTF-8 encoded in the image before being passed to a plugin.

I agree, but apparently the security plugin was left out of these changes.
It's not a concern for Pharo: this plugin is only used for sandboxing, I think mainly for E-toys.
I think the up-to-date technique would be to use some kind of virtualization, for example through Docker.


> > But I see that other platform flavours (unix at least) require the same kind of change. Any thoughts before I proceed?
>
> For Linux and Mac there is:
>
> - ux2sqPath(char *from, int fromLen, char *to, int toLen, int term)
> - sq2uxPath(char *from, int fromLen, char *to, int toLen, int term)
>
> On Mac this takes care of using the Mac specific decomposed UTF-8
> encoding required on that platform.  Both of these assume that the
> image is passing down (and receiving) a UTF-8 encoded string.  See
> platforms/unix/vm/sqUnixCharConv.h.
>
>
> Also, Tobias wrote a helper function for long file names on Windows:
>
> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/476f70605a0352dd7528d251f7403e9233716cdb/platforms/win32/plugins/FilePlugin/sqWin32File.h#L33
>
>
> HTH,
> Alistair

Thanks for the reminder. After an interruption of a year or so, this will reduce the warm-up time ;)