Windos UNICODE (?!?!?)

Eliot Miranda-2

Windos UNICODE (?!?!?)

Hi All, especially Windows persons,

I'm debugging the 64-bit VM in the context of Perf on win64. I've just noticed that we don't appear to be building a UNICODE VM?!?! I find no define of UNICODE in the win32 platform files and no define of UNICODE on the compiler command line in the build.winXXxYY/common makefiles. This is surely a mistake, isn't it? If this is a regression, when and why did this occur? I'm going to add -DUNICODE=1 to the Makefile.msvc makefiles. AFAICT this should also happen with the Cygwin Makefiles. But before I make changes that could affect lots of people I thought I should ask here. Anyone know when and why we dropped -DUNICODE from the Cygwin build command line?

_,,,^..^,,,_

best, Eliot

marcel.taeumel

Re: Windos UNICODE (?!?!?)

Hi Eliot,

I suppose we decided to make explicit use of the unicode functions. See https://devblogs.microsoft.com/oldnewthing/?p=40643 -- Maybe this is from a time when our source held both ANSI and UNICODE variants. These days, we could simplify the code by defining -DUNICODE=1 and skip using, e.g., "GetWindowTextW" because then "GetWindowText" will automatically choose the unicode versions.
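
A minimal sketch of that mechanism, assuming hypothetical functions `demoGetTitleA` and `demoGetTitleW` as stand-ins for a real A/W pair such as GetWindowTextA/GetWindowTextW (they are not Windows functions, and `DEMO_TCHAR` is not the real TCHAR). The point is only that the generic name, and the character type that goes with it, switch together when UNICODE is defined:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical ANSI variant: fills a char buffer, returns the length. */
size_t demoGetTitleA(char *buf, size_t n)
{
    assert(buf != NULL && n > 0);
    strncpy(buf, "Squeak", n - 1);
    buf[n - 1] = '\0';
    return strlen(buf);
}

/* Hypothetical wide variant: fills a wchar_t buffer, returns the length. */
size_t demoGetTitleW(wchar_t *buf, size_t n)
{
    assert(buf != NULL && n > 0);
    wcsncpy(buf, L"Squeak", n - 1);
    buf[n - 1] = L'\0';
    return wcslen(buf);
}

/* Same idea as the Windows headers: with UNICODE defined the generic
   name expands to the W function, otherwise to the A function. */
#ifdef UNICODE
#  define demoGetTitle demoGetTitleW
typedef wchar_t DEMO_TCHAR;
#else
#  define demoGetTitle demoGetTitleA
typedef char DEMO_TCHAR;
#endif

/* Code written against the generic name compiles either way, but its
   character type changes silently with the build flag. */
size_t titleLength(void)
{
    DEMO_TCHAR buf[32];
    return demoGetTitle(buf, sizeof buf / sizeof buf[0]);
}
```

Calling the W variant by name, as the VM sources apparently do today, avoids this dependence on the build flag at the cost of more verbose code.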

I think that -DUNICODE has never been part of any makefile.

Best,

Marcel

marcel.taeumel

Re: Windos UNICODE (?!?!?)

...on second thought: It might be better to first find all non-unicode calls, convert them, make sure everything works, then enable that UNICODE flag, and finally change all explicit unicode calls to "transparent" ones. :-)

Tobias Pape

Re: Windos UNICODE (?!?!?)

Hi

I think we ought to define UNICODE and _UNICODE.
-t
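
For context: `UNICODE` is the macro the Windows SDK headers test, while `_UNICODE` is the one the C runtime's `<tchar.h>` tests, so defining only one of them can leave generic-text code half switched. A small guard along these lines could catch that at compile time. This is a sketch only; `unicodeBuildMode` is a hypothetical helper name, not something from the VM sources:

```c
#include <assert.h>

/* Refuse to build when only one of the two macros is defined. */
#if defined(UNICODE) != defined(_UNICODE)
#  error "define UNICODE and _UNICODE together, or neither"
#endif

/* Hypothetical helper: 1 for a wide-character build, 0 for an ANSI build. */
int unicodeBuildMode(void)
{
#if defined(UNICODE)
    return 1;
#else
    return 0;
#endif
}
```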

Nicolas Cellier

Re: Windos UNICODE (?!?!?)

I think we already have done most of the work to enable this... minheadless is UNICODE already.

Tobias Pape

Re: Windos UNICODE (?!?!?)

> On 08.05.2020, at 09:52, Nicolas Cellier <[hidden email]> wrote:
>
> +1
>
> I think we already have done most of the work to enable this... minheadless is UNICODE already.

I remember breaking my head when trying this in my Windows branches.
At least the SSL stuff should work ;)

-t

Nicolas Cellier

Re: Windos UNICODE (?!?!?)

On Fri, 8 May 2020 at 09:55, Tobias Pape <[hidden email]> wrote:

> I remember breaking my head when trying in my windows branches.
> at least the ssl stuff should work ;)

And all the hiccups around July 2016 show that the path of converting legacy code is strewn with traps:

https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/6437fb3788a373d9ed733b9c2e23c500fed98505

This was good work; sorry for making you revert it, but it had to be finished/polished...

If we go to UNICODE, we must check:

- whether the primitives or plugins are implicitly using the W variant

- how the string arguments are interpreted (converted from some encoding to UTF16? passed directly as if UTF16?)

- what the image is passing to the primitive (this must be done in Squeak/Cuis and maybe Pharo)

The strategy we have chosen so far is to

- encode most strings passed to the underlying OS as UTF8

- let the primitives convert to UTF16 on win32

- let the primitive explicitly invoke the W variant

Similarly, strings returned are converted back to UTF8.

This makes life simpler on the image side, because it is platform independent (in the spirit of a VM).
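
A minimal sketch of the conversion step in that strategy, under stated assumptions. `utf8ToUtf16` is a hypothetical helper, not a function from the VM sources; on win32 a primitive would more likely let the OS do this (MultiByteToWideChar with CP_UTF8) and then call the W variant explicitly. The hand-written decoder below is only a portable stand-in so the example runs anywhere, and it skips some validation (overlong forms, encoded surrogates):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a NUL-terminated UTF-8 string to UTF-16 code units.
   Returns the number of units written (terminator excluded),
   or -1 on malformed input or when dst is too small. */
int utf8ToUtf16(const char *src, uint16_t *dst, size_t dstLen)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    while (*s) {
        uint32_t cp;
        int extra;

        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else return -1;               /* stray continuation or invalid lead byte */
        s++;
        while (extra-- > 0) {
            if ((*s & 0xC0) != 0x80) return -1;   /* truncated sequence */
            cp = (cp << 6) | (uint32_t)(*s++ & 0x3F);
        }
        if (cp > 0x10FFFF) return -1;
        if (cp >= 0x10000) {          /* outside the BMP: surrogate pair */
            if (out + 2 >= dstLen) return -1;
            cp -= 0x10000;
            dst[out++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[out++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (out + 1 >= dstLen) return -1;
            dst[out++] = (uint16_t)cp;
        }
    }
    if (out >= dstLen) return -1;     /* no room for the terminator */
    dst[out] = 0;
    return (int)out;
}
```

The reverse direction, for strings returned by the OS, would mirror this: read UTF-16 units, recombine surrogate pairs, and emit UTF-8 bytes.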

We could as well have chosen a platform specific format for passing strings (UTF16 would be the choice on win32), and let image delegate the encoding to OSPlatform current, or something like that.

From the VM perspective, we could offer a set of functions for interpreting an input Smalltalk object as:

- ByteString = ISO 8859-1 (Latin-1) to fit the image

- ByteArray = utf8 encoded bytes

- DoubleByteArray = utf16 encoded bytes

- WordArray (including WideString) = ucs-4

and converting to various platform encodings...

Most of these functions would use native capabilities of the OS, so it's not a lot of work.

This versatility would leave the decision to trade uniformity (UTF-8) for efficiency to the image side.
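
One way such a family of functions might be organised, as a sketch only: the enum `InputKind` and the function `utf16UnitsNeeded` are hypothetical names, and the code assumes well-formed input. It answers the first question a win32 primitive would have for each kind of object, namely how many UTF-16 code units to allocate:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags for the four input interpretations listed above. */
typedef enum {
    InputByteString,      /* ISO 8859-1, one byte per character */
    InputByteArray,       /* UTF-8 encoded bytes */
    InputDoubleByteArray, /* UTF-16 code units */
    InputWordArray        /* UCS-4, includes WideString */
} InputKind;

/* Number of UTF-16 code units needed to hold `count` elements of `data`,
   where an element is a byte, a 16-bit unit or a 32-bit word depending on
   `kind`. Assumes the input is well formed. */
size_t utf16UnitsNeeded(InputKind kind, const void *data, size_t count)
{
    size_t i, units = 0;

    assert(data != NULL || count == 0);
    switch (kind) {
    case InputByteString:       /* every Latin-1 byte is one BMP code point */
    case InputDoubleByteArray:  /* already UTF-16 */
        return count;
    case InputByteArray: {
        const unsigned char *p = data;
        for (i = 0; i < count; i++) {
            if ((p[i] & 0xC0) != 0x80) units++;  /* count lead bytes only */
            if ((p[i] & 0xF8) == 0xF0) units++;  /* 4-byte form: surrogate pair */
        }
        return units;
    }
    case InputWordArray: {
        const uint32_t *p = data;
        for (i = 0; i < count; i++)
            units += (p[i] >= 0x10000) ? 2 : 1;
        return units;
    }
    }
    return 0;
}
```

The actual conversions would hang off the same switch, which keeps the per-platform part (which target encoding the OS wants) separate from the per-object part (how the bytes are to be read).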

Thoughts?

marcel.taeumel

Re: Windos UNICODE (?!?!?)

I also saw several uses of TCHAR that might break when TCHAR suddenly becomes wchar_t instead of char.
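
A sketch of one typical way this goes wrong, using the hypothetical names `WIDE_TCHAR` and `DemoPath` as stand-ins for TCHAR and a path buffer in a UNICODE build. With char, the byte size and the element count of a buffer are the same number; with wchar_t they are not, so any code that passes `sizeof` where a length in characters is expected starts to overstate the buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for TCHAR as it would be in a UNICODE build. */
typedef wchar_t WIDE_TCHAR;

#define DEMO_PATH_CHARS 260

/* Hypothetical path buffer type. */
typedef WIDE_TCHAR DemoPath[DEMO_PATH_CHARS];

/* Size in bytes: what sizeof yields. */
size_t pathBufferBytes(void)
{
    return sizeof(DemoPath);
}

/* Size in characters: what a length parameter counted in characters expects. */
size_t pathBufferChars(void)
{
    return sizeof(DemoPath) / sizeof(WIDE_TCHAR);
}
```

Auditing for plain `sizeof(buffer)` arguments, and for casts between `char *` and TCHAR pointers, would be a reasonable first pass before flipping the flag.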

Best,

Marcel
