[UFFI] Dealing with Windows types


kilon.alios
So now I need to port my CPPBridge to Windows and I have to deal with this monster. And it would not be Windows if it did not make life a lot harder, so it defines its own types.

Such types are

handle_t / HANDLE
TCHAR
LPCTSTR
PVOID

Also, I will have to deal with Unicode. I have no clue how to proceed; any help is greatly appreciated.

Re: [UFFI] Dealing with Windows types

bestlem

Sorry I have not got a Windows machine here, but MSDN should provide all
the documentation you need. Some of these types are not translatable and
have to be treated as opaque on the Smalltalk side, i.e. just passed as data.

All are defined in Windows.h and MSDN provides documentation - for types
see
https://msdn.microsoft.com/en-us/library/windows/desktop/aa383751(v=vs.85).aspx

Note that the naming of many of these comes from Windows 2 or earlier, so
it dates from 16-bit programs.

handle_t / HANDLE is a pseudo pointer that Windows has created and
Windows system calls understand. They will dereference it to find
something useful - This is similar to a POSIX file handle that you can
only use as a parameter to system calls.

PVOID is a pointer to void, i.e. just a pointer to something.

LPCTSTR and TCHAR depend on Unicode

In the Windows C API there are usually two forms of many system calls: they
take either 8-bit ANSI strings or 16-bit ones, which MS calls Unicode.
(Unfortunately for us now, MS made the decision for 16 bits when all
Unicode code points fitted into 16 bits, so it is no longer a simple
mapping.) See
https://msdn.microsoft.com/en-us/library/vs/alm/dd183415(v=vs.85).aspx
and https://msdn.microsoft.com/en-us/library/2dax2h36.aspx
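
To illustrate why 16 bits per character is no longer a simple mapping, here is a minimal sketch in Python (used purely for illustration; it is not part of UFFI or the Windows API). It counts how many 16-bit code units UTF-16 needs for a few characters. The helper name `utf16_units` is made up for this example.

```python
def utf16_units(text):
    """Return the 16-bit code units of `text` in UTF-16 (little-endian)."""
    raw = text.encode("utf-16-le")  # explicit byte order, so no BOM is added
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

# Characters up to U+FFFF take one unit; anything above needs a surrogate pair.
for label, char in [("Latin A", "A"),
                    ("Greek epsilon", "\u0395"),
                    ("emoji U+1F600", "\U0001F600")]:
    units = utf16_units(char)
    print(label, [hex(u) for u in units], len(units), "unit(s)")
```

The last character comes out as two units, so code that assumes "one TCHAR is one character" only works for text that stays at or below U+FFFF.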

I'll assume you want Unicode

TCHAR is then a 16 bit Unicode character
LPCTSTR is a const long pointer to a 16 bit Unicode string

I would suggest finding a book or long tutorial :( and sorry I can't
suggest any as I read this from Petzold books over 25 years ago

Mark


Re: [UFFI] Dealing with Windows types

kilon.alios
Actually that was very helpful and there is no need to apologise at all. I just wanted a push in the right direction; you practically teleported me there.

Don't worry, I am not lazy (not claiming that you imply that I am). I actually enjoy investigating code, so diving into header files is not a problem at all. I don't really want or need Unicode, since Unicode is useful only if you want to type or print in specific languages, but even my native language, Greek, only needs ANSI / ASCII. Not that I ever use Greek in any of my code, mind you; I am 100% English speaking and typing when it comes to coding.

I have some things figured out that I did not mention. For example, I knew that TCHAR is 2 bytes where char is 1 byte because of Unicode support. However, your advice just revealed to me that it was a mistake to compile my C++ code for 64 bit, since that would affect those types as well.

Another question I have is about const char / const long pointer. I know that const here means "not able to change value / content"; is there anything more to it?

Again, thank you very much, things are much clearer now :)

Re: [UFFI] Dealing with Windows types

EstebanLM
In reply to this post by bestlem
Hi,

> On 20 Nov 2016, at 00:52, Mark Bestley <[hidden email]> wrote:
>
> handle_t / HANDLE is a pseudo pointer that Windows has created and Windows system calls understand. They will dereference it to find something useful - This is similar to a POSIX file handle that you can only use as a parameter to system calls.

and these are handled in UFFI by FFIConstantHandle (read the class comment)

>
> PVOID is a pointer to a void ie just a pointer to something
>
> LPCTSTR and TCHAR depend on Unicode

Yep… maybe you want to look at OSWindows from Torsten; he has already aliased many of these types.

Esteban

Re: [UFFI] Dealing with Windows types

kilon.alios
Once more I got lost in the spaghetti called "Pharo code". I am referring to OSWindows.

Ironically it's easier to read Windows header files and disassemble memory than to actually read Pharo code :D

Fortunately void pointers are not an issue, since I already used them on Linux and macOS with UFFI without using FFIConstantHandle.

By the way, the comment in that class wonders whether handles in Linux and macOS are the same. They are not: in Windows a handle is a void pointer, while in Linux and macOS they are just regular integers (int).

So far I have drawn the following conclusions:

Unicode means a 2-byte char, which Windows names wchar_t; the equivalent in Pharo must be long char. It stores Unicode strings, which VS C++ writes as L"my text". The first byte is the hex value of the character itself; the second byte is in most cases 0, except for special characters.

Numbers 0-9 are 48 (hex: 30) to 57 (hex: 39)

Characters A-Z are 65 (hex: 41) to 90 (hex: 5A)

Characters a-z are 97 (hex: 61) to 122 (hex: 7A)

So either I use Unicode as it is, or I treat Unicode as 1-byte chars, chaining 2 bytes together instead of 1 at a time. That will result in the exact same memory values on all OSes (Windows / Linux / macOS).

Either solution could work.
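
The byte pattern described above (character value in the first byte, zero in the second) can be checked with a small sketch. It is written in Python only for illustration, and `widen_ascii` is a name invented here: it appends a zero byte to each ASCII character and compares the result with a real little-endian UTF-16 encoder.

```python
def widen_ascii(text):
    """Append a zero byte to every ASCII byte (the 'chaining' shortcut)."""
    narrow = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII text
    wide = bytearray()
    for byte in narrow:
        wide += bytes((byte, 0))   # low byte first, then the zero high byte
    return bytes(wide)

sample = "Az09"
print(widen_ascii(sample).hex(" "))
print("matches UTF-16LE:", widen_ascii(sample) == sample.encode("utf-16-le"))

# Greek letters have a non-zero second byte, so the shortcut does not cover them.
greek = "\u0395\u03bb"
print(greek.encode("utf-16-le").hex(" "))
```

The shortcut and the real encoding agree for plain ASCII only. Anything outside that range needs a proper UTF-16 encoder.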

Windows PVOID (void pointer) and wchar_t* (pointer to TCHAR [Unicode, 2 bytes]) are 4 bytes, like the equivalent Linux and macOS pointers.
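
One caveat on the sizes: a 4-byte pointer is what a 32-bit build gives you; in a 64-bit build pointers (and therefore PVOID and HANDLE) are 8 bytes, and wchar_t is 2 bytes on Windows but usually 4 on Linux and macOS. A quick way to see what a given process uses is sketched below in Python with the standard `ctypes` and `struct` modules; the numbers it prints describe the interpreter it runs in, not the C++ build.

```python
import ctypes
import struct

pointer_bytes = ctypes.sizeof(ctypes.c_void_p)  # size of void*, i.e. PVOID / HANDLE
int_bytes = ctypes.sizeof(ctypes.c_int)         # size of int, e.g. a POSIX file descriptor
wchar_bytes = ctypes.sizeof(ctypes.c_wchar)     # size of wchar_t on this platform

print("process is", 8 * struct.calcsize("P"), "bit")
print("void*   :", pointer_bytes, "bytes")
print("int     :", int_bytes, "bytes")
print("wchar_t :", wchar_bytes, "bytes")
```

If the Pharo image and the C++ library are built for different word sizes, these values will not line up, so it is worth checking both sides.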

So it seems the differences are not as massive as I feared; apart from Unicode text, the rest appears to be more or less the same.

But I will be sure as soon as I use UFFI to test these assumptions, which I am about to do.

Re: [UFFI] Dealing with Windows types

Ben Coman
On Sun, Nov 20, 2016 at 9:50 PM, Dimitris Chloupis
<[hidden email]> wrote:

> So either I use Unicode as it is or use Unicode as a 1 byte char chaining 2
> bytes together instead of 1 at a time.

> Which will result in exact same
> memory value for all OS (Windows /Linux / Maco)

reality check...
http://utf8everywhere.org/

Sorry I don't remember enough of my Unicode research six months ago to
be more help.
Only enough to know there are traps and hunt down that article.

cheers -ben

Re: [UFFI] Dealing with Windows types

Sven Van Caekenberghe-2

> On 21 Nov 2016, at 10:23, Ben Coman <[hidden email]> wrote:
> reality check...
> http://utf8everywhere.org/
>
> Sorry I don't remember enough of my Unicode research six months ago to
> be more help.
> Only enough to know there are traps and hunt down that article.

First read the general introduction

 https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html

That should be more than enough (we have all the basics covered in Pharo).

If you want more, there is the Pharo Unicode project

 https://medium.com/concerning-pharo/an-implementation-of-unicode-normalization-7c6719068f43

But you most probably don't need that.

I thought that Windows used UTF-16 internally

 https://en.wikipedia.org/wiki/UTF-16#Usage

Sven

Re: [UFFI] Dealing with Windows types

kilon.alios
In reply to this post by Ben Coman
" reality check...
http://utf8everywhere.org/

Sorry I don't remember enough of my Unicode research six months ago to
be more help.
Only enough to know there are traps and hunt down that article.

cheers -ben"

Really .... ???

So bad ???

Oh well, good thing I did not intend to use Unicode. Very eye-opening article, thanks.

"First read the general introduction

 https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html

That should be more than enough (we got all basic covered in Pharo).

If you want more, there is the Pharo Unicode project

 https://medium.com/concerning-pharo/an-implementation-of-unicode-normalization-7c6719068f43

But you most probably don't need that.

I thought that Windows used UTF-16 internally

 https://en.wikipedia.org/wiki/UTF-16#Usage"

Yes, Windoom uses UTF-16. I am amazed how the coding world managed to take another simple problem and make it so complex.

In any case, I don't need Unicode. I made the decision to just chain string characters together in 2-byte Unicode arrays; I won't be converting from or to Unicode anyway. For the usual English characters the second byte of a Unicode character array on Win 10 is always 00, which is a waste of memory anyway.

Unicode may come in handy when I have to deal with translating the game to other languages, but for now this is a very low priority.

As always I love your Medium posts: clear, simple and to the point. Definitely bookmarked for future reference. Thanks
Re: [UFFI] Dealing with Windows types

Sven Van Caekenberghe-2

> On 21 Nov 2016, at 13:35, Dimitris Chloupis <[hidden email]> wrote:
> In any case , I dont need Unicode . I made the decision to just chain string characters  together in 2 byte unicode arrays , I wont be converting from or to Unicode anyway. For usual English characters the second byte of a Unicode character array on Win 10 is always 00 which is a waste of memory anyway.  

You might as well use the real thing (and then you're good for the future too):

ZnUTF16Encoder new encodeString: 'Ελλάδα'. "#[3 149 3 187 3 187 3 172 3 180 3 177]"
ZnUTF16Encoder new encodeString: 'Greece'. "#[0 71 0 114 0 101 0 101 0 99 0 101]"

ZnUTF16Encoder new decodeBytes: #[3 149 3 187 3 187 3 172 3 180 3 177]. "'Ελλάδα'"
ZnUTF16Encoder new decodeBytes: #[0 71 0 114 0 101 0 101 0 99 0 101]. "'Greece'"
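
One detail worth checking before sharing such bytes with C++: the byte arrays above have the high byte first (big-endian), whereas a wchar_t string in memory on Windows has the low byte first. The sketch below uses Python's built-in codecs, purely for illustration, to reproduce the same bytes and show the little-endian counterpart.

```python
greece = "Greece"
hellas = "\u0395\u03bb\u03bb\u03ac\u03b4\u03b1"  # the Greek word from the examples above

big_endian = greece.encode("utf-16-be")
little_endian = greece.encode("utf-16-le")

print(list(big_endian))                  # high byte first, as in the byte arrays above
print(list(little_endian))               # low byte first, as Windows stores wchar_t
print(list(hellas.encode("utf-16-be")))

# Decoding has to use the same byte order that was used for encoding.
print(little_endian.decode("utf-16-le"))
```

Whichever order is chosen, both the Pharo side and the C++ side need to agree on it.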

Good luck with your coding.

Sven



Re: [UFFI] Dealing with Windows types

kilon.alios
Ah perfect, easy enough. Thank you Sven, I learned a ton today. I will definitely keep it in mind. I can have a mix of non-Unicode and Unicode data, so it's not a problem per se.

For now I am focused on creating an RPC, which means I will pass strings that will be interpreted by C++ and translated to Unreal API calls. In addition, I will be passing byte data for regular data: player lives, game level, ammo, time left, etc.

A part of the shared memory could be Unicode for international text that will appear in the game. My problem was that Windows uses Unicode everywhere, and it also uses data types that are double in size for C++ just to accommodate Unicode. That creates incompatibilities for the memory model I am creating, which I want to be the same across Windows, Linux and macOS.

Basically the entire shared memory model will be one huge byte array shared between Pharo and C++ separated in 3 sections:

Section 1 - Header: this is where global data / general information is stored [standard size]
Section 2 - Commands: strings and data to be interpreted by C++ as Unreal API commands, 1 command at a time [standard size]
Section 3 - Live data: this is data to be stored with the game that describes the live state of the game itself, so the player can carry on from where they exited the game without any need to save.
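
The three-section layout can be sketched roughly as follows. This is only an illustration in Python, with a plain `bytearray` standing in for the real shared memory; the section sizes, the length-prefixed command format, the field list and the command text are all assumptions made up for the example, not something CPPBridge defines.

```python
import struct

# Hypothetical section sizes (the post only says header and commands are fixed-size).
HEADER_SIZE = 64
COMMAND_SIZE = 256
LIVE_SIZE = 1024

shared = bytearray(HEADER_SIZE + COMMAND_SIZE + LIVE_SIZE)  # stand-in for the mapped region
view = memoryview(shared)

header = view[:HEADER_SIZE]
command = view[HEADER_SIZE:HEADER_SIZE + COMMAND_SIZE]
live = view[HEADER_SIZE + COMMAND_SIZE:]

def write_command(text):
    """Store one ASCII command as a length-prefixed string in the command section."""
    data = text.encode("ascii")
    if len(data) > COMMAND_SIZE - 2:
        raise ValueError("command too long for the command section")
    struct.pack_into("<H", command, 0, len(data))  # 2-byte little-endian length
    command[2:2 + len(data)] = data

def read_command():
    """Read back the command currently stored in the command section."""
    (length,) = struct.unpack_from("<H", command, 0)
    return bytes(command[2:2 + length]).decode("ascii")

# Live data with explicit widths and byte order, so every OS sees the same bytes.
struct.pack_into("<BHI", live, 0, 3, 7, 120)  # lives, level, seconds left (made-up fields)

write_command("SpawnActor Enemy")  # hypothetical command text
print(read_command())
print(struct.unpack_from("<BHI", live, 0))
```

Fixing the width and byte order of every field explicitly (the `<` prefix here) is what keeps the layout identical on Windows, Linux and macOS, independent of each platform's native type sizes.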

