Some Win32 ClipboardInterpreter still use squeakToMac, why???


Some Win32 ClipboardInterpreter still use squeakToMac, why???

Nicolas Cellier
It seems to me that the Clipboard primitives explicitly use UTF-8 encoding on Win32 platforms.

See for example https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/aed5e3391301011cc6b9ee6a353ee563f4ab6dbd/platforms/win32/vm/sqWin32Window.c

/* Convert data to Unicode UTF16. */
MultiByteToWideChar( CP_UTF8, 0, cvt, -1, out, wcharsNeeded );

/* Send the Unicode text to the clipboard. */
EmptyClipboard();
SetClipboardData(CF_UNICODETEXT, h);

and:

/* Get clipboard data in Unicode format */
h = GetClipboardData(CF_UNICODETEXT);
src = GlobalLock(h);

/* How many bytes do we need to store the UTF8 representation? */
bytesNeeded = WideCharToMultiByte(CP_UTF8, 0, src, -1,
NULL, 0, NULL, NULL );

/* Convert Unicode text to UTF8. */
cvt = tmp = malloc(bytesNeeded);
WideCharToMultiByte(CP_UTF8, 0, src, -1, tmp, bytesNeeded, NULL, NULL);

So it seems to me that:
1) all the squeakToMac sends found in various ClipboardInterpreter subclasses (the Win32 ones at least) are completely obsolete
2) all the exotic ClipboardInterpreter subclasses, except UTF8ClipboardInterpreter, are themselves obsolete and could simply be withdrawn from service

Did I miss something, or can I use the high pressure cleaner in this area?
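
To make the concern concrete, here is a minimal sketch of the data flow described above, written in Python rather than in VM or image code. The names to_clipboard, from_clipboard and is_valid_utf8 are hypothetical helpers, and Python's "mac_roman" codec is only an assumed stand-in for what squeakToMac produces. The point is that a primitive expecting UTF-8 cannot decode MacRoman bytes once non-ASCII characters are involved.

```python
# Sketch only: Python codecs stand in for the VM primitive and for squeakToMac.

def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def to_clipboard(utf8_from_image: bytes) -> bytes:
    """Mimic the write path: UTF-8 in, UTF-16 (as for CF_UNICODETEXT) out."""
    return utf8_from_image.decode("utf-8").encode("utf-16-le")

def from_clipboard(utf16_on_clipboard: bytes) -> bytes:
    """Mimic the read path: UTF-16 in, UTF-8 handed back to the image."""
    return utf16_on_clipboard.decode("utf-16-le").encode("utf-8")

text = "café"
utf8_bytes = text.encode("utf-8")      # what the primitive expects to receive
mac_bytes = text.encode("mac_roman")   # assumed stand-in for a squeakToMac result

# The UTF-8 path survives a full write/read cycle unchanged.
round_trip = from_clipboard(to_clipboard(utf8_bytes)).decode("utf-8")

if __name__ == "__main__":
    print("round trip intact:", round_trip == text)
    print("UTF-8 bytes accepted:", is_valid_utf8(utf8_bytes))
    print("MacRoman bytes accepted:", is_valid_utf8(mac_bytes))
```

Pure ASCII text would pass through either way, which may be why the stale conversions went unnoticed.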



Re: Some Win32 ClipboardInterpreter still use squeakToMac, why???

Tobias Pape

> On 11.06.2019, at 18:22, Nicolas Cellier <[hidden email]> wrote:
>
> It seems to me that the Clipboard primitives explicitly use UTF-8 encoding on Win32 platforms.
>
> See for example https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/aed5e3391301011cc6b9ee6a353ee563f4ab6dbd/platforms/win32/vm/sqWin32Window.c
>
> /* Convert data to Unicode UTF16. */
> MultiByteToWideChar( CP_UTF8, 0, cvt, -1, out, wcharsNeeded );
>
> /* Send the Unicode text to the clipboard. */
> EmptyClipboard();
> SetClipboardData(CF_UNICODETEXT, h);
>
> and:
>
> /* Get clipboard data in Unicode format */
> h = GetClipboardData(CF_UNICODETEXT);
> src = GlobalLock(h);
>
> /* How many bytes do we need to store the UTF8 representation? */
> bytesNeeded = WideCharToMultiByte(CP_UTF8, 0, src, -1,
> NULL, 0, NULL, NULL );
>
> /* Convert Unicode text to UTF8. */
> cvt = tmp = malloc(bytesNeeded);
> WideCharToMultiByte(CP_UTF8, 0, src, -1, tmp, bytesNeeded, NULL, NULL);
>
> So it seems to me that:
> 1) all the squeakToMac sends found in various ClipboardInterpreter subclasses (the Win32 ones at least) are completely obsolete
> 2) all the exotic ClipboardInterpreter subclasses, except UTF8ClipboardInterpreter, are themselves obsolete and could simply be withdrawn from service
>
> Did I miss something, or can I use the high pressure cleaner in this area?
>

Powerwash all the things!
Let's have UTF-8 for everything external (well, except CJK-locales object, but there we have the leading-char thing anyway, right?)
        -t



Re: Some Win32 ClipboardInterpreter still use squeakToMac, why???

timrowledge


> On 2019-06-11, at 10:06 AM, Tobias Pape <[hidden email]> wrote:
>>
>
> Powerwash all the things!

I like the general approach of cleaning things up but do remember to test that the older images are ok afterwards. Eliot has nicely explained the requirement many times in the past.


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
"Bother" said Pooh, as the IRS kicked his door in.




Re: Some Win32 ClipboardInterpreter still use squeakToMac, why???

Tobias Pape

> On 11.06.2019, at 20:07, tim Rowledge <[hidden email]> wrote:
>
>
>
>> On 2019-06-11, at 10:06 AM, Tobias Pape <[hidden email]> wrote:
>>>
>>
>> Powerwash all the things!
>
> I like the general approach of cleaning things up but do remember to test that the older images are ok afterwards. Eliot has nicely explained the requirement many times in the past.
>
That's true and I am in no way opposed to that.

However, with (a) the change from the "traditional" Squeak encoding (i.e., MacRoman) to Latin-1 (a.k.a. ISO-8859-1, hence "squeakToIso") and (b) the later "rebrand" of bytestring-is-latin1/widestring-is-utf32 to "just" Unicode[1], step (a) got lost on many people.

_If_ old images did not work with the changes Nicolas proposes, they would have been hosed long ago anyway.

Best regards
        -Tobias



[1]: This is the most amazing encoding "hack" I have seen, and I think that's actually why Unicode for code points <256 is the way it is: Latin-1 strings are just that, valid Unicode. I know it was not done _because_ of Squeak, but Andreas obviously saw the chance and took it.
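
The footnote's claim can be checked mechanically. The following is a small sketch in Python (not Squeak code; decode_byte and bytes_differing_from_unicode are hypothetical helper names) that compares each byte value with the code point its decoding yields. Under Latin-1 the two always coincide, which is why a Latin-1 byte string can be read directly as Unicode; under MacRoman, taken here as the assumed "traditional" encoding, they diverge in the upper half of the byte range.

```python
def decode_byte(value: int, codec: str) -> str:
    """Decode a single byte value with the given codec."""
    return bytes([value]).decode(codec, errors="replace")

def bytes_differing_from_unicode(codec: str) -> list:
    """List byte values whose decoded character is not the same-numbered code point."""
    return [b for b in range(256) if decode_byte(b, codec) != chr(b)]

# Latin-1: byte value N always decodes to code point N.
latin1_diffs = bytes_differing_from_unicode("latin-1")

# MacRoman: agrees with Unicode only where it agrees with ASCII.
macroman_diffs = bytes_differing_from_unicode("mac_roman")

if __name__ == "__main__":
    print("Latin-1 bytes that differ:", len(latin1_diffs))
    print("MacRoman bytes that differ:", len(macroman_diffs))
    print("lowest differing MacRoman byte:", hex(min(macroman_diffs)))
```

This is also why a conversion such as squeakToMac only matters for non-ASCII characters.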




Re: Some Win32 ClipboardInterpreter still use squeakToMac, why???

Yoshiki Ohshima-3

On Tue, Jun 11, 2019 at 11:19 AM Tobias Pape <[hidden email]> wrote:

> [1]: this is the most amazing encoding "hack" I have seen, and I think that's actually why Unicode for codepoints <256 is the way it is: that latin1 strings are just that: valid unicode. I know it was not done _because_ of Squeak, but Andreas obviously saw the chance and took it.


The story was a bit more complicated, but we documented some ideas here:

http://www.vpri.org/pdf/ohshima_c5.pdf

--
-- Yoshiki




Re: Some Win32 ClipboardInterpreter still use squeakToMac, why???

Tobias Pape
Dear Yoshiki

> On 11.06.2019, at 20:23, Yoshiki Ohshima <[hidden email]> wrote:
>
>
> On Tue, Jun 11, 2019 at 11:19 AM Tobias Pape <[hidden email]> wrote:
>
> [1]: this is the most amazing encoding "hack" I have seen, and I think that's actually why Unicode for codepoints <256 is the way it is: that latin1 strings are just that: valid unicode. I know it was not done _because_ of Squeak, but Andreas obviously saw the chance and took it.
>
>
> The story was a bit more complicated, but we documented some ideas here:
>
> http://www.vpri.org/pdf/ohshima_c5.pdf


Thanks for setting me straight!
I didn't know this document existed, will read it. Finally I'll understand some design decisions, eg, regarding TTCFonts :)

Best regards
        -Tobias



Re: Some Win32 ClipboardInterpreter still use squeakToMac, why???

Chris Cunnington-4
How do I load Japanese support into Squeak 5.2? I bounced between the Swiki and SqueakMap without much success.

Chris

> On Jun 11, 2019, at 2:29 PM, Tobias Pape <[hidden email]> wrote:
>
> Dear Yoshiki
>
>> On 11.06.2019, at 20:23, Yoshiki Ohshima <[hidden email]> wrote:
>>
>>
>> On Tue, Jun 11, 2019 at 11:19 AM Tobias Pape <[hidden email]> wrote:
>>
>> [1]: this is the most amazing encoding "hack" I have seen, and I think that's actually why Unicode for codepoints <256 is the way it is: that latin1 strings are just that: valid unicode. I know it was not done _because_ of Squeak, but Andreas obviously saw the chance and took it.
>>
>>
>> The story was a bit more complicated, but we documented some ideas here:
>>
>> http://www.vpri.org/pdf/ohshima_c5.pdf
>
>
> Thanks for setting me straight!
> I didn't know this document existed, will read it. Finally I'll understand some design decisions, eg, regarding TTCFonts :)
>
> Best regards
> -Tobias
>
>