File size limit

Re: UTF-8

Chantal Thibodeau
Thanks, I will look at all that this morning
----- Original Message -----
Sent: Wednesday, March 21, 2012 6:51 AM
Subject: Re: UTF-8

If the accents are your only problem, it sounds as if something is translating the text for you.  Make sure you have configured MySQL to accept and store Unicode data.  If not, the database itself will convert the text to a Latin character set and the accents will be lost.  This link might help: http://dev.hubspot.com/bid/7049/MySQL-and-Unicode-Three-Gotchas

Also, since Windows 2000, Windows APIs use UTF-16 (previously UCS-2), so as Todor pointed out you may need to convert your UTF-8 to UTF-16 if you want to call functions that expect Unicode parameters.

Jon




From:        Chantal Thibodeau <[hidden email]>
To:        [hidden email],
Date:        03/20/2012 03:54 PM
Subject:        UTF-8
Sent by:        Using Visual Smalltalk for Windows/Enterprise <[hidden email]>




Hi everybody
 
Has anyone already tried to read UTF-8 strings?

My problem is that we receive UTF-8 XML files, which are then stored in a MySQL database in UTF-8 format.

When I read them back, all the accents are garbled.  I searched the net but only found code for Squeak or other dialects, and I am unable to make it work.
 
Thanks
 
 
*** this signature added by listserv *** *** Visit http://www.listserv.dfn.de/archives/vswe-l.html *** *** for archive browsing and VSWE-L membership management ***

Re: UTF-8

Chantal Thibodeau
In reply to this post by Jon Raiford
Thanks to you all, it is finally working. I've got my accents right.


UnicodeString fromUTF8Content: aByteArray        (where aByteArray is the result of my MySQL query)
 
 
 
fromUTF8Content: aByteArrayOrString
 "Answer a new instance of the receiver containing the same characters as the <aByteArrayOrString>
 argument.
 Implementation Note: CP_ACP is the only code page supported by Win95."
 
 | answer answerSize |
 
 aByteArrayOrString isEmpty ifTrue: [^UnicodeString new].
 answer := self new: aByteArrayOrString size * 8.
 
 (answerSize := KernelLibrary
  multiByteToWideCharCp:  65001 "CP_UTF8"
  flags: 0
  lpstr: aByteArrayOrString
  cchstr: aByteArrayOrString size
  lpwstr: answer
  cchwstr: answer basicSize) == 0
  ifTrue: [^KernelLibrary default systemError].
 ^answer copyFrom: 1 to: answerSize
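
For readers who cannot call the Windows API, the same decoding can be sketched in portable Smalltalk. This is only an illustration and not code from this thread: the selector codePointsFromUTF8: is hypothetical, it answers an Array of code-point integers rather than a UnicodeString, it expects a ByteArray (not a String), and it assumes well-formed UTF-8 without validating it.

```smalltalk
codePointsFromUTF8: aByteArray
 "Hypothetical helper, not from the thread. Answer an Array of Unicode
  code points decoded from the UTF-8 bytes in <aByteArray>.
  Assumes well-formed input; performs no validation."

 | in out byte extra codePoint |

 in := ReadStream on: aByteArray.
 out := WriteStream on: (Array new: 0).
 [in atEnd] whileFalse: [
  byte := in next.
  "The lead byte tells how many continuation bytes follow."
  byte < 128
   ifTrue: [extra := 0. codePoint := byte]
   ifFalse: [
    byte < 224
     ifTrue: [extra := 1. codePoint := byte bitAnd: 31]
     ifFalse: [
      byte < 240
       ifTrue: [extra := 2. codePoint := byte bitAnd: 15]
       ifFalse: [extra := 3. codePoint := byte bitAnd: 7]]].
  "Each continuation byte contributes its low 6 bits."
  extra timesRepeat: [
   codePoint := (codePoint bitShift: 6) bitOr: (in next bitAnd: 63)].
  out nextPut: codePoint].
 ^out contents
```

For example, the bytes 99 97 102 195 169 (UTF-8 for "café") would decode to the code points 99 97 102 233, where 233 is U+00E9, the accented "é". Turning those code points into a displayable string is dialect-specific and is left out here.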
 
 

Re: UTF-8

Chantal Thibodeau
In reply to this post by Chantal Thibodeau
Small correction: the sample was truncated.

Replace the last line with:
 
 ^ (UnicodeStringBuffer stringFromUnicode: answer length: answerSize) asString
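
Putting the two posts together, the complete method would presumably read as follows. This is just the posted method with its last line swapped for the correction; every selector comes from the posts and none has been verified independently. The only liberties taken are a shortened method comment and one added inline comment.

```smalltalk
fromUTF8Content: aByteArrayOrString
 "Answer a string containing the characters encoded as UTF-8
  in <aByteArrayOrString>."

 | answer answerSize |

 aByteArrayOrString isEmpty ifTrue: [^UnicodeString new].
 "Deliberately oversized scratch buffer, as in the original post."
 answer := self new: aByteArrayOrString size * 8.

 (answerSize := KernelLibrary
  multiByteToWideCharCp: 65001 "CP_UTF8"
  flags: 0
  lpstr: aByteArrayOrString
  cchstr: aByteArrayOrString size
  lpwstr: answer
  cchwstr: answer basicSize) == 0
  ifTrue: [^KernelLibrary default systemError].
 ^(UnicodeStringBuffer stringFromUnicode: answer length: answerSize) asString
```

It is called as before: UnicodeString fromUTF8Content: aByteArray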