We use VisualWorks 7.7.1, and have a customer that uses Oracle 10 with 'NLS_CHARACTERSET' set to 'AL32UTF8'. When connecting with OracleEXDI, the connection fails with 'Unhandled exception: Column encoding not yet recognized'.

Looking into the system, I can see that OracleConnection class>>initializeEncoderMap does not include AL32UTF8.

Here is what Oracle writes about AL32UTF8:

"The 8-bit encoding of Unicode. It is a variable-width encoding. One Unicode character can be 1 byte, 2 bytes, 3 bytes, or 4 bytes in UTF-8 encoding. Characters from the European scripts are represented in either 1 or 2 bytes. Characters from most Asian scripts are represented in 3 bytes. Supplementary characters are represented in 4 bytes. The Oracle character set that supports UTF-8 is AL32UTF8."
(http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/glossary.htm)

As far as I can understand, AL32UTF8 is Oracle's proper implementation of UTF-8:
http://www.mail-archive.com/perl-unicode@.../msg02239.html

I therefore wonder whether OracleConnection class>>initializeEncoderMap should have the following entry added:

at: 'AL32UTF8' put: #utf_8

Kind regards
Runar
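PS: Spelled out, the change I have in mind would look roughly like the sketch below. The class-variable name (EncoderMap) and the existing 'UTF8' entry are only placeholders, since I am guessing at the exact shape of the shipped method; the last line is the proposed addition.

    OracleConnection class >> initializeEncoderMap
        "Sketch only: the EncoderMap variable name and the 'UTF8' entry stand in
         for whatever the shipped method already contains; the AL32UTF8 line is
         the proposed new entry."
        EncoderMap := Dictionary new.
        EncoderMap
            at: 'UTF8' put: #utf_8;
            at: 'AL32UTF8' put: #utf_8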
On Mar 17, 2011, at 1:28:58 PM, Runar Jordahl wrote:
> We use VisualWorks 7.7.1, and have a customer that uses Oracle 10
> with 'NLS_CHARACTERSET' set to 'AL32UTF8'. When connecting with
> OracleEXDI, the connection fails with 'Unhandled exception: Column
> encoding not yet recognized'.
>
> [...]
>
> I therefore wonder whether OracleConnection
> class>>initializeEncoderMap should have the following entry added:
>
> at: 'AL32UTF8' put: #utf_8

I've run into the same; the encoder map is really outdated. Not to mention it contains incorrect mappings, like using a standard UTF8Encoder for Oracle's "UTF8" (which is correct in *most* cases, though).

IMHO, the whole mess of looking up where the client reads NLS_LANG from to find the correct format to send the data in could/should be replaced by setting the encoding of the connection directly, using OCIEnvNlsCreate(). It's been there since 9.x, and 8.x went out of extended support from Oracle more than 5 years ago...

Not that it solves all problems for us Norwegians: even when you directly set the character set you'll be sending data in to something sensible, you still need to convert decimal points to the one used by the country specified in NLS_LANG (which is not currently done at all by VW, btw). You can get that character for an environment through OCINlsGetInfo, passing OCI_NLS_DECIMAL as the item; I haven't found a way to tell it what to use.

TL;DR: Oracle i18n support in VisualWorks is a real mess, and could be simplified/improved significantly.

Cheers,
Henry

PS. AFAIK, NLS_CHARACTERSET is internal to the server and has no effect from a client's point of view (except being what the client converts to for certain column types). To find the expected encoding of strings you give to the client (if not set specifically as described above), you use NLS_LANG.
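PPS. To make the decimal-point problem concrete, here is a tiny workspace sketch of the kind of substitution the driver (or your application) ends up having to do before handing a numeric literal to Oracle. The $, is just a stand-in for whatever OCINlsGetInfo reports for OCI_NLS_DECIMAL in your environment:

    | oracleDecimal printed |
    oracleDecimal := $,.    "stand-in for the OCI_NLS_DECIMAL value of the session"
    printed := 3.14 printString
            copyReplaceAll: '.'
            with: (String with: oracleDecimal).
    printed    "evaluates to '3,14'"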
FWIW, the thousands separator and decimal point can be accessed for the CLDR locales from the associated NumberPrintPolicy or currency policy.
There is no guarantee that the operating system's value and the Unicode Consortium's value agree on any particular operating system, however. (In fact, I'd be interested in hearing of cases where they disagree.)
Les
On 17.03.2011 17:15, Kooyman, Les wrote:
Haha. These aren't Unicode or OS locale values we are talking about; these are Oracle's. Just like with their encoding names, they felt no need whatsoever to follow any kind of standard. Heck, they even have special Java classes just for mapping between them:
http://www.stanford.edu/dept/itss/docs/oracle/10g/appdev.101/b10971/oracle/i18n/util/LocaleMapper.html

Sorry if I sound harsh; I'm still a bit agitated from trying to remember the details required for writing the last post, and remembering the pain it brought me when trying to understand how it worked.

Cheers,
Henry