Rick,
You can use VW codepage support for that. These are snippets from my code:
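
convertFromUTF8ToUnicode: aString
    "Decode aString, whose characters hold raw UTF-8 bytes, into a regular string"
    ^(EncodedStream
        on: aString asIntegerArray readStream
        encodedBy: (StreamEncoder new: #'UTF_8')) contents

convertFromUnicodeToUTF8: aString
    "Encode aString as UTF-8 and answer the bytes as a ByteString"
    ^(aString asByteArrayEncoding: #'UTF_8') asByteString
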
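
A rough sketch of a round trip through the two methods above, assuming they are installed on a hypothetical class MyConverter (that name is only for illustration, it is not part of VW). I have not checked this on every VW version:

    | converter utf8 decoded |
    converter := MyConverter new.   "hypothetical class holding both methods"
    "Encode to UTF-8; each character of the result holds one byte"
    utf8 := converter convertFromUnicodeToUTF8: 'Mivšek'.
    "Decode the UTF-8 bytes back into a regular string"
    decoded := converter convertFromUTF8ToUnicode: utf8.
    "Should answer true if the round trip is lossless"
    decoded = 'Mivšek'

Note that the two-bytes-per-character data in your example looks like UTF-16 rather than UTF-8, so those fields would need an encoding other than #'UTF_8'.
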
Rick Flower wrote:
> I need to parse a CSV file that appears to have string fields
> encoded in either UTF-8 or perhaps Unicode (each character has two bytes
> associated with it, such as "@E^@a^@g^@l^@e^@" = "Eagle"). Anyway, I need
> a way to parse this file not only to hopefully build a collection of
> rows, but perhaps be able to convert these strings into something useful
> to VW (and for use w/ Seaside). I found (on Google) a link to Thomas
> Gagne's eFinNet-DataFile, but when I tried using it I got a DNU for a
> tuple issue on my dataclass. Anyway, before I start tracking down that
> issue I thought I'd get the collective wisdom on CSV parsing (and
> UTF-8/Unicode handling as well) from you all. As a side note, I'll
> pester Thomas about his package as well since I'm not sure how old it is
> or anything.
>
> Thanks again!
>
>
--
Janko Mivšek
IT Consultant
EraNova d.o.o.
Ljubljana, Slovenia
www.eranova.si
tel: 01 514 22 55
fax: 01 514 22 56
mobile: 031 674 565