NeoCSV/NeoJSON and encodings


NeoCSV/NeoJSON and encodings

Esteban A. Maringolo
Hi all, Sven,

I would like to know the proper way (steps) to parse a UTF-8 encoded CSV file whose strings will mostly be stored in domain objects' instVars, then mapped back to JSON and sent over the wire by means of a Seaside RESTful filter.

I haven't specified any encoding information for either the input or the output, and I'm not seeing the right characters in the inspectors (I expected that), nor in the JSON output or the Seaside HTML output.

The Zinc server adaptor is using its default codec, which is UTF-8.






Re: NeoCSV/NeoJSON and encodings

Sven Van Caekenberghe-2
Hello Esteban,

On 11 Mar 2013, at 03:17, Esteban A. Maringolo <[hidden email]> wrote:

> Hi all, Sven,
>
> I would like to know the proper way (steps) to parse a UTF-8 encoded CSV
> file whose strings will mostly be stored in domain objects' instVars, then
> mapped back to JSON and sent over the wire by means of a Seaside RESTful
> filter.
>
> I haven't specified any encoding information for either the input or the
> output, and I'm not seeing the right characters in the inspectors (I expected
> that), nor in the JSON output or the Seaside HTML output.
>
> The Zinc server adaptor is using its default codec, which is UTF-8.

Both NeoCSV and NeoJSON were written to be encoding agnostic, i.e. they work on character streams that you provide. The encoding/decoding is up to you, or up to whatever you use to instantiate the character streams.

Here is a quick example (Pharo VM on Mac OS X, #20587, standard NeoCSV release).

'foo.csv' asFileReference writeStreamDo: [ :out |
        (NeoCSVWriter on: out)
                nextPut: #( 1 'élève en Français' ) ].

'foo.csv' asFileReference readStreamDo: [ :in |
        (NeoCSVReader on: in)
                next ].

#('1' 'élève en Français')

$ cat foo.csv
"1","élève en Français"

$ file foo.csv
foo.csv: UTF-8 Unicode text, with CRLF line terminators

The above code uses whatever FileReference offers, namely UTF-8 encoded character streams.
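If you want to control the encoding explicitly, you can wrap a binary stream in one of Zinc's character streams and hand that to NeoCSV. A minimal sketch, assuming FileReference's binaryWriteStreamDo:/binaryReadStreamDo: (available in more recent Pharo versions) and Zinc's ZnCharacterWriteStream/ZnCharacterReadStream:

"write through an explicit UTF-8 encoder"
'foo.csv' asFileReference binaryWriteStreamDo: [ :out |
        (NeoCSVWriter on: (ZnCharacterWriteStream on: out encoding: 'utf8'))
                nextPut: #( 1 'élève en Français' ) ].

"read back through an explicit UTF-8 decoder"
'foo.csv' asFileReference binaryReadStreamDo: [ :in |
        (NeoCSVReader on: (ZnCharacterReadStream on: in encoding: 'utf8'))
                next ].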

I would suggest that you inspect the contents of the character streams before feeding them to NeoCSV or NeoJSON; a wrong encoding will probably be visible there.

'foo.csv' asFileReference readStreamDo: [ :in | in upToEnd ].

Zinc, both the client and the server, should normally always do the right thing (™): based on the Content-Type, bytes will be converted using the proper encoding.
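For example (a sketch only, assuming NeoJSONWriter's toString: convenience and Zinc's ZnResponse/ZnEntity; the data is made up), you can build the JSON as a plain String and declare the charset in the Content-Type, leaving the byte conversion to Zinc:

| json |
"produce a plain String with NeoJSON"
json := NeoJSONWriter toString: (Dictionary new
        at: #name put: 'élève en Français';
        yourself).
"Zinc encodes the String according to the declared charset"
ZnResponse ok: (ZnEntity
        with: json
        type: (ZnMimeType fromString: 'application/json;charset=utf-8')).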

Regards,

Sven

--
Sven Van Caekenberghe
http://stfx.eu
Smalltalk is the Red Pill



Re: NeoCSV/NeoJSON and encodings

Esteban A. Maringolo
Hi all, Sven,

Sorry I forgot to reply to this yesterday.

It was simpler than anything else: for some reason I was using StandardFileStream to read the file. I switched to FileStream and everything went fine.
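For reference, the difference seems to be that StandardFileStream hands back the bytes undecoded, while FileStream answers a MultiByteFileStream that runs them through a text converter. A minimal sketch of the variant that worked here (file name made up):

| stream records |
stream := FileStream readOnlyFileNamed: 'foo.csv'.
"parse all records from the decoded character stream, then close it"
[ records := (NeoCSVReader on: stream) upToEnd ]
        ensure: [ stream close ].
records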


ps: aString asFileReference writeStreamDo: doesn't work in my 1.4 image.

Re: NeoCSV/NeoJSON and encodings

Sven Van Caekenberghe-2

On 12 Mar 2013, at 20:38, "Esteban A. Maringolo" <[hidden email]> wrote:

> Hi all, Sven,
>
> Sorry I forgot to reply to this yesterday.
>
> It was simpler than anything else: for some reason I was using
> StandardFileStream to read the file. I switched to FileStream and everything
> went fine.

Good!

> ps: aString asFileReference writeStreamDo: doesn't work in my 1.4 image.

;-)

I switched to 2.0 months ago; it is so much nicer.

--
Sven Van Caekenberghe
http://stfx.eu
Smalltalk is the Red Pill