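Hi,

How can I specify the character encoding when opening a readStream on a FileReference?

I found this, which works:

    | readStream fileContent |
    readStream := (File named: aFileName) openForRead.
    fileContent := ZnCharacterReadStream on: readStream encoding: encoding.
    fileContent upToEnd asString.

But if I try to do the same with a readStream from a FileReference:

    | readStream fileContent |
    readStream := aFileName asFileReference readStream.
    fileContent := ZnCharacterReadStream on: readStream encoding: encoding.
    fileContent upToEnd asString.

I get an error, SmallInteger DNU #asciiValue.

This is because in the first snippet we create a binary file stream, whereas readStream on a FileReference returns a MultiByteFileStream.

How can I use ZnEncoder for a readStream from a FileReference?

(And is it on purpose that the two methods (openForRead/readStream) return different kinds of streams?)

nicolai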
Hi Nicolai,
The FileSystem API is a bit inconsistent, yes.

This is how you can use it:

    "Write: switch the stream to binary, then stack a Zn character stream on top."
    (FileLocator temp / 'foo.txt') writeStreamDo: [ :out |
        out binary.
        (ZnCharacterWriteStream on: out encoding: #utf8) << 'élève' ].

    "Read: same idea, switching the stream to binary first."
    (FileLocator temp / 'foo.txt') readStreamDo: [ :in |
        in binary.
        (ZnCharacterReadStream on: in encoding: #utf8) upToEnd ].

    "Read: ask for a binary stream directly."
    (FileLocator temp / 'foo.txt') binaryReadStreamDo: [ :in |
        (ZnCharacterReadStream on: in encoding: #utf8) upToEnd ].

There is no #binaryWriteStreamDo:

The API around File is more correct, IMHO.

Does this help? What exactly is your question?

Sven

> On 4 Feb 2017, at 12:09, Nicolai Hess <[hidden email]> wrote:
>
> Hi
> How can I specify the character encoding when opening a readStream on a FileReference.
>
> I found this, that works:
>
> | readStream fileContent |
> readStream := (File named: aFileName) openForRead.
> fileContent := ZnCharacterReadStream on: readStream encoding: encoding.
> fileContent upToEnd asString.
>
> But if I try to do the same with a readStream from a FileReference
>
> | readStream fileContent |
> readStream := aFileName asFileReference readStream.
> fileContent := ZnCharacterReadStream on: readStream encoding: encoding.
> fileContent upToEnd asString.
>
> I get an error SmallInteger DNU #asciiValue,
>
> this is because, in the first method, we create a binary filestream, and if we
> use readStream from a FileReference, the stream is a MultiByteFileStream.
>
> How can I use ZnEncoder for a readStream from a FileReference?
>
> (and is it on purpose that both readStream methods (openForRead/readStream)
> return different kinds of binary streams?)
>
> nicolai
2017-02-04 12:49 GMT+01:00 Sven Van Caekenberghe <[hidden email]>:

> Does this help ?

Yes, thanks for the fast response.

> What exactly is your question ?

I am looking at the issues with FileList. Some parts don't work anymore (see FileList>>#contents, which calls some unimplemented methods), and it uses TextConverter and some parts of the older File API.
It looks like most other (newer) parts use ZnCharacterReadStream for encoding, but I couldn't find a way to use it together with FileReferences (most parts of FileList already operate with the newer FileSystem API).
> On 4 Feb 2017, at 13:01, Nicolai Hess <[hidden email]> wrote:
>
> It looks like most other (newer) parts are using ZnCharacterReadStream for encoding, but I couldn't find a way to use it together with FileReferences (most parts of the FileList already operate with the newer FileSystem API).

OK.

From my standpoint and understanding, I would always use binary streams with explicit Zn converters. This combination is much easier to understand and better implemented, with more features.

If you encounter any problem or have any questions, I will gladly try to help you.

Sven
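A minimal in-memory sketch of the combination recommended above (a binary stream with an explicit Zn character stream stacked on top), assuming a Pharo image with Zinc loaded. The variable names are illustrative only, and a ByteArray stands in for the file so the snippet runs without touching the disk:

```smalltalk
| bytes text |
"Encode: stack a character write stream on top of a binary (byte) stream."
bytes := ByteArray streamContents: [ :out |
    (ZnCharacterWriteStream on: out encoding: #utf8) nextPutAll: 'élève' ].

"Decode: stack a character read stream on top of a binary read stream."
text := (ZnCharacterReadStream on: bytes readStream encoding: #utf8) upToEnd.
```

The same two expressions should work unchanged when the underlying stream comes from `File named:` or `binaryReadStreamDo:`, because the Zn streams only need a source or sink of bytes.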
> From my standpoint and understanding, I would always use binary streams
> with explicit Zn converters, this combination is much easier to
> understand and better implemented with more features.

Sven,

I really think that we should clean up and improve this side.
What would be a roadmap?

- Improve the FileSystem API to call Zn
- Deprecate the other users? (What are they?)
- What would be the next steps?

Stef
On 04/02/2017 at 19:09, stepharong wrote:
> What would be a roadmap?
>
> - Improve filesystem API to call Zn
> - Deprecated other users? (what are they?)
> - what would be the next steps?

Have one mode that autoselects another encoding (TZ aware of course) if UTF-8 fails... I run into that regularly with a mix of UTF-8 / Latin-1 files, e.g. C files with French accented comments in them.

Thierry
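The fallback idea can be sketched as follows. This is only an illustration, not an existing Zn mode: it assumes that the UTF-8 decoder is in its strict default configuration and signals a subclass of ZnCharacterEncodingError on malformed input, and it hard-codes Latin-1 as the second choice. The bytes below are 'élève' encoded as Latin-1, which is not valid UTF-8:

```smalltalk
| bytes text |
"'élève' in Latin-1: 233 = é, 232 = è. Byte 233 starts a multi-byte UTF-8
 sequence, but 108 is not a continuation byte, so UTF-8 decoding fails."
bytes := #[233 108 232 118 101].

"Try UTF-8 first; on a decoding error, retry the same bytes as Latin-1."
text := [ (ZnCharacterEncoder newForEncoding: #utf8) decodeBytes: bytes ]
    on: ZnCharacterEncodingError
    do: [ :error | (ZnCharacterEncoder newForEncoding: #latin1) decodeBytes: bytes ].
```

Note that this reads the whole content into memory before decoding, and that a Latin-1 decoder accepts any byte sequence, so the fallback never fails even when the file is in some third encoding.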
In reply to this post by stepharong
> On 4 Feb 2017, at 19:09, stepharong <[hidden email]> wrote:
>
> I really think that we should clean and improve on this side.
> What would be a roadmap?
>
> - Improve filesystem API to call Zn
> - Deprecated other users? (what are they?)
> - what would be the next steps?

Guillermo's new File class with its simple binary streams can be perfectly combined (stacked) with Zn character encoding streams. A first step would be to make FileSystem return/produce those stacked streams.

I even believe there is a prototype integrating this.

I believe you are one (the ?) author of Nile, so you know very well how complex stream users are - that is the challenge. Authors of parsers or protocols that depend on streams should write to minimal APIs, only using what they really need.

Sven
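To illustrate what writing to a minimal stream API can look like, here is a small sketch. The countLines block is a hypothetical helper, not part of any library. It relies only on #atEnd and #next, so it accepts both a plain in-memory character stream and a Zn character stream stacked on a binary stream:

```smalltalk
| countLines fromString fromBytes |
"Hypothetical helper: counts line feeds using only #atEnd and #next."
countLines := [ :stream | | count |
    count := 0.
    [ stream atEnd ] whileFalse: [
        stream next = Character lf ifTrue: [ count := count + 1 ] ].
    count ].

"A plain in-memory character stream over the four characters a, LF, b, LF."
fromString := countLines value:
    (String with: $a with: Character lf with: $b with: Character lf) readStream.

"The same four characters as UTF-8 bytes, read through a stacked Zn stream."
fromBytes := countLines value:
    (ZnCharacterReadStream on: #[97 10 98 10] readStream encoding: #utf8).
```

Because the helper never asks for positioning, #skip: or #upTo:, swapping the underlying stream implementation does not affect it.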
In reply to this post by stepharong
2017-02-04 19:09 GMT+01:00 stepharong <[hidden email]>:

> I really think that we should clean and improve on this side.

Yes, we need to completely remove the old file code.
Funny that the file stream we use (MultiByteFileStream) is in the package 'Files-Deprecated'.

The funny thing now is the deprecated package tag.
In reply to this post by Sven Van Caekenberghe-2
> A first step would be to make FileSystem return/produce those stacked streams.
>
> I even believe there is a prototype integrating this.

Sven,

My point is that if we want to make progress on this front, we should improve slowly. Once we have the new version in place, we can plan the migration. Yes, it can be tedious and boring, but we can do it if we have a plan.

Stef
In reply to this post by Denis Kudriashov
I will talk to Guille about his File implementation and we can see what we can do.
> On 4 Feb 2017, at 21:14, stepharong <[hidden email]> wrote:
>
> I will talk to Guille about his File implementation and we can see what we can do.

Yes, that is step 1. Here is the issue I was talking about:

https://pharo.fogbugz.com/f/cases/18414/Change-usages-of-StandardFileStream-and-MultiByteFileStream-to-File-decorators
In reply to this post by Sven Van Caekenberghe-2
2017-02-04 13:40 GMT+01:00 Sven Van Caekenberghe <[hidden email]>:

> If you encounter any problem or have any questions, I will gladly try to help you.

:-)
Is there a way to list all supported encodings, similar to TextConverter class >> #allEncodingNames?

I only found

    ZnSimplifiedByteEncoder class >> #knownEncodingIdentifiers

and

    ZnByteEncoder class >> #knownEncodingIdentifiers

and nothing for all supported utf-8/16 ... encodings.

nicolai
> On 6 Feb 2017, at 22:33, Nicolai Hess <[hidden email]> wrote:
>
> Is there a way to list all supported encodings, similar to TextConverter class >> #allEncodingNames ?
> I only found
> ZnSimplifiedByteEncoder class >> #knownEncodingIdentifiers
> and
> ZnByteEncoder class >> #knownEncodingIdentifiers
> and nothing for all supported utf-8/16 ... encodings.

Ah, yes, you are right. I should add a #knownEncodingIdentifiers at the level of ZnCharacterEncoder that returns the union of all of them. I will add that tomorrow.
In reply to this post by Nicolai Hess-3-2
OK, I committed the following in Zn #bleedingEdge:
===
Name: Zinc-Character-Encoding-Core-SvenVanCaekenberghe.46
Author: SvenVanCaekenberghe
Time: 7 February 2017, 11:07:31.306364 am
UUID: a2928299-1dda-4fdf-b15e-e5d7bed2373e
Ancestors: Zinc-Character-Encoding-Core-SvenVanCaekenberghe.45

Add ZnCharacterEncoder class>>#knownEncodingIdentifiers to return the collection of all known encoding identifiers in the system - thx Nicolai Hess for the request

Make sure #null is resolved to ZnNullEncoder

Add ZnCharacterEncoderTests>>#testKnownEncodingIdentifiers
===
Name: Zinc-Character-Encoding-Tests-SvenVanCaekenberghe.28
Author: SvenVanCaekenberghe
Time: 7 February 2017, 11:07:50.782471 am
UUID: 938d1683-b8c4-4861-b1e1-f876b57405ef
Ancestors: Zinc-Character-Encoding-Tests-SvenVanCaekenberghe.27

Add ZnCharacterEncoder class>>#knownEncodingIdentifiers to return the collection of all known encoding identifiers in the system - thx Nicolai Hess for the request

Make sure #null is resolved to ZnNullEncoder

Add ZnCharacterEncoderTests>>#testKnownEncodingIdentifiers
===

Thanks again for the feedback, this is a useful addition.
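A short usage sketch for the new method, assuming an image that has the Zinc-Character-Encoding-Core version from the commit above (or later) loaded. The sorting and the Transcript output are illustrative choices, not part of the commit:

```smalltalk
"Print every encoding identifier known to Zn, one per line, in sorted order."
ZnCharacterEncoder knownEncodingIdentifiers asSortedCollection
    do: [ :each | Transcript show: each; cr ].
```

Any identifier printed this way should be usable as the encoding: argument of ZnCharacterReadStream and ZnCharacterWriteStream, which gives a FileList-style tool a replacement for TextConverter class >> #allEncodingNames.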