All,

is there any reason why MemoryFileSystemFile>>#readStream forces its content to a String (#asString)?

readStream
	^ ReadStream on: self bytes asString from: 1 to: size

I'm parsing XML files from an in-memory ZIP archive and had real problems with non-ASCII characters. It took me a while to figure out that reading from the in-memory archive returns a String. This prevents the XML parser from doing PI-based decoding (UTF-8 in this case).

Just as a side note: although the GT-Spotter/XML integration relies on FileReference, it assumes that the file is in the DiskFilesystem (some methods only pass a Path). Is this intentional? If not, I'd try to fix it along the way.

Thanks,

Udo
> On 29 Feb 2016, at 20:27, Udo Schneider <[hidden email]> wrote:
>
> is there any reason why MemoryFileSystemFile>>#readStream forces its content to a String (#asString)?

I think it is not good to do this; better to stick with bytes.

Primitive streams should be binary only. Interpreting them as characters is easily done by wrapping a ZnCharacter[Read|Write]Stream around them.

But this whole situation is a mess.
We introduced #binaryReadStream and friends as a workaround in Pharo 4. The reason we didn't fix the problem properly is that we've been waiting for the introduction of Xtreams, in the hope of getting a cleaner stream API.
Max

> On 29 Feb 2016, at 23:06, Sven Van Caekenberghe <[hidden email]> wrote:
>
> I think it is not good to do this, better stick with bytes.
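The two suggestions above (read the bytes with #binaryReadStream, then decode them by wrapping a ZnCharacterReadStream) might be combined roughly as follows. This is only a sketch: it assumes Pharo 4 or later with Zinc in the image, it assumes #binaryWriteStream is among the "friends" of #binaryReadStream, and the file name and XML content are made up for illustration.

```smalltalk
"Sketch only; file name and content are hypothetical."
| file content out in text |
file := FileSystem memory root / 'sample.xml'.

"Build a string containing one non-ASCII character (e with diaeresis, code point 235)."
content := '<name>Zo', (String with: (Character value: 235)), '</name>'.

"Store it as UTF-8 bytes, the way an archive member would hold it."
out := file binaryWriteStream.
[ out nextPutAll: (ZnUTF8Encoder new encodeString: content) ]
	ensure: [ out close ].

"Read the raw bytes back and decode them explicitly."
in := file binaryReadStream.
text := [ (ZnCharacterReadStream on: in encoding: 'utf8') upToEnd ]
	ensure: [ in close ].
```

With this arrangement the decoding step is explicit, so a consumer such as an XML parser could instead be handed the binary stream and pick the encoding itself.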
Hi Sven,

I would like to get Xtreams (with a slightly modified API: no ++ and --) into Pharo 6.0, so I hope that we will be able to clean up the stream part of Pharo.

Stef

> On 29/2/16 23:06, Sven Van Caekenberghe wrote:
>
> But this whole situation is a mess.
> On 01 Mar 2016, at 07:07, stepharo <[hidden email]> wrote:
>
> I would like to get Xtreams (with a slightly modified API - no ++ and --) in Pharo 6.0.

Yes, that would be a good idea.

> So I hope that we will be able to clean the stream part of Pharo.

But the problem is the users of streams. The stream API is too wide: people expect hundreds of methods to be there, mixing characters and bytes, encodings, line-end conventions, infinite buffering, arbitrary positioning, and so on. Many of these characteristics can be elegantly composed, just when you need them.

Right now we have some cool streams in the image, with minimal APIs, each targeted at just one function, like Guile's new file streams, the Zn streams, and the Zdc socket streams. We just have to use them.
On 29/02/16 23:06, Sven Van Caekenberghe wrote:
> I think it is not good to do this, better stick with bytes.

Should I fix it and submit a slice, or use a workaround in my code?

CU,

Udo
> On 01 Mar 2016, at 20:51, Udo Schneider <[hidden email]> wrote:
>
> Should I fix it and submit a slice or use a workaround in my code?

I don't think it makes sense to change #readStream. I do think, though, that the tools would profit from improved XML handling.

Max
Using #asString there is wrong, IMHO. If the memory stream is on bytes, it should return a read stream that reads bytes; if it is on characters, the read stream should return characters.
> On 01 Mar 2016, at 21:13, Max Leske <[hidden email]> wrote:
>
> I don't think it makes sense to change #readStream. I do think though, that the tools would profit from an improved XML handling.
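To illustrate why #asString gets in the way, here is a small playground sketch of what happens to a UTF-8 encoded non-ASCII character. It assumes a Pharo image with Zinc (for ZnUTF8Encoder); the variable names are made up for illustration.

```smalltalk
"Sketch only: e with diaeresis (code point 235) takes two bytes in UTF-8."
| bytes fromBytes fromString |
bytes := ZnUTF8Encoder new encodeString: (String with: (Character value: 235)).

"A stream on the bytes yields the first raw byte, an Integer."
fromBytes := (ReadStream on: bytes) next.

"A stream on 'bytes asString' yields a Character built from that single byte,
 so the two-byte sequence is no longer seen as one character."
fromString := (ReadStream on: bytes asString) next.
```

Here fromBytes is the integer 195, while fromString is the character with code point 195, not the original character 235. A consumer that receives the String can no longer tell that these were undecoded UTF-8 bytes.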
> Right now we have some cool streams in the image, with minimal APIs, targeted at just one function. Like Guile's new file streams, the Zn streams, Zdc socket streams, we just have to use them.

So what could be a plan? Where should we spend energy?

Stef