Reading a text file line by line

Dirk Olmes
Hi,

I'm trying to get started with Pharo doing something really simple - at
least that's what I thought ... I'm trying to read a text file line by line.

If I use  File named: '/tmp/linex.txt' readStream nextLine I'll get a
debugger telling me that BinaryFileStream does not understand nextLine.

Now I've tried my best to find a stream that may be reading plain text
lines but to no avail ...

Help!

-dirk

Re: Reading a text file line by line

Sven Van Caekenberghe-2
Hi,

> On 2 Oct 2017, at 13:07, Dirk Olmes <[hidden email]> wrote:
>
> Hi,
>
> I'm trying to get started with Pharo doing something really simple - at
> least that's what I thought ... I'm trying to read a text file line by line.
>
> If I use  File named: '/tmp/linex.txt' readStream nextLine I'll get a
> debugger telling me that BinaryFileStream does not understand nextLine.
>
> Now I've tried my best to find a stream that may be reading plain text
> lines but to no avail ...
>
> Help!
>
> -dirk

$ cat > /tmp/lines.txt
one
two
three

(FileLocator temp / 'lines.txt') contents lines.

'/tmp/lines.txt' asFileReference contents lines.

'/tmp/lines.txt' asFileReference readStreamDo: [ :in |
  Array streamContents: [ :out |
    [ in atEnd ] whileFalse: [ out nextPut: in nextLine ] ] ].

(File named: '/tmp/lines.txt') readStreamDo: [ :in |
  | characterStream |
  characterStream := ZnCharacterReadStream on: in.
  Array streamContents: [ :out |
    [ characterStream atEnd ] whileFalse: [ out nextPut: characterStream nextLine ] ] ].

They all return #('one' 'two' 'three').

In the last, more complex example, you first get a binary stream, and a 'line' is a character-based concept. Wrapping the binary stream in a character read stream (which does know about lines) solves the problem.
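
As a rough sketch of the same wrapping idea, the loop below handles each line as it is read instead of collecting the lines into an Array. It only reuses the messages shown above (readStreamDo:, ZnCharacterReadStream on:, atEnd, nextLine); the count variable is made up for illustration, and the sample file /tmp/lines.txt created above is assumed to exist.

```smalltalk
"Count the non-empty lines of the sample file without building an Array."
| count |
count := 0.
(File named: '/tmp/lines.txt') readStreamDo: [ :in |
  | characterStream |
  "Wrap the binary stream so that it understands nextLine."
  characterStream := ZnCharacterReadStream on: in.
  [ characterStream atEnd ] whileFalse: [
    characterStream nextLine isEmpty
      ifFalse: [ count := count + 1 ] ] ].
count
```

For the three-line sample file this should evaluate to 3.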

HTH,

Sven

Re: Reading a text file line by line

Stephane Ducasse-3
Sven I do not see the binary stream. Is it ZnCharacterReadStream?

Stef

On Mon, Oct 2, 2017 at 1:22 PM, Sven Van Caekenberghe <[hidden email]> wrote:

> In the last, more complex example, you first get a binary stream (and a 'line' is a character based concept), so wrapping the binary stream in a character read stream (which does know about lines) solves the problem.

Re: Reading a text file line by line

Sven Van Caekenberghe-2
If you do

  (File named: '/tmp/lines.txt') readStream[Do:]

you seem to get a binary stream (this is the new implementation, I guess). When you go via FileReference you get a character stream (but those are the old ones).

I know, very confusing. We're always in the midst of transitions.
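
The difference between the two kinds of stream can be sketched without touching a file at all. The snippet below is only an illustration: the in-memory ByteArray stands in for the file contents, all variable names are invented, and utf8Encoded is assumed to be available on String as it is in recent Pharo images.

```smalltalk
| bytes binaryStream characterStream firstByte firstLine |
"Encode a two-line string to UTF-8 bytes; this stands in for the file contents."
bytes := ('one' , String lf , 'two') utf8Encoded.

"A binary stream answers bytes (integers), so it has no notion of a line."
binaryStream := bytes readStream.
firstByte := binaryStream next.   "111, the code of $o"

"Wrapping a fresh binary stream gives characters, and therefore lines."
characterStream := ZnCharacterReadStream on: bytes readStream.
firstLine := characterStream nextLine.   "'one'"
```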

> On 2 Oct 2017, at 15:17, Stephane Ducasse <[hidden email]> wrote:
>
> Sven I do not see the binary stream. Is it ZnCharacterReadStream?
>
> Stef

Re: Reading a text file line by line

Stephane Ducasse-3
Yes this is why we should continue to clean and remove cruft. Now I
remember that Guille did that for File.

Stef

On Mon, Oct 2, 2017 at 3:20 PM, Sven Van Caekenberghe <[hidden email]> wrote:

> If you do
>
>   (File named: '/tmp/lines.txt') readStream[Do:]
>
> you seem to get a binary stream (this is the new implementation, I guess). When you go via FileReference you get a character stream (but those are the old ones).
>
> I know, very confusing. We're always in the midst of transitions.

Re: Reading a text file line by line

Guillermo Polito
Yes, in my todo, but changing FileReference like that will break a lot of backwards compatibility :)

On Mon, Oct 2, 2017 at 10:22 AM, Stephane Ducasse <[hidden email]> wrote:
> Yes this is why we should continue to clean and remove cruft. Now I
> remember that guille did that for File.
>
> Stef


--
Guille Polito
Research Engineer

Centre de Recherche en Informatique, Signal et Automatique de Lille
CRIStAL - UMR 9189
French National Center for Scientific Research - http://www.cnrs.fr

Web: http://guillep.github.io
Phone: +33 06 52 70 66 13

Re: Reading a text file line by line

Sven Van Caekenberghe-2


> On 3 Oct 2017, at 10:53, Guillermo Polito <[hidden email]> wrote:
>
> Yes, in my todo, but changing FileReference like that will break a lot of backwards compatibility :)

Yes it will.

I have said this before: the current stream API is much too wide. We need to trim it to something closer to what a stream really is, and stop assuming that a stream always lives on top of a collection.

We have composable streams now, and they work well. But they cannot implement the full API, since they are not streaming over collections. The biggest issue is the positioning messages (like #skip: and #position:), which assume that you know where you are and can move around at will. That is not possible for a real, indefinite stream.
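
To make the positioning point concrete, here is a small sketch of what those messages do on a classic collection-backed stream. The variable names are invented for illustration. A stream wrapped around something like a network connection could not, in general, honor the position: call at the end.

```smalltalk
| stream first third again |
"A stream over an in-memory collection knows its position and can move freely."
stream := ReadStream on: 'abcdef'.
first := stream next.      "$a"
stream skip: 1.            "jump over $b"
third := stream next.      "$c"
stream position: 0.        "rewind to the start"
again := stream next.      "$a again"
```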


Re: Reading a text file line by line

Dirk Olmes
In reply to this post by Sven Van Caekenberghe-2
Hi Sven,

>> I'm trying to get started with Pharo doing something really simple - at
>> least that's what I thought ... I'm trying to read a text file line by line.
>>
>> If I use  File named: '/tmp/linex.txt' readStream nextLine I'll get a
>> debugger telling me that BinaryFileStream does not understand nextLine.
>
> (File named: '/tmp/lines.txt') readStreamDo: [ :in |
>   | characterStream |
>   characterStream := ZnCharacterReadStream on: in.
>   Array streamContents: [ :out |
>     [ characterStream atEnd ] whileFalse: [ out nextPut: characterStream nextLine ] ] ].

Thanks for the hint, Sven. Of all alternatives you gave I like the last
one best as it does not load the entire contents of the (probably large)
text file into memory.

ZnCharacterReadStream belongs to the Zinc classes. I would have expected
a generic character reading stream to be part of the core classes
(whatever that means :-) and not a part of some HTTP component classes.
But that may only be my limited understanding of Pharo so far :-)

-dirk

Re: Reading a text file line by line

Sven Van Caekenberghe-2


> On 4 Oct 2017, at 09:23, Dirk Olmes <[hidden email]> wrote:
>
> Hi Sven,
>
>>> I'm trying to get started with Pharo doing something really simple - at
>>> least that's what I thought ... I'm trying to read a text file line by line.
>>>
>>> If I use  File named: '/tmp/linex.txt' readStream nextLine I'll get a
>>> debugger telling me that BinaryFileStream does not understand nextLine.
>>
>> (File named: '/tmp/lines.txt') readStreamDo: [ :in |
>>  | characterStream |
>>  characterStream := ZnCharacterReadStream on: in.
>>  Array streamContents: [ :out |
>>    [ characterStream atEnd ] whileFalse: [ out nextPut: characterStream nextLine ] ] ].
>
> Thanks for the hint, Sven. Of all alternatives you gave I like the last
> one best as it does not load the entire contents of the (probably large)
> text file into memory.

Yes, that is true. The first ones are more for quick-and-dirty scripting; real streaming is better for production code.
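
One practical benefit of streaming, sketched under the assumption that the sample file /tmp/lines.txt from earlier in the thread exists: the loop can stop asking for lines as soon as it has what it needs. The found variable and the 't' prefix are made up for the example.

```smalltalk
"Answer the first line that starts with 't' and stop reading lines after that."
| found |
found := nil.
(File named: '/tmp/lines.txt') readStreamDo: [ :in |
  | characterStream |
  characterStream := ZnCharacterReadStream on: in.
  [ found isNil and: [ characterStream atEnd not ] ] whileTrue: [
    | line |
    line := characterStream nextLine.
    (line beginsWith: 't') ifTrue: [ found := line ] ] ].
found
```

With the sample file this should answer 'two'; the line 'three' is never requested from the stream.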

> ZnCharacterReadStream belongs to the Zinc classes. I would have expected
> a generic character reading stream to be part of the core classes
> (whatever that means :-) and not a part of some HTTP component classes.
> But that may only be my limited understanding of Pharo so far :-)

Well, the package is 'Zinc-Character-Encoding-Core' not 'Zinc-HTTP'. Character encoding stands on its own and has nothing to do with HTTP. See also this book chapter https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html
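
A tiny sketch of character encoding used on its own, with no HTTP involved. It assumes the convenience messages utf8Encoded and utf8Decoded, which recent Pharo images provide on String and ByteArray; the sample word and variable names are invented.

```smalltalk
| text bytes decoded |
text := 'élève'.
"UTF-8 uses two bytes for each accented letter here, so 5 characters become 7 bytes."
bytes := text utf8Encoded.
"Decoding the bytes gives back an equal string."
decoded := bytes utf8Decoded.
```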


Re: Reading a text file line by line

Dirk Olmes
On 10/04/2017 10:22 AM, Sven Van Caekenberghe wrote:
>> ZnCharacterReadStream belongs to the Zinc classes. I would have expected
>> a generic character reading stream to be part of the core classes
>> (whatever that means :-) and not a part of some HTTP component classes.
>> But that may only be my limited understanding of Pharo so far :-)
>
> Well, the package is 'Zinc-Character-Encoding-Core' not 'Zinc-HTTP'. Character encoding stands on its own and has nothing to do with HTTP. See also this book chapter https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/Zinc-Encoding-Meta/Zinc-Encoding-Meta.html

Understood.

However if I find classes in a package with prefix "Zinc" and I google
for "Zinc Pharo" the first thing that comes up is
http://zn.stfx.eu/zn/index.html - Zinc HTTP Components

Ok I'm really just nitpicking here but IMHO it would be nice to decouple
these very useful classes from the Zinc namespace.

-dirk

Re: Reading a text file line by line

Sven Van Caekenberghe-2
Dirk,

> On 4 Oct 2017, at 15:43, Dirk Olmes <[hidden email]> wrote:
>
> However if I find classes in a package with prefix "Zinc" and I google
> for "Zinc Pharo" the first thing that comes up is
> http://zn.stfx.eu/zn/index.html - Zinc HTTP Components
>
> Ok I'm really just nitpicking here but IMHO it would be nice to decouple
> these very useful classes from the Zinc namespace.
>
> -dirk

One day we will have package level comments that will make clear the scope and purpose of groups of code.

Anyway, thanks for the feedback.

Sven


Re: Reading a text file line by line

Sean P. DeNigris
Administrator
In reply to this post by Dirk Olmes
Dirk Olmes wrote
> However if I find classes in a package with prefix "Zinc" and I google
> for "Zinc Pharo" the first thing that comes up is
> http://zn.stfx.eu/zn/index.html - Zinc HTTP Components

The problem is that Sven has the nasty habit of turning frustrating areas of
the system into top-notch solutions which then become core parts of Pharo ha
ha ;)



Cheers,
Sean

Re: Reading a text file line by line

CyrilFerlicot
In reply to this post by Sven Van Caekenberghe-2
On 04/10/2017 at 16:31, Sven Van Caekenberghe wrote:
> Dirk,
>
>
> One day we will have package level comments that will make clear the scope and purpose of groups of code.
>

Hi,

Package-level comments have been possible since Pharo 6. In Nautilus, just
open the comment pane while a package is selected.

But that's another subject. Sorry for the digression.


--
Cyril Ferlicot
https://ferlicot.fr

http://www.synectique.eu
2 rue Jacques Prévert 01,
59650 Villeneuve d'ascq France



Re: Reading a text file line by line

Stephane Ducasse-3
In reply to this post by Sven Van Caekenberghe-2
Hi Sven,

I wonder how we can make progress on this front, because we should move on.
I'm fed up with having all this crap from the old streams around.
Couldn't we just keep the positionable stream and start making the rest nice?
Stef

On Tue, Oct 3, 2017 at 4:22 PM, Sven Van Caekenberghe <[hidden email]> wrote:

> I have said this before: the current stream API is much too wide. We need to trim it to something closer to what a stream really is, and stop assuming that a stream always lives on top of a collection.
>
> We have composable streams now, and they work well. But they cannot implement the full API, since they are not streaming over collections. The biggest issue is the positioning messages (like #skip: and #position:), which assume that you know where you are and can move around at will. That is not possible for a real, indefinite stream.

Re: Reading a text file line by line

Guillermo Polito


On Wed, Oct 4, 2017 at 10:12 PM, Stephane Ducasse <[hidden email]> wrote:
> Hi Sven,
>
> I wonder how we can make progress on this front, because we should move on.
> I'm fed up with having all this crap from the old streams around.
> Couldn't we just keep the positionable stream and start making the rest nice?
>
> Stef

Could we make it a sprint task?


Re: Reading a text file line by line

Stephane Ducasse-3
Yes I would like to have an action list.

Stef

On Thu, Oct 5, 2017 at 9:37 AM, Guillermo Polito <[hidden email]> wrote:


> Could we make it a sprint task?
 