
NeoCSV and special handling for some columns


Nicolai Hess

NeoCSV and special handling for some columns

Hi,

I have two problems I could not solve:

1. I would like to read only some columns: I have a large file with
    ~30 columns, but I am only interested in ~5 of them
    (not that important, I could pre-process the file), but it would be
    nice to do it in Smalltalk.

2. Some columns will contain only strings (quoted values) and some only numbers,
    but a field may be empty. Is it possible to define the default "emptyValue" as
    - an empty string for empty fields in the "string column"
    - 0 for empty fields in the "number column"?

Thanks in advance,

nicolai

Mariano Martinez Peck

Re: NeoCSV and special handling for some columns

Imagine something like this:

neoCSVReader := NeoCSVReader on: stream.
neoCSVReader
    separator: $,;
    recordClass: PriceRecord;
    addIgnoredField; "<name>"
    addField: #securityUniqueId: ; "<ticker>"
    addField: #date: converter: [ :string | Date readFrom: string readStream pattern: 'yyyymmdd' ]; "<date>"
    addFloatField: #open: ; "<open>"
    addFloatField: #high: ; "<high>"
    addFloatField: #low: ; "<low>"
    addFloatField: #close: ; "<close>"
    addIntegerField: #volume: . "<vol>"
neoCSVReader skipHeader.
priceRecords := neoCSVReader upToEnd.

The #recordClass: is optional. If you leave it out, you can get an array of arrays instead.

You can add #addIgnoredField for each column you want to ignore, then #addIntegerField:, #addFloatField:, etc. for the numeric ones. To supply a default value for empty fields, I would use my own converter. Something like:

addField: #stringcolumn: converter: [ :string | string isEmptyOrNil ifTrue: [ '' ] ifFalse: [ string ] ];
addField: #numbercolumn: converter: [ :string | string isEmptyOrNil ifTrue: [ 0 ] ifFalse: [ NeoNumberParser parse: string ] ];
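
A version-independent sketch of the same idea: depending on the NeoCSV version, converter blocks may not be invoked for empty fields at all (such fields are then left at the reader's empty field value, nil by default), so the per-column defaults can also be applied after reading. The sample data and the column layout (string, number, string) are assumptions made up for illustration.

```smalltalk
| reader rows |
"Assumed sample data: column 1 is a string column, column 2 a number column,
 column 3 is always filled so that no line ends in an empty field."
reader := NeoCSVReader on:
    ('abc,1,x' , String lf , ',2,y' , String lf , 'def,,z') readStream.
reader
    addField;          "string column, read as-is"
    addIntegerField;   "number column"
    addField.          "string column, read as-is"
"Replace nil (the default empty field value) by a per-column default."
rows := reader upToEnd collect: [ :row |
    { (row at: 1) ifNil: [ '' ].
      (row at: 2) ifNil: [ 0 ].
      row at: 3 } ].
```

The post-processing step uses only plain collection messages, so it does not depend on how a particular NeoCSV release treats empty fields.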

Hope this helps.

Cheers,


Mariano
http://marianopeck.wordpress.com

Sven Van Caekenberghe-2

Re: NeoCSV and special handling for some columns

> On 28 Sep 2015, at 15:04, Mariano Martinez Peck <[hidden email]> wrote:
> The #recordClass: is optional. If not, you can get an array of arrays instead.
> You can add #addIgnoredField for all the ones you want to ignore, then add the #addNumber: etc for the number ones, etc.

Correct.

> To write a default empty value, I would use my own converter.

Actually, there is NeoCSVReader>>#emptyFieldValue:, which you can use to configure the reader (but it applies to all fields).
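
A minimal sketch of that reader-wide setting, assuming a one-line in-memory sample whose second field is empty. No field types are declared, so every field is read as a string, and the single empty field value is used for all columns alike.

```smalltalk
| reader rows |
"Assumed sample data; the second field of the only line is empty."
reader := NeoCSVReader on: 'abc,,42' readStream.
"One value for every empty field, whatever column it is in."
reader emptyFieldValue: ''.
rows := reader upToEnd.
```

Because the value is shared by all columns, it cannot express "empty string here, 0 there" on its own.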

Nicolai Hess

Re: NeoCSV and special handling for some columns

Thanks Mariano, Sven

2015-09-28 16:42 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:

> > The #recordClass: is optional. If not, you can get an array of arrays instead.
> > You can add #addIgnoredField for all the ones you want to ignore, then add the #addNumber: etc for the number ones, etc.
>
> Correct.

But I have to call addIgnoredField ~25 times.

I was hoping there was a way to specify a column index and field description only for the columns I am interested in.

> To write a default empty value, I would use my own converter.

Yes, this helps.

> Actually there is NeoCSVReader>>#emptyFieldValue: which you can use to configure the reader (but it counts for all fields).


Sven Van Caekenberghe-2

Re: NeoCSV and special handling for some columns

> On 28 Sep 2015, at 17:21, Nicolai Hess <[hidden email]> wrote:
> But I have to call addIgnoredField ~25 times.
> I was hoping there was a way to specify a column index and field description only
> for the columns I am interested in.

NeoCSVReader>>#addIgnoredFields: count
"Add a count of consecutive ignored fields to receiver."

;-)
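
A short sketch of how this could look for a wide file, assuming a made-up seven-column line in which only columns 4 and 7 are of interest. #addIntegerField and #addField are the accessor-less variants used when reading into arrays.

```smalltalk
| reader rows |
"Assumed sample data: 7 columns, of which only column 4 (a number)
 and column 7 (a string) are wanted."
reader := NeoCSVReader on: 'n1,n2,n3,42,x,y,hello' readStream.
reader
    addIgnoredFields: 3;   "skip columns 1-3"
    addIntegerField;       "column 4"
    addIgnoredFields: 2;   "skip columns 5-6"
    addField.              "column 7"
rows := reader upToEnd.
```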


Nicolai Hess

Re: NeoCSV and special handling for some columns

2015-09-28 17:33 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:


> NeoCSVReader>>#addIgnoredFields: count
> "Add a count of consecutive ignored fields to receiver."
>
> ;-)

Great!
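
Building on #addIgnoredFields:, the "column index plus field description" configuration asked for earlier in the thread could be sketched as below. This is not part of NeoCSV: the `wanted` list, the `totalColumns` count and the sample line are all hypothetical, and the list must be sorted by ascending column index.

```smalltalk
| reader wanted totalColumns position rows |
"Assumptions: 8 columns in total; only column 3 (integer) and
 column 7 (string) are wanted. Keys are 1-based column indexes."
totalColumns := 8.
wanted := {
    3 -> [ :r | r addIntegerField ].
    7 -> [ :r | r addField ] }.
reader := NeoCSVReader on: 'a,b,3,d,e,f,seven,h' readStream.
position := 1.
wanted do: [ :each |
    "Skip the gap between the previous wanted column and this one."
    each key > position
        ifTrue: [ reader addIgnoredFields: each key - position ].
    "Let the block add the field description for this column."
    each value value: reader.
    position := each key + 1 ].
"Skip whatever remains after the last wanted column."
position <= totalColumns
    ifTrue: [ reader addIgnoredFields: totalColumns - position + 1 ].
rows := reader upToEnd.
```

The total column count has to be known up front so that the trailing columns can be ignored as well.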


stepharo

Re: NeoCSV and special handling for some columns

In reply to this post by Sven Van Caekenberghe-2

Sven, do you think that we should add this to the chapter?

On 28/9/15 16:42, Sven Van Caekenberghe wrote:

> Actually there is NeoCSVReader>>#emptyFieldValue: which you can use to configure the reader (but it counts for all fields).

Sven Van Caekenberghe-2

Re: NeoCSV and special handling for some columns

Yes, there are a couple of newer features that were added after the documentation was written. Keeping documentation up to date is also a PITA.

> On 03 Oct 2015, at 08:54, stepharo <[hidden email]> wrote:
>
> sven do you think that we should add this to the chapter?

stepharo

Re: NeoCSV and special handling for some columns

Could you commit the new section?
My plate is full.

On 3/10/15 09:39, Sven Van Caekenberghe wrote:

> Yes, there are a couple of newer features that were added after the documentation was written. Keeping documentation up to date is also a PITA.