NeoCSV and special handling for some columns


NeoCSV and special handling for some columns

Nicolai Hess
Hi,

I have two problems I could not solve:

1. I would like to read only some columns: I have a large file with
    ~30 columns, but I am only interested in ~5 of them
    (not that important, I could pre-process the file), but anyway it would be
    nice to do it in Smalltalk.

2. Some columns will contain only strings (quoted values) and some only numbers,
    but a field may be empty. Is it possible to define the default "emptyValue" as
    - an empty string for empty fields in the "string column"
    - 0 for empty fields in the "number column"?


Thanks in advance,

nicolai

Re: NeoCSV and special handling for some columns

Mariano Martinez Peck
Imagine something like this:

neoCSVReader := NeoCSVReader on: stream.
neoCSVReader
    separator: $,;
    recordClass: PriceRecord;
    addIgnoredField; "<name>"
    addField: #securityUniqueId:; "<ticker>"
    addField: #date: converter: [ :string | Date readFrom: string readStream pattern: 'yyyymmdd' ]; "<date>"
    addFloatField: #open:; "<open>"
    addFloatField: #high:; "<high>"
    addFloatField: #low:; "<low>"
    addFloatField: #close:; "<close>"
    addIntegerField: #volume:. "<vol>"
neoCSVReader skipHeader.
priceRecords := neoCSVReader upToEnd.


#recordClass: is optional. If you omit it, you get an array of arrays instead.
Add #addIgnoredField for each column you want to ignore, then #addIntegerField: or #addFloatField: for the numeric ones, and so on. To supply a default value for empty fields, I would use my own converter. Something like:

addField: #stringcolumn: converter: [ :string | string isEmptyOrNil ifTrue: [ '' ] ifFalse: [ string ] ];
addField: #numbercolumn: converter: [ :string | string isEmptyOrNil ifTrue: [ 0 ] ifFalse: [ NeoNumberParser parse: string ] ];
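
As far as I can tell, some NeoCSV versions do not pass empty fields to converter blocks at all, so here is a minimal sketch of a variant that does not rely on that: read the rows as plain arrays and substitute per-column defaults afterwards. It assumes the reader's empty field value is left at its default (nil); the two-column layout and the sample data are made up for illustration.

```smalltalk
| reader rows cleaned |
"Made-up sample: a string column and a number column, each with one empty field."
reader := NeoCSVReader on: 'name,count
"abc",
,5' readStream.
reader skipHeader.
"No field definitions: every record is read as an Array of strings,
 with empty fields set to the reader's empty field value (assumed nil here)."
rows := reader upToEnd.
"Column 1 is the string column, column 2 the number column."
cleaned := rows collect: [ :row |
    { (row at: 1) ifNil: [ '' ].
      (row at: 2) ifNil: [ 0 ] ifNotNil: [ :s | s asNumber ] } ].
```

The per-column defaults live in one place (the collect: block), at the cost of a second pass over the rows.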

Hope this helps.

Cheers, 




Re: NeoCSV and special handling for some columns

Sven Van Caekenberghe-2

> On 28 Sep 2015, at 15:04, Mariano Martinez Peck <[hidden email]> wrote:
>
> Imagine something like this:
>
> neoCSVReader := (NeoCSVReader on: stream).
> neoCSVReader
> separator: $,;
> recordClass: PriceRecord;
> addIgnoredField; "<name>"
> addField:  #securityUniqueId: ; "<ticker>"
> addField: #date: converter: [ :string | Date readFrom: string readStream pattern: 'yyyymmdd' ]; "<date>"
> addFloatField: #open: ; "<open>"
> addFloatField: #high: ; "<high>"
> addFloatField: #low: ; "<low>"
> addFloatField: #close: ; "<close>"
> addIntegerField: #volume: . "<vol>"
> neoCSVReader skipHeader.
> priceRecords := neoCSVReader upToEnd.
>
>
> The #recordClass: is optional. If not, you can get an array of arrays instead.
> You can add #addIgnoredField for all the ones you want to ignore, then add the #addNumber: etc for the number ones, etc.

Correct.

> To write a default empty value, I would use my own converter. Something like:
>
> addField: #stringcolumn: converter: [ :string | string isEmptyOrNil ifTrue: [ '' ]  ];
> addField: #numbercolumn: converter: [ :string | string isEmptyOrNil ifTrue: [ 0 ] ifFalse: [ NeoNumberParser parse: string ]  ];

Actually, there is NeoCSVReader>>#emptyFieldValue:, which you can use to configure the reader (but it applies to all fields).
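
A minimal sketch of what that looks like, and of why it does not cover the "different default per column" case on its own. The sample data is made up, and no field definitions are given, so records come back as arrays of strings.

```smalltalk
| reader rows |
"Made-up sample: the first data row has an empty count, the second an empty name."
reader := NeoCSVReader on: 'name,count
"abc",
,5' readStream.
reader
    emptyFieldValue: 0;  "one value for the whole reader, so an empty name is replaced by 0 as well"
    skipHeader.
rows := reader upToEnd.
```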




Re: NeoCSV and special handling for some columns

Nicolai Hess
Thanks Mariano, Sven



2015-09-28 16:42 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:

> On 28 Sep 2015, at 15:04, Mariano Martinez Peck <[hidden email]> wrote:
>
> Imagine something like this:
>
> neoCSVReader := (NeoCSVReader on: stream).
>       neoCSVReader
>               separator: $,;
>               recordClass: PriceRecord;
>               addIgnoredField; "<name>"
>               addField:  #securityUniqueId: ; "<ticker>"
>               addField: #date: converter: [ :string | Date readFrom: string readStream pattern: 'yyyymmdd' ]; "<date>"
>               addFloatField: #open: ; "<open>"
>               addFloatField: #high: ; "<high>"
>               addFloatField: #low: ; "<low>"
>               addFloatField: #close: ; "<close>"
>               addIntegerField: #volume: . "<vol>"
>       neoCSVReader    skipHeader.
>       priceRecords := neoCSVReader upToEnd.
>
>
> The #recordClass: is optional. If not, you can get an array of arrays instead.
> You can add #addIgnoredField for all the ones you want to ignore, then add the #addNumber: etc for the number ones, etc.

Correct.

But I have to call addIgnoredField ~25 times.
I wish there were a way to specify the column index and field description only
for those columns I am interested in.

 

> To write a default empty value, I would use my own converter. Something like:
>
> addField: #stringcolumn: converter: [ :string | string isEmptyOrNil ifTrue: [ '' ]  ];
> addField: #numbercolumn: converter: [ :string | string isEmptyOrNil ifTrue: [ 0 ] ifFalse: [ NeoNumberParser parse: string ]  ];

Yes this helps
 




Re: NeoCSV and special handling for some columns

Sven Van Caekenberghe-2

> On 28 Sep 2015, at 17:21, Nicolai Hess <[hidden email]> wrote:
>
> But I have to call addIgnoredField ~25 times.
> I wish there were a way to specify the column index and field description only
> for those columns I am interested in.

NeoCSVReader>>#addIgnoredFields: count
        "Add a count of consecutive ignored fields to receiver."

;-)
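
A minimal sketch of how #addIgnoredFields: can be mixed with the other field definitions to keep only a few columns. The six-column layout and the sample data are invented for illustration; for a ~30-column file the counts would simply be larger.

```smalltalk
| input reader rows |
"Made-up sample: six columns, of which only columns 1, 4 and 6 are wanted."
input := 'id,a,b,price,c,qty
x1,foo,bar,1.5,baz,10
x2,foo,bar,2.5,baz,20'.
reader := NeoCSVReader on: input readStream.
reader
    addField;              "column 1, kept as a string"
    addIgnoredFields: 2;   "columns 2 and 3, skipped"
    addFloatField;         "column 4, parsed as a float"
    addIgnoredField;       "column 5, skipped"
    addIntegerField.       "column 6, parsed as an integer"
reader skipHeader.
rows := reader upToEnd.
```

Fields are still declared strictly left to right, so the ignored counts have to add up to the real column positions; there is no addressing by column index here.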




Re: NeoCSV and special handling for some columns

Nicolai Hess


2015-09-28 17:33 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:

NeoCSVReader>>#addIgnoredFields: count
        "Add a count of consecutive ignored fields to receiver."

;-)


Great!
 




Re: NeoCSV and special handling for some columns

stepharo
In reply to this post by Sven Van Caekenberghe-2
Sven, do you think that we should add this to the chapter?





Re: NeoCSV and special handling for some columns

Sven Van Caekenberghe-2
Yes, there are a couple of newer features that were added after the documentation was written. Keeping documentation up to date is also a PITA.

> On 03 Oct 2015, at 08:54, stepharo <[hidden email]> wrote:
>
> Sven, do you think that we should add this to the chapter?



Re: NeoCSV and special handling for some columns

stepharo
Will you commit the new section?
Because my plate is full.

On 3/10/15 09:39, Sven Van Caekenberghe wrote:

> Yes, there are a couple of newer features that were added after the documentation was written. Keeping documentation up to date is also a PITA.