NeoCSVReader skipEmptyLines?

Esteban A. Maringolo
Sven,

Is it possible to skip blank/empty lines in NeoCSVReader?

So if a line is empty or contains ",,,,,,,," it would be skipped from
the iteration.

Does it already provide such a feature? (I couldn't find any setting
related to this.)

Esteban A. Maringolo

Re: NeoCSVReader skipEmptyLines?

Sven Van Caekenberghe-2
Esteban,

On 04 Oct 2014, at 02:42, Esteban A. Maringolo <[hidden email]> wrote:

> Sven,
>
> Is it possible to skip blank/empty lines in NeoCSVReader?
>
> So if a line is empty or contains ",,,,,,,," it would be skipped from
> the iteration.
>
> Does it already provide such a feature? (I couldn't find any setting
> related to this.)
>
> Esteban A. Maringolo

Could you elaborate a bit on the use case?

My first reaction would be that you have to deal with these yourself afterwards. What would be the definition of an empty record? What about empty quoted fields?

Sven

PS: I got your other contribution, I think it is OK to add, thanks.



Re: NeoCSVReader skipEmptyLines?

Esteban A. Maringolo
2014-10-04 10:47 GMT-03:00 Sven Van Caekenberghe <[hidden email]>:

> Could you elaborate a bit on the use case?

Yes, users upload a CSV file with the proper columns (sometimes they
don't, but that's another story), and because they export/copy the CSV
from Excel, most of the time there are *lots* (might be hundreds) of
empty lines at the end, that is, lines with separators only, or even
blank lines.

> My first reaction would be that you have to deal with these yourself afterwards.

When dealing with "array based" records, that is, with no #recordClass
defined, it would be easier, because you can check whether all the
values of the array are empty or nil. So on each iteration (#next) you
could return the value array if it isn't empty, or send #next again
until you get a record with some data.
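
A minimal sketch of that array-based approach, assuming the default setup
where each record is an Array of strings and empty fields are read as nil.
The sample input and the variable names are made up for illustration; only
#on:, #atEnd and #next are taken from NeoCSVReader itself.

```smalltalk
| reader records record |
"Hypothetical sample input: two data rows, one separators-only row and one blank line."
reader := NeoCSVReader on: 'a,b\,\\c,d' withCRs readStream.
records := OrderedCollection new.
[ reader atEnd ] whileFalse: [
	record := reader next.
	"Keep the record only if at least one field holds data.
	An array with no fields at all also counts as empty here."
	(record allSatisfy: [ :field | field isNil or: [ field isEmpty ] ])
		ifFalse: [ records add: record ] ].
records
```

With the sample above, only the rows that carry data should end up in
records. The check runs after #next has already built the array, which is
why it is straightforward here.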

When working with a record class I think you can't do this, because it
seems you instantiate the record before iterating through each field
of the record.

> What would be the definition of an empty record? What about empty quoted fields?

An empty record of a four-column CSV file would look like:
,,,
or (if some columns are quoted)
,"","",

I don't see how a record like that could be of any use.

Best regards!

Esteban A. Maringolo


> PS: I got your other contribution, I think it is OK to add, thanks.

Great!

Re: NeoCSVReader skipEmptyLines?

Sven Van Caekenberghe-2
In #bleedingEdge:

===
Name: Neo-CSV-Core-SvenVanCaekenberghe.21
Author: SvenVanCaekenberghe
Time: 6 October 2014, 5:49:01.691696 pm
UUID: a6faeef3-aaa7-460f-a815-8e77b62f9906
Ancestors: Neo-CSV-Core-SvenVanCaekenberghe.20

Added NeoCSVReader>>#addIgnoredFields:
Added NeoCSVReader>>#select:[thenDo:] to skip empty or other records
Added NeoCSVReaderTests>>#testSkippingEmptyRecords
===
Name: Neo-CSV-Tests-SvenVanCaekenberghe.18
Author: SvenVanCaekenberghe
Time: 6 October 2014, 5:49:36.206743 pm
UUID: 95d87278-4398-41d3-96ed-83705d5488ce
Ancestors: Neo-CSV-Tests-SvenVanCaekenberghe.17

Added NeoCSVReader>>#addIgnoredFields:
Added NeoCSVReader>>#select:[thenDo:] to skip empty or other records
Added NeoCSVReaderTests>>#testSkippingEmptyRecords
===

Here is the test:

testSkippingEmptyRecords
  | input output |
  input := '1,2,3\\4,5,6\,,\7,8,9' withCRs.
  output := (NeoCSVReader on: input readStream)
    select: [ :each | each notEmpty and: [ (each allSatisfy: #isNil) not ] ].
  self assert: output equals: #(#('1' '2' '3') #('4' '5' '6') #('7' '8' '9')).
  output := (NeoCSVReader on: input readStream)
    emptyFieldValue: '';
    select: [ :each | each notEmpty and: [ (each allSatisfy: #isEmpty) not ] ].
  self assert: output equals: #(#('1' '2' '3') #('4' '5' '6') #('7' '8' '9'))

Regards,

Sven


Re: NeoCSVReader skipEmptyLines?

Esteban A. Maringolo
Excellent!

And your implementation is better than simply skipping empty lines :)

Thank you Sven!
Esteban A. Maringolo

