Sven,
Is it possible to skip blank/empty lines in NeoCSVReader?

So if a line is empty or contains ",,,,,,,," it would be skipped from the iteration.

Does it already provide such a feature? (I couldn't find any setting related to this.)

Esteban A. Maringolo
Esteban,
On 04 Oct 2014, at 02:42, Esteban A. Maringolo <[hidden email]> wrote:

> Is it possible to skip blank/empty lines in NeoCSVReader?

Could you elaborate a bit on the use case?

My first reaction would be that you have to deal with these yourself afterwards.

What would be the definition of an empty record? What about empty quoted fields?

Sven

PS: I got your other contribution, I think it is OK to add, thanks.
2014-10-04 10:47 GMT-03:00 Sven Van Caekenberghe <[hidden email]>:
> Could you elaborate a bit on the use case ?

Yes, users upload a CSV file with the proper columns (sometimes they don't, but that's another story), and because they export/copy the CSV from Excel, most of the time there are *lots* (possibly hundreds) of empty lines at the end, that is, lines with separators only, or even blank lines.

> My first reaction would be that you have to deal with these yourself afterwards.

When dealing with "array-based" records, that is, with no #recordClass defined, it would be easier, because you can check whether all the values of the array are empty or nil. So on each iteration (#next) you could return the value array if it isn't empty, or send #next again until you get a record with any data.

When working with a record class I think you can't do this, because it seems you instantiate the record before iterating through each field of the record.

> What would be the definition of an empty record ? What about empty quoted fields ?

An empty record of a four-column CSV file would look like:
,,,
or (if some columns are quoted)
,"","",

I don't see how a record like that could be of any use.

Best regards!

Esteban A. Maringolo

> PS: I got your other contribution, I think it is OK to add, thanks.

Great!
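The array-based approach described above (keep sending #next until a record with data turns up) could be sketched roughly as follows. This is only an illustration of the idea in the message, not code from NeoCSV itself; it assumes array-based records (no #recordClass: set), that a blank line is read as an empty array, and that empty fields come back as nil or as an empty string. The variable names are made up for the example.

```smalltalk
"Sketch only: collect the records that contain at least one non-empty field.
 Assumes array-based records, i.e. no #recordClass: has been set."
| reader records record |
reader := NeoCSVReader on: '1,2,3\\4,5,6\,,\7,8,9' withCRs readStream.
records := OrderedCollection new.
[ reader atEnd ] whileFalse: [
	record := reader next.
	"An empty array also satisfies allSatisfy:, so blank lines are dropped too."
	(record allSatisfy: [ :field | field isNil or: [ field isEmpty ] ])
		ifFalse: [ records add: record ] ].
records
```

The check runs after the whole record has been read, which is why it is straightforward for arrays and less obvious when a record class is instantiated first.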
In #bleedingEdge:
===
Name: Neo-CSV-Core-SvenVanCaekenberghe.21
Author: SvenVanCaekenberghe
Time: 6 October 2014, 5:49:01.691696 pm
UUID: a6faeef3-aaa7-460f-a815-8e77b62f9906
Ancestors: Neo-CSV-Core-SvenVanCaekenberghe.20

Added NeoCSVReader>>#addIgnoredFields:
Added NeoCSVReader>>#select:[thenDo:] to skip empty or other records
Added NeoCSVReaderTests>>#testSkippingEmptyRecords
===
Name: Neo-CSV-Tests-SvenVanCaekenberghe.18
Author: SvenVanCaekenberghe
Time: 6 October 2014, 5:49:36.206743 pm
UUID: 95d87278-4398-41d3-96ed-83705d5488ce
Ancestors: Neo-CSV-Tests-SvenVanCaekenberghe.17

Added NeoCSVReader>>#addIgnoredFields:
Added NeoCSVReader>>#select:[thenDo:] to skip empty or other records
Added NeoCSVReaderTests>>#testSkippingEmptyRecords
===

Here is the test:

testSkippingEmptyRecords
	| input output |
	input := '1,2,3\\4,5,6\,,\7,8,9' withCRs.
	output := (NeoCSVReader on: input readStream)
		select: [ :each | each notEmpty and: [ (each allSatisfy: #isNil) not ] ].
	self assert: output equals: #(#('1' '2' '3') #('4' '5' '6') #('7' '8' '9')).
	output := (NeoCSVReader on: input readStream)
		emptyFieldValue: '';
		select: [ :each | each notEmpty and: [ (each allSatisfy: #isEmpty) not ] ].
	self assert: output equals: #(#('1' '2' '3') #('4' '5' '6') #('7' '8' '9'))

Regards,

Sven
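The commit message also mentions a #select:thenDo: variant, which the test does not exercise. A minimal sketch of how it might be used is below, assuming the first argument is the filter block and the second is the action run on each accepted record, as the selector name suggests. The input string and filter are taken from the test; the `rows` collection is only there for illustration.

```smalltalk
"Sketch only: process the non-empty records one by one, without first
 building the filtered collection that #select: returns."
| input rows |
input := '1,2,3\\4,5,6\,,\7,8,9' withCRs.
rows := OrderedCollection new.
(NeoCSVReader on: input readStream)
	select: [ :each | each notEmpty and: [ (each allSatisfy: #isNil) not ] ]
	thenDo: [ :each | rows add: each ].
rows
```

For uploads with hundreds of trailing empty lines, this form lets the caller act on each useful record as it is read.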
Excellent!
And your implementation is better than simply skipping empty lines :)

Thank you Sven!

Esteban A. Maringolo