Re: NeoCSV on Irregular Files

Posted by Sven Van Caekenberghe-2 on Jul 26, 2017; 4:04pm
URL: https://forum.world.st/NeoCSV-on-Irregular-Files-tp4956850p4956857.html

I agree.

If the file is non-homogeneous, it is no longer CSV by definition.

Holding on to the original stream and creating a new reader for each section is one option; another could be to add a #reset method.
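
To make the first option concrete, here is a minimal sketch of the idea. It uses Python's standard csv module rather than NeoCSV, and it assumes that sections are separated by blank lines, which is only one possible heuristic. The names read_sections and _rest_of_section are made up for this illustration.

```python
import csv
import io
from itertools import chain


def _rest_of_section(lines):
    """Yield lines from the shared iterator until a blank line or end of input."""
    for line in lines:
        if not line.strip():
            return  # the blank delimiter is consumed; the next section starts after it
        yield line


def read_sections(stream):
    """Yield one list of rows per section, using a fresh csv.reader each time."""
    lines = iter(stream)  # the single shared stream; no segment is copied out first
    for first in lines:
        if not first.strip():
            continue  # skip extra blank lines between sections
        reader = csv.reader(chain([first], _rest_of_section(lines)))
        yield list(reader)  # drain this section before moving on to the next


# Assumed sample input: two sections with different column layouts.
SAMPLE = "name,age\nann,31\nbob,27\n\nsku,qty,price\nA1,3,9.50\n"
sections = list(read_sections(io.StringIO(SAMPLE)))
```

Each section gets its own reader, so each could be configured differently. Note that a quoted field containing an empty line would be split incorrectly by this blank-line heuristic.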

The big question is how to know where one section begins and ends.

NeoCSVReader holds a one-character buffer, so you could, just maybe, peek for something. Then you could discover the section switches while parsing (a bit like how #upToEnd uses #atEnd, you would add an #atSectionEnd). But it all depends on your specific format.
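
A rough sketch of what such a lookahead could look like, written in Python with the standard csv module and not as NeoCSV code. It assumes a format in which every section starts with a line beginning with "#", and it buffers one whole line instead of one character. The class SectionedReader and all of its methods are hypothetical.

```python
import csv
import io


class SectionedReader:
    """Hypothetical reader with one line of lookahead over a shared stream."""

    def __init__(self, stream, marker="#"):
        self._lines = iter(stream)
        self._marker = marker  # assumed section marker, format specific
        self._peeked = next(self._lines, None)  # the one-line buffer

    def at_end(self):
        return self._peeked is None

    def at_section_end(self):
        # A section ends at end of input or just before the next marker line.
        return self.at_end() or self._peeked.startswith(self._marker)

    def next_line(self):
        line = self._peeked
        self._peeked = next(self._lines, None)
        return line

    def up_to_section_end(self):
        # Mirrors the "up to end" loop, but stops at the section boundary.
        rows = []
        while not self.at_section_end():
            rows.append(next(csv.reader([self.next_line()])))
        return rows


# Assumed sample input: each section is introduced by a "#name" line.
SAMPLE = "#people\nann,31\nbob,27\n#stock\nA1,3\n"
reader = SectionedReader(io.StringIO(SAMPLE))
result = {}
while not reader.at_end():
    name = reader.next_line().strip().lstrip("#")
    result[name] = reader.up_to_section_end()
```

Parsing one line at a time like this does not support quoted fields that span several lines; whether that matters depends on the format.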

> On 26 Jul 2017, at 17:45, Esteban A. Maringolo <[hidden email]> wrote:
>
> There is no way to perform this with NeoCSV or any other CSV
> framework I'm aware of.
>
> I had to deal with that kind of "format" (which is likely an export
> format), and the way to deal with it was to process each "segment"
> using a different instance of the CSV reader. The segments were
> located in the stream using the delimiting heuristics of your choice
> (headers, blank lines, etc.), and then each segment was extracted and
> passed as an argument to the reader for that segment.
>
> The drawback was that if the file was big there was no way to have "a
> stream over a stream" (like a window function), so passing the segment
> to the reader meant copying the string contents between the segment
> delimiters.
>
> It's something I already put some thought into, but never had the will
> to code and share publicly.
>
> Regards,
>
> Esteban A. Maringolo
>
>
> 2017-07-26 12:02 GMT-03:00 Sean P. DeNigris <[hidden email]>:
>> I have a CSV file that has several subsections, each with its own format.
>> What I'd like to do is parse one, reset the NeoCSVReader, set it up for the
>> next section, and continue. I didn't see an API for this. Is it possible?
>> Thanks.
>>
>>
>>
>> -----
>> Cheers,
>> Sean
>> --
>> View this message in context: http://forum.world.st/NeoCSV-on-Irregular-Files-tp4956850.html
>> Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
>>
>