CSV file help

Joseph Alotta
Greetings,

I need conceptual help with parsing these troublesome CSV files.  I was breaking the lines using String >> findTokens:
but I found lines that contain extra delimiters.  Notice in the three lines below that the first and third lines have an extra comma inside the quoted field (from "OAK BROOK, IL" and "BOLINGBROOK, IL"), while the second line does not.


02/04/2016  Thu,,"COSTCO WHSE #0388 000000000990388 - OAK BROOK, IL",,,,,37.00,,,,,,,,
02/05/2016  Fri,,"ELECTRONIC PAYMENT RECEIVED-THANK",,,,,-443.52,,,,,,,,
02/06/2016  Sat,,"COSTCO WHSE #1088 000000000991088 - BOLINGBROOK, IL",,,,,50.86,,,,,,,,

I think I need to parse with the double quotes in mind, but I don’t know where to break it.

Also, I was reading in the squeak pages that someone had already written code for this, but I couldn’t find anything.

http://wiki.squeak.org/squeak/3260 has two methods, CSVSubstrings and SequenceableCollection-asCSVLine.st

Does anyone know where these are now?

Sincerely,

Joe.



_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners

RE: CSV file help

Ron Teitelbaum
Hi Joe,

I haven't tried it but I would probably start with Avi's CSV Parser
http://www.squeaksource.com/CSV.html

In general, though, you want to split the file into lines first; then, within each line, pick out the quoted values before you split the rest on commas.
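
The order of operations described above can be sketched in a workspace as a single pass over one line that tracks whether the current character sits inside double quotes. This is only a minimal illustration, not the code from Avi's CSV package: the variable names are made up for the example, and it assumes fields never contain doubled quotes ("") or line breaks.

```smalltalk
"Hypothetical workspace sketch: split one CSV line on commas,
 keeping commas that appear inside double quotes as part of the field."
| line in field fields inQuotes ch |
line := '02/04/2016  Thu,,"COSTCO WHSE #0388 000000000990388 - OAK BROOK, IL",,,,,37.00,,,,,,,,'.
in := ReadStream on: line.
field := WriteStream on: String new.
fields := OrderedCollection new.
inQuotes := false.
[in atEnd] whileFalse: [
    ch := in next.
    ch = $"
        ifTrue: [
            "A quote toggles the state and is not copied into the field."
            inQuotes := inQuotes not]
        ifFalse: [
            (ch = $, and: [inQuotes not])
                ifTrue: [
                    "An unquoted comma ends the current field."
                    fields add: field contents.
                    field := WriteStream on: String new]
                ifFalse: [field nextPut: ch]]].
"The last field has no trailing comma, so add it after the loop."
fields add: field contents.
fields size. "16 for this line"
```

Inspecting `fields` afterwards shows the Costco description, comma included, as the third element and `37.00` as the eighth.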

All the best,

Ron Teitelbaum


Re: CSV file help

Joseph Alotta

> On Jun 17, 2016, at 3:45 PM, Ron Teitelbaum [via Smalltalk] <[hidden email]> wrote:
>
> Hi Joe,
>
> I haven't tried it but I would probably start with Avi's CSV Parser
> http://www.squeaksource.com/CSV.html


I tried to download it, but the .mcz file could not be read.

I also tried pasting

MCHttpRepository
    location: 'http://www.squeaksource.com/CSV'
    user: ''
    password: ''

into a workspace and evaluating it, but it does nothing.

Can you provide instructions, please?

Sincerely,

Joe.



Re: CSV file help

Ron Teitelbaum

Hi Joe,

1. World menu > open… > Monticello Browser.
2. Press the +Repository button and select HTTP.
3. Paste in the repository location:

MCHttpRepository
    location: 'http://www.squeaksource.com/CSV'
    user: ''
    password: ''

4. Accept, and then select open.
5. Select the latest version and press load.

Monticello is a key component for managing an image.  You should definitely read up on it:

http://wiki.squeak.org/squeak/43

All the best,

Ron Teitelbaum


Re: CSV file help

cbc
In reply to this post by Joseph Alotta
Hi Joseph,
these methods are now in the package located at:

MCHttpRepository
    location: 'http://www.squeaksource.com/CSVData'
    user: ''
    password: ''

(Follow Ron's excellent advice on how to use this.)

That said, I would no longer use this package to parse CSV - I've recently switched over to Avi's parser as well.
(My package still has some interesting uses for the data once you pull it in.  Of course, it isn't documented.)

I should clean up the page that you found - I had forgotten that it even existed!


Re: CSV file help

Joseph Alotta
So which package should I use to parse CSV data?




Re: CSV file help

cbc
Use this one for parsing:

MCHttpRepository
    location: 'http://www.squeaksource.com/CSV'
    user: ''
    password: ''



Re: [Newbies] CSV file help

Rudolf Rednose
In reply to this post by Joseph Alotta
Hello Joseph, maybe
findTokens: ',' escapedBy: '"'
is sufficient.

For example:

| data fields |
data := OrderedCollection new.
data add: '02/04/2016  Thu,,"COSTCO WHSE #0388 000000000990388 - OAK BROOK, IL",,,,,37.00,,,,,,,,'.
data add: '02/05/2016  Fri,,"ELECTRONIC PAYMENT RECEIVED-THANK",,,,,-443.52,,,,,,,,'.
data add: '02/06/2016  Sat,,"COSTCO WHSE #1088 000000000991088 - BOLINGBROOK, IL",,,,,50.86,,,,,,,,'.
data do: [:item |
    "Split on commas; a comma inside double quotes stays part of its field."
    fields := item findTokens: ',' escapedBy: '"'.
    "Show the result in the Transcript."
    Transcript cr; show: 'Line: ', item; cr; show: 'Number of Fields:', fields size asString; cr.
    1 to: fields size do: [:n |
        Transcript show: (n asString, ' [', (fields at: n), ']'); cr]].

Line: 02/04/2016  Thu,,"COSTCO WHSE #0388 000000000990388 - OAK BROOK, IL",,,,,37.00,,,,,,,,
Number of Fields:16
1 [02/04/2016  Thu]
2 []
3 [COSTCO WHSE #0388 000000000990388 - OAK BROOK, IL]
4 []
5 []
6 []
7 []
8 [37.00]
9 []
10 []
11 []
12 []
13 []
14 []
15 []
16 []
Line: 02/05/2016  Fri,,"ELECTRONIC PAYMENT RECEIVED-THANK",,,,,-443.52,,,,,,,,
Number of Fields:16
1 [02/05/2016  Fri]
2 []
3 [ELECTRONIC PAYMENT RECEIVED-THANK]
4 []
5 []
6 []
7 []
8 [-443.52]
9 []
10 []
11 []
12 []
13 []
14 []
15 []
16 []
Line: 02/06/2016  Sat,,"COSTCO WHSE #1088 000000000991088 - BOLINGBROOK, IL",,,,,50.86,,,,,,,,
Number of Fields:16
1 [02/06/2016  Sat]
2 []
3 [COSTCO WHSE #1088 000000000991088 - BOLINGBROOK, IL]
4 []
5 []
6 []
7 []
8 [50.86]
9 []
10 []
11 []
12 []
13 []
14 []
15 []
16 []
 
Sincerely, 
Rudolf.

Re: [Newbies] CSV file help

Joseph Alotta
Rudolf,

That is perfect!  Thank you so much.  I love the simple yet elegant solutions.

I tested it with the large file and it works great.  And best of all, it is included in the basic system.

Sincerely,

Joe.
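
For a whole file, the same call can be applied line by line. The sketch below is a hedged illustration only: a ReadStream over a string stands in for the file so that it runs in any workspace, the variable names are made up, and it assumes no quoted field spans more than one line. With a real file, the stream would presumably come from something like FileStream readOnlyFileNamed: with your own file name.

```smalltalk
"Hypothetical workspace sketch: read CSV text line by line and split each line.
 The ReadStream over a string stands in for a file stream."
| text stream rows line |
text := '02/04/2016  Thu,,"COSTCO WHSE #0388 000000000990388 - OAK BROOK, IL",,,,,37.00,,,,,,,,', String cr,
    '02/05/2016  Fri,,"ELECTRONIC PAYMENT RECEIVED-THANK",,,,,-443.52,,,,,,,,', String cr,
    '02/06/2016  Sat,,"COSTCO WHSE #1088 000000000991088 - BOLINGBROOK, IL",,,,,50.86,,,,,,,,'.
stream := ReadStream on: text.
rows := OrderedCollection new.
[stream atEnd] whileFalse: [
    line := stream nextLine.
    "Skip blank lines rather than parsing them."
    line isEmpty ifFalse: [
        rows add: (line findTokens: ',' escapedBy: '"')]].
"In the sample data, field 1 is the date and field 8 is the amount."
rows do: [:fields |
    Transcript show: (fields at: 1), '  ', (fields at: 8); cr].
```

Each element of `rows` is the collection of fields for one line, so later code can pick columns by position.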

