[Glass] Processing large text files / statmonitor tips


Mariano Martinez Peck
Hi guys,

I am doing a quick benchmark to see how GemStone behaves for some use cases of mine in which many large CSV (or similar text-based) files are read and processed. Note that, so far, this read is NOT a bulk load (I am not persisting this data in GemStone; I am just processing it).

I am trying to see how I can make it a bit faster (I am sure it can be done).
So, first question: is there anything I could do in FileStream and friends that could have a big impact on the performance of reading text files?

I used ProfMonitor to see which parts of my code were the bottleneck and, of course, file reading accounts for most of the time. So I wonder whether there is anything I could do in the Gem or Stone configuration parameters.

Finally, I ran statmonitor and I have the vsd file. But since I don't know much about it so far, I wanted to ask whether there are some key statistics I should look at for my use case.

Thanks a lot in advance, 

--
Mariano
http://marianopeck.wordpress.com

_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass

Re: [Glass] Processing large text files / statmonitor tips

Richard Sargent
>> So...first question...is there anything I could do in FileStream and friends that
>> could have a big impact in the performance of reading text files?

Ultimately, it comes down to reading from the fastest file system available to you. In other words, do not use GsFile's client-side capability for large volumes of data. Likewise, NFS-mounted volumes will be slower than volumes local to the Gem's host.

And, of course, if you can have the files on a local SSD, that should be the fastest of all.



--
Richard Sargent
Business Development Manager
503-766-4719
[hidden email]
GemTalk Systems
15220 NW Greenbrier Parkway #240
Beaverton, OR 97006


Re: [Glass] Processing large text files / statmonitor tips

hernanmd
In reply to this post by Mariano Martinez Peck
On 09/12/2013 16:32, Mariano Martinez Peck wrote:
> Hi guys,
>
> I am doing a quick bench to see how GemStone behaves for some use-cases
> I have in which lots of large csv (or similar text based) files are read
> and processed. Notice that so far this read is NOT a bulk load (I am not
> persisting this data in GemStone..I am just processing it).
>

Is this one-time processing, or will the data be reused?

> I am trying to see how can I make it a bit faster (I am sure I can make
> it).
> So...first question...is there anything I could do in FileStream and
> friends that could have a big impact in the performance of reading text
> files?
>

Can you post details about the CSV files, such as file size and number of lines?

Hernán


Re: [Glass] Processing large text files / statmonitor tips

marten
On 10.12.2013 05:25, Hernán Morales Durand wrote:
> On 09/12/2013 16:32, Mariano Martinez Peck wrote:
>> Hi guys,
>>
>> I am doing a quick bench to see how GemStone behaves for some use-cases
>> I have in which lots of large csv (or similar text based) files are read
>> and processed. Notice that so far this read is NOT a bulk load (I am not
>> persisting this data in GemStone..I am just processing it).
>>

I had the same problem. Originally I used a class called FilePortability (or
something like that), but I have now switched to GsFile, which seems to be
much faster.

Re: [Glass] Processing large text files / statmonitor tips

Mariano Martinez Peck



On Tue, Dec 10, 2013 at 3:10 AM, [hidden email] <[hidden email]> wrote:
> I had the same problems - originally I used a class FilePortability (or
> something like this), but now I switched to GsFile, which seems to be
> much faster.

Wow, this was a good tip. I was also using FileStream through my compatibility layer.
I have just tried using something like:

GsFile openReadOnServer: aFilename

And that was 20% faster than

FileStream fileNamed: aFilename

So cool. My API usage of the stream is quite small, so I can use GsFile polymorphically with FileStream.
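
For anyone who wants to try the same thing, here is a minimal sketch of reading a server-side file line by line with GsFile. It is only an illustration, not the code I benchmarked: the path '/tmp/sample.csv' and the line-counting loop are hypothetical placeholders for the real per-line processing.

```smalltalk
| aFilename file lineCount |
"Hypothetical path; it must be readable on the Gem's host."
aFilename := '/tmp/sample.csv'.

"Open the file on the server (the Gem's host), not on the client."
file := GsFile openReadOnServer: aFilename.
file isNil
  ifTrue: [ self error: 'Could not open file: ', GsFile serverErrorString printString ].

lineCount := 0.
[
  [ file atEnd ] whileFalse: [
    | line |
    "nextLine answers the next line of the file; check whether your
     GemStone version includes the line terminator in the result."
    line := file nextLine.
    "Placeholder for real per-line processing: here we only count lines."
    lineCount := lineCount + 1 ] ]
  ensure: [ file close ].

lineCount
```

The ensure: block closes the file even if the processing code signals an error, which matters when many large files are processed in one session.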

Thanks, 


 



--
Mariano
http://marianopeck.wordpress.com
