Hi all,
I'm working on visualizations of an external dataset that contains 270k records. The best strategy seems to be bridging Pharo with SQLite, to keep requirements low, while using Roassal to visualize aggregated information obtained by querying the database. That worked fine for a while, but a few minutes ago my image started to crash, and eventually it crashed beyond being useful (it doesn't start anymore). I have attached the crash dump file, in case any of you can help me make sense of it and detect where the failure is. I imagine there is something wrong with the way I'm making the query. This is the suspect method:

====================
totalOffshoresByCountry
	"I load the offshores data from a SQLite database or csv file"
	| queryResults |
	self dataLocation exists
		ifFalse: [ self downloadData ]
		ifTrue: [
			database := (SQLiteConnection fileNamed: dataLocation fullName) open.
			queryResults := database executeQuery:
				'SELECT country_name, COUNT(*) AS "total_offshores"
				FROM (SELECT country_name, Description_
					FROM nodesNW INNER JOIN node_countriesNW
					ON nodesNW.Unique_ID = node_countriesNW.NODEID1
					ORDER BY country_name)
				GROUP BY country_name' ].
	^ queryResults
====================

I have started a new image, but I cannot remember which package provides SQLiteConnection. I imagine it is some migration of NBSQLite to UFFI, but I cannot find it in the Pharo catalog. Any help with this is welcome.

Cheers,

Offray

crash.dmp (48K)
On 12/04/16 22:44, Offray Vladimir Luna Cárdenas wrote:
> I'm working on visualizations of an external dataset which contains 270k
> records. So the best strategy seems to bridge Pharo with SQLite to keep
> requirements low while using Roassal to visualize aggregated information
> that is obtained from querying the database.

It won't fit in image?

Stephan
Hi,
On 12/04/16 16:51, Stephan Eggermont wrote:
> On 12/04/16 22:44, Offray Vladimir Luna Cárdenas wrote:
>> I'm working on visualizations of an external dataset which contains 270k
>> records. So the best strategy seems to bridge Pharo with SQLite to keep
>> requirements low while using Roassal to visualize aggregated information
>> that is obtained from querying the database.
>
> It won't fit in image?
>

I tried with RTTabTable and NeoCSV, but they cannot load the data. I made a test drawing 150k points and the image starts to lag, and querying the data in the image becomes inefficient compared to querying it in SQLite. For the moment I'll export the query results to CSV, but I hope to have the SQLite bridge working soon.

Offray
Smalltalk stack dump:

0xffc655d4 SqliteLibrary>close: 0x9726108: a(n) SqliteLibrary
0xffc655f4 SQLiteConnection>close 0xc672d08: a(n) SQLiteConnection
0xffc65610 SQLiteConnection>finalize 0xc672d08: a(n) SQLiteConnection
0xffc65630 WeakFinalizerItem>finalizeValues 0xc692b60: a(n) WeakFinalizerItem

The idea is obvious, and I did exactly the same for SqueakDBX (now OpenDBXDriver / Garage): register the connections in the WeakRegistry so that they are closed automatically, without the user having to do so manually. I remember a case where the user WAS already doing an explicit close, and it would crash very much like this when I tried to close a connection that was already closed. I don't have the SQLite library handy, but could it be something similar?

On Tue, Apr 12, 2016 at 9:49 PM, Offray Vladimir Luna Cárdenas <[hidden email]> wrote:
> Hi,
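The double-close scenario described above can be illustrated with a small Playground sketch. It does not touch the SQLite bridge at all: `isOpen`, `nativeCloseCount` and `closeOnce` are hypothetical names, and the counter only stands in for the real call into the native library. It shows one way a close could be made idempotent, assuming that a second native close is indeed what crashes the VM.

```smalltalk
"Sketch only: nativeCloseCount stands in for the real native close call."
| isOpen nativeCloseCount closeOnce |
isOpen := true.
nativeCloseCount := 0.
closeOnce := [
	isOpen ifTrue: [
		isOpen := false.
		nativeCloseCount := nativeCloseCount + 1 ] ].
closeOnce value.	"explicit close done by the user"
closeOnce value.	"later close triggered by finalization: does nothing"
nativeCloseCount	"inspect this: the stand-in close ran only once"
```

Whether the actual SQLiteConnection>close guards against an already-closed handle is not confirmed anywhere in this thread.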
In reply to this post by Offray Vladimir Luna Cárdenas-2
But you can have your objects in memory, and you do not have to blindly print everything or put everything in a table. With Moose we sometimes have 600K objects in an array.

I think that we need a cleverer stream to filter what we want to load.

On 13/4/16 02:49, Offray Vladimir Luna Cárdenas wrote:
> Hi,
>
> On 12/04/16 16:51, Stephan Eggermont wrote:
>> On 12/04/16 22:44, Offray Vladimir Luna Cárdenas wrote:
>>> I'm working on visualizations of an external dataset which contains 270k
>>> records. So the best strategy seems to bridge Pharo with SQLite to keep
>>> requirements low while using Roassal to visualize aggregated
>>> information
>>> that is obtained from querying the database.
>>
>> It won't fit in image?
>>
>
> I tried with RTTabTable and NeoCSV but they cannot load the data. I
> made a test drawing 150k points and the image starts to lag and trying
> to query the data becomes inefficient compared to querying the data on
> SQLite. For the moment I'll export the query results to CSV, but I
> hope to have the SQLite bridge working soon.
>
> Offray
Hi,
I don't want to blindly load/print *everything*, but I do want to swiftly query *anything*, which is usual in exploratory computing, especially in the first phases. That's where a proper interface between Pharo and SQLite is needed, and I would like to help in having it, at least by reporting the issues on Pharo 4 and now on Pharo 5. Meanwhile I will query and export results to CSV files, which is suboptimal, but lets me advance while I get more clues about what's not working or what I'm not doing well.

Cheers,

Offray

On 13/04/16 01:09, stepharo wrote:
> but you can have your objects in memory and you do not have to blindly
> print everything or put everything in a table.
> with moose sometimes we have 600K objects in an array.
>
> I think that we need more clever stream to filter what we want to load.
>
> On 13/4/16 02:49, Offray Vladimir Luna Cárdenas wrote:
>> Hi,
>>
>> On 12/04/16 16:51, Stephan Eggermont wrote:
>>> On 12/04/16 22:44, Offray Vladimir Luna Cárdenas wrote:
>>>> I'm working on visualizations of an external dataset which contains 270k
>>>> records. So the best strategy seems to bridge Pharo with SQLite to
>>>> keep
>>>> requirements low while using Roassal to visualize aggregated
>>>> information
>>>> that is obtained from querying the database.
>>>
>>> It won't fit in image?
>>>
>>
>> I tried with RTTabTable and NeoCSV but they cannot load the data. I
>> made a test drawing 150k points and the image starts to lag and
>> trying to query the data becomes inefficient compared to querying the
>> data on SQLite. For the moment I'll export the query results to CSV,
>> but I hope to have the SQLite bridge working soon.
>>
>> Offray
In reply to this post by Offray Vladimir Luna Cárdenas-2
Hi
I've recently been playing with medical provider data sets which are quite large, also around 270K records. I'm using a Moose image (Pharo5.0, latest update: #50643) on Mac OS X.

The initial issue I had was with the memory settings for the VM. These have been increased, and the image now ranges from 800MB to 1.3GB and has been fine. There have been occasional crashes/hangs, but these have to do with memory limits and GC. Typically this occurs when making class format changes to existing instances of data, e.g. new variables introduced to a working image with a large data set. To counter this I have a base image in which I update the code and then import the data (CSV for now) using NeoCSV. This process takes about 30 seconds, so it's not too painful.

The other issue I've come across is a slowdown when querying the data sets using the Playground. I profiled the code and found the culprit to be GLMTreeMorphModel>>explicitlySelectMultipleItems:, which is terribly slow as it iterates over the entire data set. I've made a modification to prevent the expensive iteration when there are more than 50000 records to be displayed, e.g. self roots size > 50000 ifTrue: [ ^ self ]. I'm also using

Anyway, the reason for this long-winded email is to hopefully provide some useful feedback, but more to thank everyone involved in building a powerful environment. I'd hate to name people, because I'm sure to miss most, but the efforts of people like Sven (Neo*, STON), Doru (Moose*), Avi (ROE, BTREE) are appreciated. I know there are a lot of hands behind the scenes to make Pharo, from the fast VM to the UI, so thanks to all.

Regards

Carlo

On 13 Apr 2016, at 2:49 AM, Offray Vladimir Luna Cárdenas <[hidden email]> wrote:
> Hi,
>
> On 12/04/16 16:51, Stephan Eggermont wrote:
>> On 12/04/16 22:44, Offray Vladimir Luna Cárdenas wrote:
>>> I'm working with visualizations a external dataset which contains 270k
>
> I tried with RTTabTable and NeoCSV but they cannot load the data. I made a test drawing 150k points and the image starts to lag and trying to query the data becomes inefficient compared to querying the data on SQLite. For the moment I'll export the query results to CSV, but I hope to have the SQLite bridge working soon.
>
> Offray
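For readers who want to try the CSV import step Carlo mentions, a minimal Playground sketch could look like the following. It assumes NeoCSV is loaded in the image; the two-column sample data and the variable names are made up for illustration and are not Carlo's actual import code.

```smalltalk
"Sketch only: assumes NeoCSV is loaded; the sample data is invented."
| csv reader records |
csv := 'country_name,total_offshores' , String lf ,
	'Panama,3' , String lf ,
	'Samoa,5'.
reader := NeoCSVReader on: csv readStream.
reader skipHeader.	"drop the column-name line"
records := reader upToEnd.	"one Array of strings per remaining row"
records
```

For a real file the reader would be created on a read stream over that file instead of an in-memory string.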
In reply to this post by Mariano Martinez Peck
Thanks Mariano,
So my suspicion was right: I was managing the SQLite connections improperly and not closing them explicitly. I could improve the code, but I can't find the package which provides SQLiteConnection (on Pharo 4 it was NBSQLite). How can I find/install it?

Cheers,

Offray

On 12/04/16 20:03, Mariano Martinez Peck wrote:
Torsten has given me the answer on how to install the new SQLite bridge for UFFI (with the proper disclaimer about being work in progress ;-) ).

	Gofer it
		smalltalkhubUser: 'TorstenBergmann' project: 'UDBC';
		configuration;
		load.
	(Smalltalk at: #ConfigurationOfUDBC) loadBleedingEdge

Cheers,

Offray

On 13/04/16 08:33, Offray Vladimir Luna Cárdenas wrote:
> Thanks Mariano,
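With a bridge loaded, the explicit close discussed earlier in the thread might be sketched as below. This is only an illustration: SQLiteConnection, fileNamed:, open, executeQuery: and close are the names that appear in this thread, the classes actually provided by UDBC may be named differently, and 'offshores.sqlite' is a hypothetical file name. The point is the ensure: block, which closes the connection exactly once even if the query fails.

```smalltalk
"Sketch only: class and selector names are taken from this thread and may
 differ in UDBC; 'offshores.sqlite' is a hypothetical file name."
| database queryResults |
database := (SQLiteConnection fileNamed: 'offshores.sqlite') open.
queryResults := [ database executeQuery: 'SELECT COUNT(*) FROM nodesNW' ]
	ensure: [ database close ].	"runs whether or not the query raises an error"
queryResults
```

If the query result is a lazy cursor, the rows would have to be read inside the first block, before the connection is closed. Whether the bridge's finalizer tolerates a connection that was already closed this way is not established in the thread.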