Hi everyone,
I have a large query that seems to be crashing my image. I'm running it against Postgres; it may be happening when I have a breakpoint in the code, but I'm not sure. Does anyone have any suggestions on handling large sets of data coming from the database into a Smalltalk image?

thanks,

David

--
CEO
SageTea Group
613 722 2091 x 5
http://sageteagroup.com
The PostgreSQL driver does not support streaming right now, so large queries come back all at once. The easiest workaround is to use the PostgreSQL OFFSET and LIMIT clauses to break the results up into chunks.
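For example, a paging loop along these lines pulls the rows one chunk at a time (a minimal sketch only: the #rowsForQuery: and #processRow: selectors are placeholders for however your application runs SQL and consumes rows, and the ORDER BY is there so the pages stay stable between queries):

    | pageSize offset rows |
    pageSize := 1000.
    offset := 0.
    [rows := session rowsForQuery:
            'SELECT * FROM foo ORDER BY id LIMIT ', pageSize printString,
            ' OFFSET ', offset printString.
     rows isEmpty not]
        whileTrue:
            [rows do: [:each | self processRow: each].
             offset := offset + pageSize]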
--
Alan Knight, Engineering Manager, Cincom Smalltalk
http://www.cincom.com/smalltalk
In reply to this post by david.long
Hi, David,
I don't know what "large" means for you, but I was able to load at least 80,000-100,000 rows of data into an image without problems, using Postgres. I only had to change the memory settings, or else I ran into memory limit problems. Obviously, I also had to wait a bit... This was with VW 7.6 and the Postgres parcel found in the 7.6 distribution.

Generally speaking, it is probably better to do it in steps in any case, for example using LIMIT and OFFSET.

ciao

Giorgio
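For reference, the memory ceiling of a VisualWorks image can be raised along these lines (a sketch only: the MemoryPolicy selector and the 512 MB figure are from memory and should be checked against the documentation for your version):

    "Let the image grow to roughly 512 MB before it refuses to allocate."
    ObjectMemory currentMemoryPolicy memoryUpperBound: 512 * 1024 * 1024.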
Ah, OK: so count the entries and use LIMIT and OFFSET to get them in smaller-sized chunks. Awesome, thanks, fellas.

David
I've found similar problems. I use cursors to chunk over large queries in Postgres. I built a simple object that has some of the same API as streams; it wraps the cursor and the query, and can then stream over the results. It's interesting that I can get the same number of rows into the image using a cursor to return a few (several thousand) at a time with no problem, but pulling them in in one fell swoop causes problems, even though in the end I still have all the values in the image in one collection no matter which way I get them.
Mike

Mike Hales
Engineering Manager
KnowledgeScape
www.kscape.com
Hi Mike,
yup - I have some rudimentary paging running now; it's cruising along and is past where it was blowing up. I might upload this to the public repository - no doubt there are a few people who can improve on this "wrapper", and it would be useful for the Postgres people out there.

David
In reply to this post by Alan Knight-2
Does anybody care to elaborate?
Are you suggesting issuing multiple queries in a form like

    SELECT * FROM foo LIMIT 100;

first, and then

    SELECT * FROM foo OFFSET 100 LIMIT 100;
    SELECT * FROM foo OFFSET 200 LIMIT 100;
    ...

(note that OFFSET counts from zero, so the second page starts at offset 100)? And, to ensure that your view of the DB stays constant, doing it inside a transaction?
In reply to this post by Mike Hales
Mike that is very interesting ! can you share it with others ?
In reply to this post by Mark Pirogovsky-3
Mark Pirogovsky wrote:
> Are you suggesting issuing multiple queries in a form like ... and, to ensure that your view of the DB stays constant, doing it inside a transaction?

Yes, that is pretty much right. That's the technique we use in WebVelocity for doing 'pages' of results from the db.

Michael
You can use a cursor to give stream-like behavior over a large query.
    DECLARE mycursor CURSOR FOR SELECT * FROM foo;

Then you can position using the MOVE command, or use the FETCH command:
    FETCH FORWARD 100 FROM mycursor;

This returns the next 100 rows (or up to the end, whichever is less) and leaves the cursor positioned correctly for the next fetch. You can see the docs at http://www.postgresql.org/docs/8.1/static/sql-fetch.html and http://www.postgresql.org/docs/8.1/static/sql-declare.html
This can be faster than separate SELECT statements, and it is meant for exactly this kind of thing. My little object wraps a cursor and a ReadStream. When initialized, it declares the cursor, performs the first fetch, and opens a ReadStream on the results. It has a stream-like API that calls into the real stream, but when the stream reaches its end it does another fetch; if that returns results, it replaces the stream with a new one on the new results and keeps going. When the stream is at its end and the fetch returns no more results, it answers nil and closes the cursor. It doesn't have the full stream API, but close enough that my trends, histograms and tables that use it can be passed either a ReadStream or my SQLCursor object and exercise the same API on both. I don't have problems blowing up the image with this technique even when the tables are really big.
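A minimal sketch of such a wrapper might look like the following (illustrative only: Mike's object is not published, and the #execute: selector stands in for however the session actually runs SQL and answers result rows):

    "SQLCursorStream instance variables: session cursorName fetchSize stream"

    setSession: aSession query: aSelectString fetchSize: anInteger
        "Declare the cursor and prime the first chunk of rows."
        session := aSession.
        cursorName := 'mycursor'.
        fetchSize := anInteger.
        session execute: 'DECLARE ', cursorName, ' CURSOR WITH HOLD FOR ', aSelectString.
        self fetchNextChunk

    fetchNextChunk
        "Pull the next chunk; drop the stream and close the cursor when exhausted."
        | rows |
        rows := session execute: 'FETCH FORWARD ', fetchSize printString, ' FROM ', cursorName.
        rows isEmpty
            ifTrue: [stream := nil. session execute: 'CLOSE ', cursorName]
            ifFalse: [stream := ReadStream on: rows]

    next
        "Answer the next row, or nil once all rows have been seen."
        stream isNil ifTrue: [^nil].
        stream atEnd ifTrue: [self fetchNextChunk].
        ^stream isNil ifTrue: [nil] ifFalse: [stream next]

In real use the cursor name would need to be unique per query, and as David notes below, WITH HOLD matters when each statement is committed separately.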
Mike

Mike Hales
Engineering Manager
KnowledgeScape
www.kscape.com
I gave this a quick run in the ad hoc query tool in VW - I had to add "WITH HOLD":

    DECLARE mycursor CURSOR WITH HOLD FOR SELECT * FROM foo;

(presumably because the tool commits each statement, and without WITH HOLD a cursor only exists inside its enclosing transaction). This will work quite nicely, I expect. Thanks, Mike.

David