Steve,
At 07:00 AM 9/6/2006, Steven Kelly wrote:
>I don't know about now, but at least in 2002 Omnibase didn't scale. It hit
>the performance problem inherent in VW Dictionary hashing, i.e. around
>10000 elements. I sent a message to David Gorisek, but never got a reply.
>I've attached the Excel graphs I made then, which show how serious the
>problem is. (I also have the tests I ran, but I haven't had time to check
>there's nothing confidential in the data I used.) It shouldn't be too hard
>to correct, but I don't know if it has been sorted out yet.

Could you share your test code with us? I can try it in a recent version of OmniBase.

Thanks,

M

[Attachment: omniBench.zip (61K)]
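A rough, workspace-style sketch of the kind of measurement being discussed (plain VisualWorks, no OmniBase involved, and not the attached omniBench code): it times successive batches of additions to an IdentityDictionary, which is where a knee around 10000 elements should show up if identity hashing is the bottleneck.

	"Rough sketch, not the attached omniBench code: time successive batches
	 of additions to an IdentityDictionary of fresh Objects and print the
	 per-batch cost."
	| dict batchSize times |
	dict := IdentityDictionary new.
	batchSize := 2000.
	times := OrderedCollection new.
	1 to: 10 do: [:batch |
		times add: (Time millisecondsToRun: [
			1 to: batchSize do: [:i | dict at: Object new put: i]])].
	1 to: times size do: [:batch |
		Transcript
			show: (batch * batchSize) printString , ' elements: ' ,
				(times at: batch) printString , ' ms';
			cr]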
At 04:33 PM 9/6/2006, David Gorisek wrote:
>Nothing was changed in this regard since 2002 so I presume the problem
>with large transactions in VW still exists.
>
>The workaround is simply not to use large transactions.

Thanks, David.

I'm also wondering about what is meant, exactly, by "large transactions".

If it means trying to read or update 10000 elements within a single transaction, that's one thing. If it means trying to read or update one element from a BTreeDictionary that contains 10000+ elements, that's another. The former is something I could imagine working around. The latter is a more serious problem, which doesn't seem quite as easy to work around.

Looking at Steve's graphs, I'm wondering if the performance problem isn't so much a function of Dictionary hashing, as it is a function of OrderedCollection>>insert:before:. The curve in the measurement of time-to-commit seems to follow the latter, not the former.

M
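For what it's worth, a crude way to compare the growth shapes of the two suspects without OmniBase is sketched below. It is only illustrative: the inner loop mimics the element shifting that OrderedCollection>>insert:before: does internally rather than calling it. The quadratic overall cost of that shifting is the kind of curve described above for time-to-commit, while the dictionary side should stay closer to linear until the identity hash range saturates.

	"Illustrative only: compare n additions to an IdentityDictionary with
	 n middle insertions into an Array, where each insertion shifts the
	 tail right by one slot (roughly what insert:before: does internally)."
	#(2000 4000 8000) do: [:n |
		| dict arr hashMs shiftMs |
		dict := IdentityDictionary new.
		hashMs := Time millisecondsToRun: [
			1 to: n do: [:i | dict at: Object new put: i]].
		arr := Array new: n.
		shiftMs := Time millisecondsToRun: [
			1 to: n do: [:size |
				| mid |
				mid := size // 2 + 1.
				size - 1 to: mid by: -1 do: [:k | arr at: k + 1 put: (arr at: k)].
				arr at: mid put: size]].
		Transcript
			show: n printString , ' -> dictionary: ' , hashMs printString ,
				' ms, shifting insert: ' , shiftMs printString , ' ms';
			cr]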
In reply to this post by Mark Roberts
The test code is attached. I don't really remember the details any more,
but at a quick glance it looks like the test methods are on the class side in the OmniBaseMCC parcel.

Steve

[Attachment: OmniBaseTests.zip (60K)]
In reply to this post by Mark Roberts
> >The workaround is simply not to use large transactions.
>
> I'm also wondering about what is meant, exactly, by "large transactions".
>
> If it means trying to read or update 10000 elements within a single
> transaction, that's one thing.
>
> If it means trying to read or update one element from a BTreeDictionary
> that contains 10000+ elements, that's another.

IIRC the problem occurs when the "working set" of the application is large. BTrees probably can avoid this, only reading the branches they need. But even if you don't read 10,000 objects in a single transaction, you may well build up a working set of 10,000 objects over several transactions. This is particularly the case when you're using the database as an OO persistency store, rather than like a relational database. OO persistency stores tend to load objects as needed, and there's no point flushing them if they are going to be used again. They also tend to be used in single user mode, and to work with a fair percentage of the total data set. Limiting yourself to 10,000 objects in that scenario isn't feasible.

> Looking at Steve's graphs, I'm wondering if the performance problem isn't
> so much a function of Dictionary hashing, as it is a function of
> OrderedCollection>>insert:before:.

It might be, but note there are three or four Dictionary methods there, so you should probably add the figures for those together. Also, solving the OrderedCollection problem is probably much easier than solving the dictionary problem. The dictionary can probably contain any kind of object, so you're reduced to using identityHash, and that simply has too small a range. Maybe adding the class info into the hash would help, as I mention in the "Add an instance variable in object class" thread on Friday (2006/09/01).

Steve
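To put a rough number on the identityHash point, a quick workspace check like the one below counts the distinct identityHash values seen over 20000 fresh objects. The second part is only a hypothetical illustration of the "mix the class into the hash" idea: the 14-bit shift is an assumption about the hash width, and it only spreads apart objects of different classes, not 10000 instances of one domain class.

	"Empirical check of the identityHash range: how many distinct values
	 show up over 20000 fresh objects?"
	| seen combinedHash |
	seen := Set new.
	20000 timesRepeat: [seen add: Object new identityHash].
	Transcript
		show: '20000 objects -> ' , seen size printString , ' distinct identityHash values';
		cr.
	"Hypothetical sketch of mixing the class into the hash; the 14-bit
	 shift is an assumed width, and this helps only across classes."
	combinedHash := [:obj | (obj class identityHash bitShift: 14) bitXor: obj identityHash].
	Transcript
		show: 'combined hash of 1 @ 2: ' , (combinedHash value: 1 @ 2) printString;
		cr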
Thank you all for the responses.

I'm considering the option of OmniBase now, as I'm not going to have long transactions. Although this is a simple option for persistence, mappers may help if I need to import data, in the shape of objects, from an old application. Does anyone use one of these mappers? I would like to hear some experiences.

Regards,
Esteban
I use Glorp quite a lot, but since I wrote much of it, it's not surprising that it generally works the way I want it to.
--
Alan Knight [|], Cincom Smalltalk Development
[hidden email]
[hidden email]
http://www.cincom.com/smalltalk

"The Static Typing Philosophy: Make it fast. Make it right. Make it run." - Niall Ross
In reply to this post by erobles-2
FWIW, I've got an app that uses Omnibase - I'm looking to get rid of
it in favor of a good old relational DB (postgres) with GLORP.