OmniBase Scaling, Was: RE: Antw: persistence


OmniBase Scaling, Was: RE: Antw: persistence

Mark Roberts
Steve,

At 07:00 AM 9/6/2006, Steven Kelly wrote:
>I don't know about now, but at least in 2002 Omnibase didn't scale. It hit
>the performance problem inherent in VW Dictionary hashing, i.e. around
>10000 elements. I sent a message to David Gorisek, but never got a reply.
>I've attached the Excel graphs I made then, which show how serious the
>problem is. (I also have the tests I ran, but I haven't had time to check
>there's nothing confidential in the data I used.) It shouldn't be too hard
>to correct, but I don't know if it has been sorted out yet.

Could you share your test code with us? I can try it in a recent version of
OmniBase.

Thanks,

M

omniBench.zip (61K) Download Attachment

Re: OmniBase Scaling, Was: RE: Antw: persistence

Mark Roberts
At 04:33 PM 9/6/2006, David Gorisek wrote:
>Nothing was changed in this regard since 2002 so I presume the problem
>with large transactions in VW still exists.
>
>The workaround is simply not to use large transactions.

Thanks, David.

I'm also wondering about what is meant, exactly, by "large transactions".

If it means trying to read or update 10000 elements within a single
transaction, that's one thing.

If it means trying to read or update one element from a BTreeDictionary
that contains 10000+ elements, that's another.

The former is something I could imagine working around. The latter is a
more serious problem, which doesn't seem quite as easy to work around.
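
To make the distinction concrete, here is roughly what I have in mind. This
is only a sketch from memory of the OmniBase API, so the selectors may not
be exact, and Book is just a stand-in class:

  db := OmniBase openOn: 'testDB'.

  "Case 1: one large transaction that writes 10000 objects."
  [ | books |
    books := OmniBase root at: 'books' put: OmniBase newBTreeDictionary.
    1 to: 10000 do: [:i |
        books at: i printString put: (Book new title: i printString)] ]
      evaluateAndCommitIn: db newTransaction.

  "Case 2: a small transaction that reads one element from a
   BTreeDictionary which already holds 10000+ entries."
  [ (OmniBase root at: 'books') at: '42' ]
      evaluateAndCommitIn: db newTransaction.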

Looking at Steve's graphs, I'm wondering if the performance problem isn't
so much a function of Dictionary hashing, as it is a function of
OrderedCollection>>insert:before:.

The curve in the measurement of time-to-commit seems to follow the latter,
not the former.
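
For what it's worth, the quadratic shape is easy to reproduce in a workspace
with a plain OrderedCollection. This is just an untested sketch, nothing
OmniBase-specific, and your image may spell the insertion selector
differently:

  "Each add:beforeIndex: 1 shifts every existing element to open a slot,
   so the total time for n inserts grows roughly with n squared."
  #(1000 2000 4000 8000 16000) collect: [:n |
    | coll |
    coll := OrderedCollection new.
    n -> (Time millisecondsToRun:
        [1 to: n do: [:i | coll add: i beforeIndex: 1]])]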

M


RE: OmniBase Scaling, Was: RE: Antw: persistence

Steven Kelly
In reply to this post by Mark Roberts
The test code is attached. I don't really remember the details any more,
but at a quick glance it looks like the test methods are on the class
side in the OmniBaseMCC parcel.

Steve


OmniBaseTests.zip (60K) Download Attachment

RE: OmniBase Scaling, Was: RE: Antw: persistence

Steven Kelly
In reply to this post by Mark Roberts
> > The workaround is simply not to use large transactions.
>
> I'm also wondering about what is meant, exactly, by "large transactions".
>
> If it means trying to read or update 10000 elements within a single
> transaction, that's one thing.
>
> If it means trying to read or update one element from a BTreeDictionary
> that contains 10000+ elements, that's another.

IIRC the problem occurs when the "working set" of the application is
large. BTrees can probably avoid this by reading only the branches they
need. But even if you don't read 10,000 objects in a single transaction,
you may well build up a working set of 10,000 objects over several
transactions. This is particularly the case when you're using the
database as an OO persistence store rather than like a relational
database. OO persistence stores tend to load objects as needed, and
there's no point flushing them if they are going to be used again. They
also tend to be used in single-user mode, and to work with a fair
percentage of the total data set. Limiting yourself to 10,000 objects in
that scenario isn't feasible.

> Looking at Steve's graphs, I'm wondering if the performance problem isn't
> so much a function of Dictionary hashing, as it is a function of
> OrderedCollection>>insert:before:.

It might be, but note there are three or four Dictionary methods there,
so you should probably add the figures for those together. Also, solving
the OrderedCollection problem is probably much easier than solving the
dictionary problem. The dictionary can probably contain any kind of
object, so you're reduced to using identityHash, and that simply has too
small a range. Maybe adding the class info into the hash would help, as
I mention in the "Add an instance variable in object class" thread on
Friday (2006/09/01).
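
Something along these lines is what I had in mind - just a sketch of the
idea (untested, selector name invented, and assuming the usual 14-bit
identityHash range in VW):

  Object>>extendedIdentityHash
      "Mix the class's identity hash into the instance's identity hash so
       that a dictionary holding keys of mixed classes collides less often.
       Plain identityHash has only about 16K possible values, which is why
       things start to crawl around 10000 entries. This doesn't help when
       all the keys are instances of the same class, though."
      ^(self class identityHash bitShift: 14) bitXor: self identityHash

Whatever dictionary is doing the hashing would of course have to be changed
to send that instead of identityHash.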

Steve


RE: OmniBase Scaling, Was: RE: Antw: persistence

erobles-2
Thank you all for the responses.
I'm now considering OmniBase, as I'm not going to have long
transactions. Although it is a simple option for persistence, a mapper
may help if I need to import data, in the form of objects, from an old
application. Does anyone use one of these mappers? I would like to hear
about some experiences.
Regards
Esteban

----------------------------------------------------------------
This message was sent using IMP from LIFIA.


RE: OmniBase Scaling, Was: RE: Antw: persistence

Alan Knight-2
I use Glorp quite a lot, but since I wrote much of it, it's not surprising that it generally works the way I want it to.


--
Alan Knight [|], Cincom Smalltalk Development
[hidden email]
[hidden email]
http://www.cincom.com/smalltalk

"The Static Typing Philosophy: Make it fast. Make it right. Make it run." - Niall Ross


Re: OmniBase Scaling, Was: RE: Antw: persistence

tblanchard
In reply to this post by erobles-2
FWIW, I've got an app that uses OmniBase - I'm looking to get rid of it
in favor of a good old relational DB (Postgres) with GLORP.
