Persisting IdentityDictionary

Persisting IdentityDictionary

Elliot Finley
Chris or whoever may be able to answer,

I'm in the process of evaluating Aida/Web.  Part of the evaluation is
obviously the different persistence strategies.  Its default strategy
is to use the image and just take a snapshot every hour.  That works
for some people but I really don't feel comfortable with that strategy
for many reasons I won't go into.

I would really like to use Magma to persist my object graph.  At first
glance it looks simple, just use Magma like normal and persist the
graph.  But upon closer inspection, it turns out that one of the
features that makes Aida/Web so appealing and so easy to use also
makes it difficult (I think, I hope I'm wrong) to persist with
anything other than the image or Gemstone.

Aida/Web keeps two dictionaries:

URL -> Object (Dictionary)
Object -> URL (IdentityDictionary)

If these aren't persisted in an external database, then they'll keep a
reference to every object used in the web application in your image,
thus negating the benefits of using an external database in the first
place.  The Dictionary is easy.  The IdentityDictionary is using the
Object's identityHash as the key and I don't see any easy way to make
that work.

So the question boils down to - Is there a way to persist an
IdentityHash with Magma?  If not, are there any workarounds that you
can think of?

Any help would be appreciated.

Thanks,
Elliot
_______________________________________________
Magma mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/magma

Re: Persisting IdentityDictionary

Chris Muller-3
On Mon, Apr 18, 2011 at 9:29 PM, Elliot Finley <[hidden email]> wrote:

> Chris or whoever may be able to answer,
>
> I'm in the process of evaluating Aida/Web.  Part of the evaluation is
> obviously the different persistence strategies.  Its default strategy
> is to use the image and just take a snapshot every hour.  That works
> for some people but I really don't feel comfortable with that strategy
> for many reasons I won't go into.
>
> I would really like to use Magma to persist my object graph.  At first
> glance it looks simple, just use Magma like normal and persist the
> graph.  But upon closer inspection, it turns out that one of the
> features that makes Aida/Web so appealing and so easy to use also
> makes it difficult (I think, I hope I'm wrong) to persist with
> anything other than the image or Gemstone.
>
> Aida/Web keeps two dictionaries:
>
> URL -> Object (Dictionary)
> Object -> URL (IdentityDictionary)
>
> If these aren't persisted in an external database, then they'll keep a
> reference to every object used in the web application in your image,
> thus negating the benefits of using an external database in the first
> place.

I do not care for the design choice, made by several frameworks, of
using global variables.  That's just my taste, though, and it's not a
big problem for Magma; you can just put your root object as a key in
the global dictionary.  Given these axioms:

  - if the model is small, it'll fit into memory; ACID db properties
are retained.
  - if the model is large, you might be using large-capable
objects like MagmaCollections, which page out objects.
  - if your model is large and rich (e.g., not using MagmaCollections)
then you have #stubOut: to trim off the branches.

Beyond that, the idea of Squeak is that it's a modifiable system.
This particular aspect of Aida would be easy to improve on if none of
the above is satisfactory.  For example, instead of a global
Dictionary, an intermediate Broker or Listener could solve this issue.
Contribute it back to Janko.

>  The Dictionary is easy.  The IdentityDictionary is using the
> Object's identityHash as the key and I don't see any easy way to make
> that work.

There's nothing you need to do.  IdentityDictionaries work fine.
Magma maintains its own identity "oid" for each object that is totally
independent of the Squeak-image identity, which is what
IdentityDictionary is based on.

If you're curious, just look at the "*ma object serialization" methods
of Dictionary and you'll understand why it works: it simply uses
#at:put: to materialize the IdentityDictionary.
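
As a rough illustration of why rebuilding with #at:put: is enough, here
is a workspace sketch that uses only standard Squeak classes.  It is not
Magma's actual serialization code: the copy merely stands in for an
object freshly materialized in another image session, and the URL string
is made up.

```smalltalk
| savedKey loadedKey original rebuilt |
savedKey := Object new.
original := IdentityDictionary new.
original at: savedKey put: '/object/o1.html'.    "hypothetical URL"

"Stand-in for materialization: same logical object, new in-image identity."
loadedKey := savedKey copy.

"Re-adding with #at:put: files the entry under the key's current identityHash."
rebuilt := IdentityDictionary new.
rebuilt at: loadedKey put: (original at: savedKey).

Transcript show: (rebuilt at: loadedKey); cr.
Transcript show: (rebuilt includesKey: savedKey) printString; cr.
```

The rebuilt dictionary finds the entry through the newly loaded key and
does not contain the old one, so nothing depends on the identityHash the
key had when the dictionary was first filled.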

 - Chris

Re: Persisting IdentityDictionary

Elliot Finley
On Tue, Apr 19, 2011 at 6:23 PM, Chris Muller <[hidden email]> wrote:

> On Mon, Apr 18, 2011 at 9:29 PM, Elliot Finley <[hidden email]> wrote:
>>
>> Aida/Web keeps two dictionaries:
>>
>> URL -> Object (Dictionary)
>> Object -> URL (IdentityDictionary)
>>
>> If these aren't persisted in an external database, then they'll keep a
>> reference to every object used in the web application in your image,
>> thus negating the benefits of using an external database in the first
>> place.
>
> I do not care for the particular design choice of several frameworks
> using global variables.  That's just my taste, it's not a big problem
> for Magma; you can just put your root object as a key in the global
> dictionary.  Given these axioms:

In Aida's (Janko's) defense, these are not true global variables.
They are instance variables in a Class called URLResolver that is used
by the WebRouter.  But they do end up holding a reference to every
object in the model.

>  - if the model is small, it'll fit into memory; ACID db properties
> are retained.
>  - if the model is large, you might be using large-capable
> objects like MagmaCollections, which page out objects.

So my next question is:  Do you have any plans for a MagmaIdentityDictionary? :)

Basically I need a MagmaCollection where I can use the object as the
key so I don't have to keep everything in memory.

>  - if your model is large and rich (e.g., not using MagmaCollections)
> then you have #stubOut: to trim off the branches.
>
> Beyond that, the idea of Squeak is that it's a modifiable system.
> This particular aspect of Aida would be easy to improve on if none of
> the above is satisfactory.  For example, instead of a global
> Dictionary, an intermediate Broker or Listener could solve this issue.
> Contribute it back to Janko.

Would you be kind enough to expand on the intermediate Broker/Listener approach?

>>  The Dictionary is easy.  The IdentityDictionary is using the
>> Object's identityHash as the key and I don't see any easy way to make
>> that work.
>
> There's nothing you need to do.  IdentityDictionaries work fine.
> Magma maintains its own identity "oid" for each object that is totally
> independent of the Squeak-image identity, which is what
> IdentityDictionary is based on.
>
> If you're curious, just look at the "*ma object serialization" methods
> of Dictionary, you'll understand why it works; it simply uses #at:put:
> to materialize the IdentityDictionary.

This is good to know.  I appreciate you taking the time to explain it.

Thanks,
Elliot

Re: Persisting IdentityDictionary

Chris Muller-3
> So my next question is:  Do you have any plans for a MagmaIdentityDictionary? :)
>
> Basically I need a MagmaCollection where I can use the object as the
> key so I don't have to keep everything in memory.

Well, MagmaCollections themselves are already identity-based; as in,
#includes: anObject.

But you should explain the larger picture of what you're trying to
solve.  What do you need a huge IdentityDictionary for?

>> Beyond that, the idea of Squeak is that it's a modifiable system.
>> This particular aspect of Aida would be easy to improve on if none of
>> the above is satisfactory.  For example, instead of a global
>> Dictionary, an intermediate Broker or Listener could solve this issue.
>> Contribute it back to Janko.
>
> Would you be kind enough to expand on the intermediate Broker/Listener approach?

No.  The point is sufficient: if Aida is forcing a Dictionary API on
you for persistence, you can introduce a layer of indirection that
fixes that.  Whatever design pattern you choose is up to you.

Re: Persisting IdentityDictionary

Elliot Finley
On Wed, Apr 20, 2011 at 10:28 AM, Chris Muller <[hidden email]> wrote:

>> So my next question is:  Do you have any plans for a MagmaIdentityDictionary? :)
>>
>> Basically I need a MagmaCollection where I can use the object as the
>> key so I don't have to keep everything in memory.
>
> Well, MagmaCollections themselves are already identity-based; as in,
> #includes: anObject.
>
> But you should explain the larger picture of what you're trying to
> solve.  What do you need a huge IdentityDictionary for?

Aida/Web uses a huge IdentityDictionary to keep a stable obj -> url
relationship and a huge Dictionary to keep a stable url -> obj
relationship.  This is how it accomplishes one of its nicest features
- simple obj to obj refs in code translate to urls in the webpage.

So when I say:

e addLinkTo: address text: address lastName.

it checks the IdentityDictionary to see what URL was used for
'address' last time, and it uses it again.  If this is the first time,
then it creates the entries in the dictionaries.  There are also some
checks to make sure duplicate URLs aren't used, etc... but that's the
gist of it.

If I were to subclass MagmaPreallocatedDictionary and change:

MagmaPreallocatedDictionary>>keyHash: key
     ^ (key hash \\ maxBuckets) + 1

to this:

MagmaPreallocatedDictionary>>keyHash: key
     ^ (key aSelectorThatAnswersMagmaOID \\ maxBuckets) + 1

I would essentially have a MagmaPreallocatedIdentityDictionary, would
I not?  The only problem is, looking through the Magma code, it's not
immediately obvious how to implement aSelectorThatAnswersMagmaOID.  Is
there a simple way to get that?

Thanks,
Elliot

Re: Persisting IdentityDictionary

Chris Muller-4
Thanks for the explanation.

> Aida/Web uses a huge IdentityDictionary to keep a stable obj -> url
> relationship and a huge Dictionary to keep a stable url -> obj
> relationship.  This is how it accomplishes one of its nicest features
> - simple obj to obj refs in code translate to urls in the webpage.
>
> so when I say:
>
> e addLinkTo: address text: address lastName.
>
> it checks the IdentityDictionary to see what URL was used for
> 'address' last time, and it uses it again.  If this is the first time,
> then it creates the entries in the dictionaries.  There are also some
> checks to make sure duplicate URLs aren't used, etc... but that's the
> gist of it.

Ok.  I still don't understand why an IdentityDictionary would be used
in this case; maybe Janko knows.  What if the user supplied a new,
equivalent-but-not-identical address?  Objects that don't override #=
will be found by identity anyway in a standard Dictionary, and objects
that do override #= WANT to be found by equivalence, not identity.

So, I'm still confused as to why Aida uses an IdentityDictionary in
this case, but nevertheless I see no reason Magma can't or
shouldn't support it.
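
The difference between the two lookups can be seen in a workspace with
plain Squeak classes.  This is only a sketch: the strings stand in for
address objects that override #=, and the URL values are made up.

```smalltalk
| a b plain byEquality byIdentity viaEquality viaIdentity viaPlain |
a := 'Main Street 1' copy.
b := 'Main Street 1' copy.     "a = b, but a ~~ b"
plain := Object new.           "does not override #="

byEquality := Dictionary new.
byIdentity := IdentityDictionary new.
byEquality at: a put: 'url-a'; at: plain put: 'url-plain'.
byIdentity at: a put: 'url-a'.

"Dictionary compares keys with #=, so the equal copy finds the entry."
viaEquality := byEquality at: b ifAbsent: ['absent'].

"IdentityDictionary compares keys with #==, so the equal copy is absent."
viaIdentity := byIdentity at: b ifAbsent: ['absent'].

"An object with the default #= is found by identity in either kind."
viaPlain := byEquality at: plain ifAbsent: ['absent'].
```

Here viaEquality is 'url-a', viaIdentity is 'absent', and viaPlain is
'url-plain'.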

> If I were to subclass MagmaPreallocatedDictionary and change:
>
> MagmaPreallocatedDictionary>>keyHash: key
>     ^ (key hash \\ maxBuckets) + 1
>
> to this:
>
> MagmaPreallocatedDictionary>>keyHash: key
>     ^ (key aSelectorThatAnswersMagmaOID \\ maxBuckets) + 1

aSelectorThatAnswersMagmaOID is already implemented, it's called
#magmaOid, implemented on Object.

But that won't work, because when you have new, uncommitted objects
that you attach to the persistent model (before commit), they will not
have a magmaOid yet (it would be nil).

The way to do it would be to do the same thing the standard
IdentityDictionary does when it overrides Dictionary:  Override
MagmaPreallocatedDictionary>>#at: key ifAbsent: aBlock with:

1 | bucket |
2 bucket := self bucketAt: (self keyHash: key).
3 [bucket notNil] whileTrue: [ bucket key == key ifTrue: [^ bucket value]. bucket := bucket next ].
4 ^ aBlock value

Note the change on line 3, to an identity compare.

You would also want to check all the other methods to make similar
change:  A quick run-through, I see #at:ifAbsentPut:, #at:put:,
#removeKey:, would all need similar change.  But I believe that should
be all that's needed to have a MagmaPreallocatedIdentityDictionary.

Please let us know if you get it working.

 - Chris

Re: Persisting IdentityDictionary

Elliot Finley
On Tue, Apr 26, 2011 at 9:12 AM, Chris Muller <[hidden email]> wrote:

>> Aida/Web uses a huge IdentityDictionary to keep a stable obj -> url
>> relationship and a huge Dictionary to keep a stable url -> obj
>> relationship.  This is how it accomplishes one of its nicest features
>> - simple obj to obj refs in code translate to urls in the webpage.
>>
>> so when I say:
>>
>> e addLinkTo: address text: address lastName.
>>
>> it checks the IdentityDictionary to see what URL was used for
>> 'address' last time, and it uses it again.  If this is the first time,
>> then it creates the entries in the dictionaries.  There are also some
>> checks to make sure duplicate URLs aren't used, etc... but that's the
>> gist of it.
>
> Ok.  I still don't understand why an IdentityDictionary would be used
> in this case; maybe Janko knows.  1) What if the user supplied a new,
> equivalent-but-not-identical address?  Objects that don't override #=
> will be found by identity anyway in a standard Dictionary, and objects
> that do override #= WANT to be found by equivalence, not identity.

If you read sections 5, 5.1 and 5.2 of the Aida Tutorial, this becomes clear.
The link is http://www.aidaweb.si/tutorial.html#h-10 and it's less
than a single page of reading.  Basically if Aida generates a URL like
/object/o4899057.html for a specific object, then forever after if the
user enters that URL, then we want THAT SPECIFIC object to be
accessed.  I don't see any way to do it other than identity.

> So, I'm still confused as to the "why" Aida uses an IdentityDictionary
> in this case but nevertheless, I see no reason Magma can't or
> shouldn't support it.
>
>> If I were to subclass MagmaPreallocatedDictionary and change:
>>
>> MagmaPreallocatedDictionary>>keyHash: key
>>     ^ (key hash \\ maxBuckets) + 1
>>
>> to this:
>>
>> MagmaPreallocatedDictionary>>keyHash: key
>>     ^ (key aSelectorThatAnswersMagmaOID \\ maxBuckets) + 1
>
> aSelectorThatAnswersMagmaOID is already implemented, it's called
> #magmaOid, implemented on Object.
>
> But that won't work, because when you have new, uncommitted objects
> that you attach to the persistent model (before commit), they will not
> have a magmaOid yet (it would be nil).
>
> The way to do it would be to do the same thing that the standard
> IdentityDictionary overrides Dictionary:  Override
> MagmaPreallocatedDictionary>>#at: key ifAbsent: aBlock with:
>
> 1       | bucket |
> 2       bucket := self bucketAt: (self keyHash: key).
> 3       [bucket notNil] whileTrue: [ bucket key == key ifTrue: [^ bucket
> value]. bucket := bucket next ].
> 4       ^ aBlock value
>
> Note the change on line 3, to an identity compare.

This is in addition to overriding
MagmaPreallocatedDictionary>>keyHash: key - right?  From the Magma
documentation:

-=-=-
It is absolutely _essential_ that you do not allow any identityHash to
be part of your hash-calculation. Otherwise, it won't work;
identityHash'es differ between image sessions.
-=-=-

This is why I was asking about using the MagmaOID.  I think that is
the only thing that stays stable between image sessions and across
different images.  Thus it would seem to be the only thing you could
base an IdentityDictionary on.

You're using 'self keyHash: key' to find a specific bucket that should
contain that object.  But 'self keyHash: key' may (and probably will)
answer a different number for the same object each time it is
materialized from the database unless it is somehow using the MagmaOID
in place of the identityHash.
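
To make the concern concrete, here is a small workspace sketch.  The
bucket count and the oid are made-up numbers, the copy only stands in
for the same object seen in a later image session, and the oid is
assumed to be an integer.

```smalltalk
| maxBuckets bucketFor anObject reloaded oid |
maxBuckets := 1024.                                "hypothetical bucket count"
bucketFor := [:hash | (hash \\ maxBuckets) + 1].   "same shape as #keyHash: above"

anObject := Object new.
reloaded := anObject copy.     "stand-in for the same object in a later session"

"Buckets derived from identityHash: nothing ties these two numbers together."
Transcript show: (bucketFor value: anObject identityHash) printString; cr.
Transcript show: (bucketFor value: reloaded identityHash) printString; cr.

"A bucket derived from a persistent integer id is the same every time."
oid := 4899057.                "made-up id, borrowed from the example URL"
Transcript show: (bucketFor value: oid) printString; cr.
```

The first two lines printed may or may not agree, which is exactly the
problem; the third is the same on every run.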

Thanks,
Elliot

P.S.  I appreciate your time on this.  Hopefully I can become familiar
enough with Magma that I'll be able to answer some questions on the
list rather than just ask them.

Re: Persisting IdentityDictionary

Chris Muller-4
> You're using 'self keyHash: key' to find a specific bucket that should
> contain that object.  But 'self keyHash: key' may (and probably will)
> answer a different number for the same object each time it is
> materialized from the database unless it is somehow using the MagmaOID
> in place of the identityHash.

Right, you definitely cannot use Squeak's #identityHash in the equation.

Hm, let me think about this for a few days; I think an easy solution will appear.

 - Chris

Fwd: Persisting IdentityDictionary

Elliot Finley
Oops, forgot to reply-all to get it to the list.  Here it is.

On Tue, Apr 26, 2011 at 8:39 PM, Chris Muller <[hidden email]> wrote:
>> You're using 'self keyHash: key' to find a specific bucket that should
>> contain that object.  But 'self keyHash: key' may (and probably will)
>> answer a different number for the same object each time it is
>> materialized from the database unless it is somehow using the MagmaOID
>> in place of the identityHash.
>
> Right, you definitely cannot use Squeak's #identityHash in the equation.
>
> Hm, let me think about this for a few days; I think an easy solution will appear.

Maybe:

Use magmaOid as the keyHash.

When finding a key (#at:), check whether the key has a magmaOid first;
if not, 'self errorKeyNotFound: key', because it doesn't exist in the
IdentityDictionary anyway.

When storing a value (#at:put:), before finding its bucket, check
whether the key has a magmaOid; if not, store the key in the database
so it will have one.  I'm not sure what the code looks like for this
last one, but it sounds good :)

Re: Persisting IdentityDictionary

Chris Muller-3
> when finding a key (#at), check if key has magmaOid first, if not,
> 'self errorKeyNotFound: key' because it doesn't exist in
> IdentityDictionary anyway.

That won't work for objects that have been added to the dictionary
but not yet committed.

> when storing a value (#at: put:); before finding its bucket, check if
> key has magmaOid, if not, store key in database so it will have a
> magmaOid.  I'm not sure what the code looks like on this last one, but
> it sounds good :)

This won't work because you may already be in a transaction when you
send #at:put: but not yet be ready to commit.  I'm not sure if you
meant to use #commit to "store key in database".

But you are evaluating exactly the right kinds of ideas, way to go.

With Magma's networking layer ("Ma client server", which is used
independently in other apps, without Magma, BTW), it is very easy to
define new types of requests and responses; they're all first-class
objects.  So I'm thinking of a new request-type that would be an
oid-reservation for any uncommitted object so that, if that object is
ever committed, that's the oid it will get.

The only downside to this is that the whole purpose of
MagmaPreallocatedDictionary is to be very fast, so we don't want to
require an extra trip to the server for each and every at:put:.  So
requests for oid reservations should bring back 100 at a time, and
sessions can then dole them out one by one as they're requested.  We
have plenty of oid space, there is no concern about wasting a few oids
if not all of them get used.
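
The batching idea could be sketched in a workspace roughly like this.
Nothing here is Magma code: the reserveBatch block is a hypothetical
stand-in for the new request type, and the starting oid is made up.

```smalltalk
| nextServerOid roundTrips reserveBatch pool nextOid first hundredAndFirst |
nextServerOid := 1000.    "made-up starting oid"
roundTrips := 0.

"Stand-in for the proposed server request: reserve n consecutive oids."
reserveBatch := [:n | | start |
    roundTrips := roundTrips + 1.
    start := nextServerOid.
    nextServerOid := nextServerOid + n.
    start to: start + n - 1].

"Session side: hand out reserved oids locally, refilling 100 at a time."
pool := OrderedCollection new.
nextOid := [
    pool isEmpty ifTrue: [pool addAll: (reserveBatch value: 100)].
    pool removeFirst].

first := nextOid value.
99 timesRepeat: [nextOid value].     "100 oids handed out, one round trip"
hundredAndFirst := nextOid value.    "pool is empty, so a second round trip"
```

After this runs, roundTrips is 2 for 101 oids handed out, so the extra
server trip is paid once per hundred #at:put: sends rather than on each.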

 - Chris

Re: Persisting IdentityDictionary

Elliot Finley
On Wed, Apr 27, 2011 at 5:02 PM, Chris Muller <[hidden email]> wrote:

> With Magma's networking layer ("Ma client server", which is used
> independently in other apps, without Magma, BTW), it is very easy to
> define new types of requests and responses; they're all first-class
> objects.  So I'm thinking of a new request-type that would be an
> oid-reservation for any uncommitted object so that, if that object is
> ever committed, that's the oid it will get.
>
> The only downside to this is that the whole purpose of
> MagmaPreallocatedDictionary is to be very fast, so we don't want to
> require an extra trip to the server for each and every at:put:.  So
> requests for oid reservations should bring back 100 at a time, and
> sessions can then dole them out one by one as they're requested.  We
> have plenty of oid space, there is no concern about wasting a few oids
> if not all of them get used.

If I were to see a visual diagram of how all the parts of Magma fit
together, I could come up to speed faster.  Coding this is a little
beyond me at the moment.

Elliot