[7.7.1] Store/Glorp Cache bug & fix & bug

[7.7.1] Store/Glorp Cache bug & fix & bug

Reinout Heeck-2
  Hi Alan, hi all;
    we ran into a bug with the Glorp cache.


Circumstances:
   We are using our automated build tools to load ~1000 packages in one
go (through the implied prerequisite tree of a top-level package).

Symptoms:
   Near the end of the load the image being built becomes unresponsive,
and the VM fully loads one CPU core.
After a couple of days of hackery to get 'between' this process, we found
that it is #innerFinalizationLoop that always seems to be busy with
#fixCollisionsFrom: in Glorp's EphemeralValueDictionary.

Analysis:
The Glorp cache uses a dictionary whose keys are often consecutive
integers (they are the #primaryKey of some table, typically doled out
by the DB through a Sequence).
Dictionaries and consecutive integer keys don't mix well in VW. What we
saw was legitimate GC activity (#mourn-ing Ephemerons), but the order in
which this mourning happens means that the EphemeralValueDictionary
spends a huge amount of time in #fixCollisionsFrom:, essentially because
there happen to be many collisions.
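For illustration (hypothetical workspace code; the table length of 64 is
chosen arbitrarily, and we assume VW's SmallInteger hash is close to the
identity, which matches what we observed): consecutive keys then map to
consecutive slots, forming one contiguous probe chain that
#fixCollisionsFrom: has to walk on every removal.

     | length |
     length := 64.
     (100 to: 110) collect: [:key | key hash \\ length + 1]
     "-> consecutive slot indices under that assumption,
     i.e. one long cluster"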


Fix:
We specialized #initialIndexFor:boundedBy: on EphemeralValueDictionary
so it spreads out entries that happen to have consecutive integer keys.
This way the dictionary should contain far fewer hash clashes.

initialIndexFor: aHashValue boundedBy: length
     "Find the place where we should start the search.
     Optimize for consecutive integers as keys (generated from a
     Database Sequence)."

     | gap |
     gap := 7.
     ^aHashValue * gap \\ length + 1


We tried several small prime numbers for 'gap' and found that 7 gave us
the shortest build time.
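To see the spreading effect in a workspace (hypothetical code; a table
length of 64 is chosen arbitrarily):

     | length gap |
     length := 64.
     gap := 7.
     (100 to: 105) collect: [:key | key * gap \\ length + 1]
     "-> #(61 4 11 18 25 32): neighboring keys now land 7 slots
     apart instead of adjacent, so collision runs stay short"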

During our investigation we also found another potential bug in
Glorp.Cache>>

mournKeyOf: anEphemeron
     policy takeExpiryActionForKey: anEphemeron key
          withValue: anEphemeron value
          in: self


Here we see code mourning an ephemeron, but unpacking it into hard(!)
references to both its key and its value.
During normal operation this will probably not be a problem (because
these references live only in temps and arguments), but if a GC kicks in
exactly during this code, the references will be counted as hard,
unexpectedly reviving one of the weakly referenced objects. This can
happen at any time, even after the EphemeralValue has been marked as
unrevivable by setting its manager to nil!

Suggested fix:
   For VW, create an alternative code path to
#takeExpiryActionForKey:withValue:in: that accepts an EphemeralValue
instead of separate key and object references, so that the #value object
of the EphemeralValue is never referenced unless you really want to
#revive it.
Nota bene: this also holds for some inherited behavior; e.g.
#privateRemoveKey:ifAbsent: should be specialized *not* to return (or
create) a hard reference to the #value of the EphemeralValue being removed.
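A minimal sketch of what we mean (the selectors #takeExpiryActionFor:in:
and #privateRemoveEphemeron: are hypothetical names, not existing Glorp
API):

mournKeyOf: anEphemeron
     "Pass the ephemeron through whole: no temp or argument ever
     holds a hard reference to its key or value."
     policy takeExpiryActionFor: anEphemeron in: self

takeExpiryActionFor: anEphemeron in: aCache
     "On the policy: expire without touching #value; a subclass that
     really wants to #revive may send #key/#value deliberately."
     aCache privateRemoveEphemeron: anEphemeron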



HTH,

Reinout Heeck

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Re: [7.7.1] Store/Glorp Cache bug & fix & bug

Alan Knight-2
One thing you can do to avoid this is to set the CachePolicy of the GlorpSession you're using so it does not use a weak dictionary. That's something we've been looking at internally, although more in the publishing area than in loading. That would be:
 session system cachePolicy: CachePolicy new.

Another option would be to use a different session for loading prerequisites. That's perhaps something we ought to look at, so we don't overload the one session.

Otherwise, I suspect you'd have to modify a bunch of code. This isn't so much a problem with Glorp as that VisualWorks dictionaries really don't like long runs of consecutive integer keys, especially runs that start low. That's something it would be nice to do something about, but in the Store context it means you're probably seeing this because this is a brand-new database.

So, something like

     d := Dictionary new.
     (1 to: 1000) do: [:each | d at: each put: each].

runs fast; but if you make it 1 to: 100000 it runs slowly, and if you make it 100000 to: 200000 it runs fast again.
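The difference is easy to time in a workspace (a sketch; absolute
numbers will vary with machine and VW version):

     #( #(1 1000) #(1 100000) #(100000 200000) ) do: [:pair |
          | d ms |
          d := Dictionary new.
          ms := Time millisecondsToRun:
               [(pair first to: pair last) do: [:each | d at: each put: each]].
          Transcript showCr: pair printString , ' -> ' , ms printString , ' ms']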

At 09:53 AM 2010-10-18, Reinout Heeck wrote:

--
Alan Knight [|], Engineering Manager, Cincom Smalltalk
