RE: [gemstone-smalltalk] Re: Very slow GS communication ...


RE: [gemstone-smalltalk] Re: Very slow GS communication ...

Paul Baumann
Jaroslaw,

We've also discovered that VW+GBS becomes inefficient once GS oops
exceed the VW max small integer 536870911. We started to experience the
problem a couple days ago and your post was very timely and useful.
Thank you.

I'm looking for opinions on how to best address this problem. Here is a
little background on the problem:

VW crosses into LargePositiveIntegers above 536870911. The GBS GsToSt
cache dictionary has GS oop values for keys and is used to look up
objects or proxies for a given oop in a session. Dictionaries (like the
GBS large-capacity dictionaries) index keys by their response to #hash.
VW LargePositiveInteger>>hash returns only 256 distinct values. This
causes poor distribution of keys into GBS cache dictionaries and
results in slow sequential searches through abnormally large dictionary
buckets.

((1 to: 10000000)
 inject: Set new
 into: [:all :i | all add: (536870911 + i) hash; yourself]) size
    -> 256
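The effect on a bucketed dictionary can be sketched outside Smalltalk as
well. The following Python snippet is an illustration only (the `% 256`
is a stand-in for the narrow hash range, not VW's actual algorithm): with
only 256 distinct hash values, n keys pile into at most 256 buckets, so
each lookup degrades to a linear scan of roughly n/256 entries.

```python
# Stand-in for the narrow LargePositiveInteger>>hash: only 256 values.
n = 100_000
buckets = {}
for oop in range(536870912, 536870912 + n):
    h = oop % 256  # every key hashes into one of just 256 buckets
    buckets.setdefault(h, []).append(oop)

distinct = len(buckets)   # at most 256 buckets are ever used
avg_scan = n / distinct   # average entries examined per lookup
```

With 100,000 oops that is an average scan of almost 400 entries per
lookup, where a well-distributed hash would give a scan of one or two.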

The problem is in part due to the design of GBS caches, but also due to
the limited range of values returned from LargePositiveInteger>>hash.
There are two ways to fix the problem. One is to change GBS to do
something like this:

GbxGsToStDictionary>>hashForKey: aKey
        "The #isInteger is because I seem to recall that delegates
        could also be keys for special objects"
        ^(aKey isInteger ifTrue: [aKey] ifFalse: [aKey hash]) \\ tableSize + 1

I think this is what Jaroslaw did, except that he used polymorphism
with #gsOopHash and modified #hashForKey: in a superclass. What we
quickly found was that more than just this method needs to change: GBS
also uses some regular dictionaries with oop keys for traversals, and
their performance is an issue too.

The other way to address the problem is to change the implementation of
the VW method LargePositiveInteger>>hash. A hash value is just an
(unsigned?) integer that normally remains the same for a given object
and that is usually used for indexing operations. The max identity hash
value is 16K for VW and 32K for VA, but #hash can and does return values
well beyond those ranges for objects like SmallInteger. Having a larger
range of hash values can significantly improve the distribution and
performance of GBS cache dictionaries--which brings me back to
LargePositiveInteger>>hash with a range of only 256. A better way to fix
the problem would be to get VW to change the implementation of #hash for
LargePositiveInteger so that it returns "self" just as small integers do
now.
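A quick Python contrast shows why returning the integer itself would
restore distribution (illustrative sketch only; table_size here is an
arbitrary value, not the GBS table size): the usual hash-modulo-tableSize
indexing can reach at most 256 slots with the narrow hash, but fills the
whole table when the hash is the integer itself.

```python
n = 100_000
table_size = 4096
oops = range(536870912, 536870912 + n)

# Current behavior: 256-value hash, then modulo the table size.
slots_narrow = {(oop % 256) % table_size for oop in oops}
# Proposed fix: hash answers the integer itself.
slots_wide = {oop % table_size for oop in oops}
```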

The obvious risk in changing LargePositiveInteger>>#hash is that
existing collections may no longer find large integers. GBS caches
won't have this problem as long as you are logged out when the code is
changed. It doesn't seem likely that other hash-based collections would
have LargePositiveInteger keys, but it is possible. My questions are:

        Have you experienced this problem, and how did you work around it?
        Does anyone know of existing hash-based collections that would be
affected by changing LargePositiveInteger>>#hash to return "self"?
        What other uses of #hash might be affected by this change?

This is a big problem that has a potentially trivial fix. Your response
is appreciated.

Paul Baumann
IntercontinentalExchange | ICE
2100 RiverEdge Pkwy | 5th Floor | Atlanta, GA 30328
Tel: 770.738.2137 | Fax: 770.951.1307 | Cel: 505.780.1470
[hidden email]

24-hour ice helpdesk 770.738.2101
www.theice.com

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of
[hidden email]
Sent: Wednesday, September 13, 2006 6:01 AM
To: GemStone Smalltalk Customer Forum
Subject: [gemstone-smalltalk] Re: Very slow GS communication ...

Hi Dennis,

We used to have a problem similar in nature - we dubbed it the Friday
Syndrome. The images would suddenly get very slow on Fridays and
recover their original speed over the weekend, after MFC processing.

What we found was that, as we got towards the end of the week, our oops
entered the LargePositiveInteger realm. That broke the distribution of
the hashing function on the GS oop mapping dictionaries.

Compare the contents of <gsObjectCache> in your session in a fast and
in a slow image. See what sits in there.

Our solution was to change the implementation of
GbxLargeKeyedCollection>>#hashForKey: to rely on a special #gsOopHash
method, which for large integers answered <self> instead of the default
hash value.


--
Regards, Jaroslaw.
3-9593


 

From: Dennis Smith <[hidden email]>
Sent by: bounce-gemstone-smalltalk-5958528@earth.lyris.net
To: "GemStone Smalltalk Customer Forum" <[hidden email]>
Date: 11/09/06 16:29
Subject: [gemstone-smalltalk] Very slow GS communication ...
Please respond to "GemStone Smalltalk Customer Forum"

I have asked this once before, but want to ask again before I dive into
digging into it.

We sometimes get a VW image, which when logged into GS becomes very
slow.
A factor of 10x approx.

The slowness appears to be in messages to/from the Gem.

Once an image starts doing this, logging off and on, or even onto a
different GS database has no effect, it remains "slow".

Exiting and restarting the image clears the problem.

We can tell when an image runs into this, we do an early connect of
about 10 classes, which usually takes about 100ms or less and suddenly
takes 4000 ms.

This is not new; it has been going on for a couple of years now, across
VW versions and GBS versions.

Has anyone else seen anything like this?  If not, then I will start
looking at our own code and GBS overrides before digging deeper.

One problem in looking into this is that it only occurs every 50 to 100
or so logins, so not very often.

--
Dennis Smith                        [hidden email]
Cherniak Software Development Corporation       +1 905.771.7011
400-10 Commerce Valley Dr E                Fax: +1 905.771.6288
Thornhill, ON Canada L3T 7N7    http://www.CherniakSoftware.com 
 
 
 
--------------------------------------------------------
This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.  
 


RE: [gemstone-smalltalk] Re: Very slow GS communication ...

Paul Baumann
Hi Robert,

Thanks for the response. I did some performance comparisons between the
current (t0) implementation and the cost of keeping the full large
positive integer (t1). You are right that it is slower to do a modulo
on a large integer than on a small one. It took 373 ms longer for
3,999,999 large positive integers. That is half the speed, but probably
not much in the big scheme of things. In the earlier discussions, do
you recall any mention of other uses of #hash on LPIs where this
performance difference could be a consideration?

I also compared performance with another approach of converting the
LargePositiveIntegers to strings. I found that computing the hash for a
string was more efficient (t2) than I'd expected and only slightly
slower than using the LPI itself (t1), but look at the hidden costs of
conversion (t3). If the oop ever exists in the image as a LPI then it
will need conversion to something else (like a string or small int), and
the conversion performance then becomes relevant. Of course there is
also some hidden cost to GC the intermediate objects if they aren't
small integers. If LPI>>hash returned a small integer then it would
probably be the result of some form of modulo operation--it would be an
extra unnecessary modulo operation that only serves to constrain
precision.

| ints intstrings t0 t1 t2 t3 |
ints := (536870912 to: 540870911) asArray.
t3 := Core.Time millisecondsToRun:
        [intstrings := ints collect: [:i | i printString]].
t0 := Core.Time millisecondsToRun:
        [ints do: [:i | (i hash \\ 1024) + 1]].
t1 := Core.Time millisecondsToRun:
        [ints do: [:i | (i yourself \\ 1024) + 1]].
t2 := Core.Time millisecondsToRun:
        [intstrings do: [:i | (i hash \\ 1024) + 1]].
Core.Array with: t0 with: t1 with: t2 with: t3

t0:  367
t1:  740
t2:  750
t3:  25602
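A rough Python analogue of the same comparison (illustrative only; the
constants mirror the snippet above, and the absolute times are machine-
and runtime-dependent, so they say nothing about VW itself):

```python
import time

ints = range(536870912, 536870912 + 200_000)

# Conversion cost: build the string forms up front (the t3 analogue).
start = time.perf_counter()
strings = [str(i) for i in ints]
t3 = time.perf_counter() - start

# Modulo on the integer itself (the t1 analogue).
start = time.perf_counter()
for i in ints:
    _ = (i % 1024) + 1
t1 = time.perf_counter() - start

# Hash of the string form (the t2 analogue).
start = time.perf_counter()
for s in strings:
    _ = (hash(s) % 1024) + 1
t2 = time.perf_counter() - start
```

The structure is the point: whatever the per-operation hashing costs,
the one-time conversion of every oop to another representation is a
separate cost that the "return self" approach avoids entirely.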

I'm thinking modulo costs are incurred either way and that
LargePositiveInteger>>hash should return "self" without an arbitrary
(and perhaps unnecessary) reduction in precision. Am I missing
something?

By the way, the performance impact of this bug on our application was
that requests that normally took 3 seconds to process suddenly took 300
seconds. It was significant, and it all came down to the narrow range of
values returned from #hash for LargePositiveInteger.

Paul Baumann
IntercontinentalExchange | ICE
2100 RiverEdge Pkwy | 5th Floor | Atlanta, GA 30328
Tel: 770.738.2137 | Fax: 770.951.1307 | Cel: 505.780.1470
[hidden email]

24-hour ice helpdesk 770.738.2101
www.theice.com

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of
Robert Rosenbaum
Sent: Thursday, September 14, 2006 10:50 AM
To: GemStone Smalltalk Customer Forum
Subject: [gemstone-smalltalk] Re: Very slow GS communication ...
Importance: Low

I discussed the idea of changing LargePositiveInteger>>#hash on another
forum (I think it was vwnc). The advantage is better performance for
sets with a mix of large SmallIntegers and small LargePositiveIntegers
(which is exactly what happens in the GBS caches), because of all the
hash collisions. The disadvantage is that the modulo of a
LargePositiveInteger is more expensive, and a modulo is required for
each insertion of a new element (and for every element on a rehash).
The extra cost of the modulo depends on exactly which integer it is, so
it is hard to say exactly what the tradeoff is. For the GBS caches it
is a clear win to use ^self for the hash of an LPI, but in the general
case the consensus was that it is better left as is.

At the time this problem was addressed at JP Morgan, it was brought to
the attention of GemStone customer support. They agreed it was a
problem which needed fixing, and intended to do so for GBS 6.2 if
memory serves. This happened in the sense that the problem does not
occur in the 64-bit product, but that was not supposed to be the end of
the story. I believe they still intended to address this performance
killer in the 32-bit GBS; I have not yet seen GBS 7.x to check whether
it has been resolved.

--- Paul Baumann <[hidden email]> wrote:

> [snip: Paul's message, quoted in full above]

 
 
 