Hi!
> I have made some further discoveries and actually see the degradation on the
> Magma server (vs in my client image). I am doing some tests with SSD disks and
> commit sizes, and will update this thread more about this as I progress and
> write it up on a blog.
> I think this is key to the performance:
>
> 68.9% {11294ms} [] MaHashIndex>>add:at:
> 25.1% {4112ms} MaHashIndexRecord>>canAdd:
> ...
> 24.2% {3968ms} MaHashIndex>>selectRecord:
> 24.0% {3931ms} MaHashIndex>>readRecord:at:
> ...
> 17.2% {2814ms} MaHashIndex>>writeRecord:
> 17.1% {2799ms} MaFileRecordBroker>>writeFrom:at:

Some background:

- MaHashIndex is a file structure that provides access to Integer key->value pairs.

- MaHashIndex is the back-end representation of a single MagmaCollection object. It is named with the oid of the MagmaCollection object to ensure filename uniqueness. The keys are the oids of the objects added to the collection; the values are unused.

- Each index added to that MagmaCollection causes TWO additional MaHashIndex files to be created on the back-end server. They are named with the oid + [index attribute name]. The keys are an integer representation of the indexed "value" object, while the values are the oids at that key. The additional MaHashIndex file simply reverses the key->value pairs; this is necessary to support complex queries.

So, for example, if there are two indexes on a MagmaCollection, each object added to that collection causes MaHashIndex>>#add:at: to be called FIVE times (once for the MagmaCollection, plus twice for each index).

On top of this, MaHashIndexes are inefficient at adding duplicate keys because, even though it is quick to locate the first occurrence of any key in the file, it has to read sequentially through the remainder to find the end where the new duplicate should be appended. It may be possible to design a fix for this.

But before that, one can see how MaHashIndex[Record]Tester runs every combination of key and record sizes, and use that to determine the one that performs best for the keySize you need and the value distribution of those keys (you'd have to feed it a sampling of your input data).

Also important is to reconsider the model itself. It's easy to lean on MagmaCollections when designing if their costs are not fully understood, even when other design options could meet the actual requirements without employing MagmaCollections.

You may have already seen it, but this is still a quite accurate description of the MaHashIndex file format and algorithm (starting on page 4):

http://wiki.squeak.org/squeak/uploads/2665/magma_file_format.pdf
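To make that multiplier concrete, here is a rough sketch (names like 'entries' and 'anEntry' are hypothetical, and it assumes the usual #addIndex: protocol for attaching indexes to a MagmaCollection):

  | entries |
  entries := MagmaCollection new.
  entries addIndex: (MaSearchStringIndex attribute: #sessionId).
  entries addIndex: (MaIntegerIndex attribute: #status).

  "With the two indexes attached, each add below causes FIVE
  MaHashIndex>>#add:at: calls on the server: one for the
  collection's own file, plus two per index. 'anEntry' stands
  for whatever object is being added."
  entries add: anEntry.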
> But this has to wait since I have run into something very annoying (and I
> cannot understand how my image became corrupted):
> Basically, I have a MagmaSession disconnect hanging on (this time I tried
> primDisconnect, the stack is almost the same on disconnect):
>
> Mutex>>critical:
> MaNetworkServerLink>>submit:
> [] in [] in [] in MagmaSession>>submit:to:
> [] in [] in BlockClosure>>maOn:do:on:do:on:do:
> BlockClosure>>on:do:

If you have another debugger open inside the Mutex then, yes, the next one will wait. You should be able to close the other debugger, then proceed the one doing the disconnect.

> [] in BlockClosure>>maOn:do:on:do:on:do:
> [] in [] in BlockClosure>>maOn:do:on:do:
> BlockClosure>>on:do:
> [] in BlockClosure>>maOn:do:on:do:
> BlockClosure>>on:do:
> BlockClosure>>maOn:do:on:do:
> BlockClosure>>maOn:do:on:do:on:do:
> [] in [] in MagmaSession>>submit:to:
> BlockClosure>>ensure:
> CursorWithMask(Cursor)>>showWhile:
> [] in MagmaSession>>submit:to:
> Mutex>>critical:
> MagmaSession>>submit:to:
> [] in [] in MagmaSession>>primDisconnect
> MagmaSession>>linksDo:
>
> The mutex is held by a suspended Process in Debugger
> class>>morphicOpenOn:context:label:contents:fullView:
>
> I've tried to resume the process but MethodContext>>cannotReturn: is coming
> back to me, and some poking around tells me that the title of the debugger
> was: "MaObjectSerializationUserError: No createProxyBlock specified. See
> MaObjectSerializer>>toCreateProxies:"

This can happen if an object changed while it was being serialized. This is not supported. Are you serializing in a separate Process?

> Now, how can I break this Mutex to clean up the MagmaSession?
>
> If I try to terminate the process - it simply goes into suspend.
> Since I am doing changes, testing and ALT-. to break long commits or other
> vicious cycles - it would be good to know how to drop all Magma tracked
> objects from memory (i.e. all "Magma state", oids etc.) and reset the
> database?
> What is the best strategy to do this?

When my image gets this wrecked and I can't abort it (e.g., exit the image without saving), then I use the clean-up utility methods:

  MagmaSession cleanUp.  "Print it. It closes and cleans up all in-memory MagmaSessions and their references, and reports the number of instances before->after."

  MaInstaller chasePointerToAnyMagmaSession.  "Pick any MagmaSession instance and show the trail of references, to facilitate getting rid of it with another run of MagmaSession cleanUp."

Best,
  Chris
Hi Chris,
Many thanks for the patience in helping me out!
I have considered using Glorp instead of Magma, but Magma is simply fantastic and I want to see how far I can take it.
I understand that I might need to say "it is not the correct tool" if the dataset is too large, of course, but I really want it to work.

> On 30 May 2018, at 22:03, Chris Muller <[hidden email]> wrote:
>
>> 17.2% {2814ms} MaHashIndex>>writeRecord:
>> 17.1% {2799ms} MaFileRecordBroker>>writeFrom:at:
>
> So, for example, if there are two indexes on a MagmaCollection, each
> object added to that collection causes MaHashIndex>>#add:at: to be
> called FIVE times (once for MC, plus twice for each index).

Yes - I think this is my biggest problem.
Some more details from my application - consider the following log entry (yes, I am aware that it is a bit of a data blurb):

1.2.3.4 127.0.0.1 connector-26 _2_1 [15/Mar/2018:23:59:59 +1100] "GET /branding/_1_1/unimelb-logo.svg HTTP/1.1" 200 52157 "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:58.0) Gecko/20100101 Firefox/58.0" "session_id=A3EC4123E563D3D2B6DE990D3C9923FF; s_session_id=EFD19A7163EF5C1B071C6D608997D3AB; web_client_cache_guid=9ddf5c31-6554-4439-bb31-2bb8d23c0575; JSESSIONID=8EF936432E3E4E8EED8BBA6AD9E7EAC6; __utmb=100687569.1.10.1521117448" 2 52157

When I started "magmafying" the application, I decided that the most common things I need to access/query from the 2-3 million entry log file were the following:

sessionId (A3EC4123E563D3D2B6DE990D3C9923FF)
userId (_2_1)
requestEnd (15/Mar/2018:23:59:59 +1100)
requestStart (requestEnd - 2)
requestTimeInMilliSeconds (2)
status (200)

Hence, I created a bunch of indexes on the MagmaCollection:

MaDateAndTimeIndex attribute: #requestEnd.
MaDateAndTimeIndex attribute: #requestStart.
MaIntegerIndex attribute: #requestTimeInMilliSeconds.
TKMaHTTPStatusCodeIndex attribute: #status.
(MaSearchStringIndex attribute: #sessionId) beAscii; yourself.
(MaSearchStringIndex attribute: #userId) beAscii; yourself.

Since #status returns symbols corresponding to HTTP return codes (i.e. 200 = #OK, 500 = #InternalServerError), I created a custom index and implemented indexHashForIndexObject: to look up the numeric value for the symbol (roughly sketched after the table below).
A typical distribution in a random log looks like this (status code, count):

200  779878
206    2469
301      51
302   92769
304    4997
400       3
401     558
403       6
404     816
500    1932
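Roughly sketched, such an index could look like this (the superclass choice and the abbreviated mapping table here are illustrative, not the exact code; only #indexHashForIndexObject: is the real hook):

  MaIntegerIndex subclass: #TKMaHTTPStatusCodeIndex
      instanceVariableNames: ''
      classVariableNames: ''
      poolDictionaries: ''
      category: 'TK-Logging'

  indexHashForIndexObject: aSymbol
      "Translate the status Symbol back to its numeric HTTP code.
      Illustrative, abbreviated table; a real version would cache
      the Dictionary and cover every code appearing in the logs."
      ^ (Dictionary newFrom: {
          #OK -> 200.
          #PartialContent -> 206.
          #MovedPermanently -> 301.
          #Found -> 302.
          #NotModified -> 304.
          #InternalServerError -> 500 })
              at: aSymbol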
This ties into your comment here:

> On top of this, MaHashIndexes are inefficient at adding duplicate keys
> because, even though it is quick to locate the first of any key in the
> file, it has to read sequentially the remainder to find the end where
> to append the new duplicates. It may be possible to design a fix to
> this.

Hence, there are going to be a massive amount of duplicates for 200. This is why I see commit performance degrade here:

| 84.9% {101572ms} [] MaHashIndex>>add:at:
| 34.1% {40852ms} MaHashIndexRecord>>canAdd:
| |34.0% {40710ms} MaHashIndexRecord>>eligibleByteArrayIndexesFor:do:
| | 32.8% {39235ms} SmallInteger(Number)>>to:by:do:
| | 29.8% {35690ms} primitives
| | 3.0% {3545ms} [] MaHashIndexRecord>>eligibleByteArrayIndexesFor:do:
| | 1.8% {2213ms} [] MaHashIndexRecord>>canAdd:
| | |1.2% {1495ms} MaHashIndexRecord>>hasSlotAvailableAtByteArrayIndex:
| | 1.1% {1332ms} primitives
| 30.1% {36041ms} MaHashIndex>>selectRecord:
| |29.9% {35736ms} MaHashIndex>>readRecord:at:
| | 28.6% {34246ms} MaFileRecordBroker>>readInto:at:
| | 28.5% {34126ms} MaAtomicFileStream>>maRead:bytesFromPosition:of:atFilePosition:
| | 28.5% {34049ms} Mutex>>critical:
| | 27.9% {33371ms} BlockClosure>>ensure:
| | 25.8% {30893ms} [] MaAtomicFileStream>>maRead:bytesFromPosition:of:atFilePosition:
| | |24.4% {29143ms} MaAtomicFileStream>>writerAt:
| | | |24.3% {29112ms} BTree>>from:do:
| | | | 24.0% {28682ms} BTree>>from:to:keysAndValuesDo:
| | | | 23.6% {28255ms} BTreeInteriorNode>>leavesFrom:to:do:
| | | | 21.2% {25331ms} BTreeInteriorNode>>leavesFrom:to:do:
| | | | 18.6% {22208ms} BTreeInteriorNode>>leavesFrom:to:do:
| | | | 16.5% {19754ms} BTreeInteriorNode>>leavesFrom:to:do:
| | | | 14.3% {17105ms} BTreeInteriorNode>>leavesFrom:to:do:
| | | | 11.7% {14000ms} BTreeInteriorNode>>leavesFrom:to:do:
| | | | 9.3% {11163ms} BTreeInteriorNode>>leavesFrom:to:do:
| | | | 6.9% {8296ms} BTreeInteriorNode>>leavesFrom:to:do:
| | | | 3.6% {4263ms} BTreeLeafNode>>leavesFrom:to:do:
| | | | |2.7% {3246ms} [] BTree>>from:to:keysAndValuesDo:
| | | | | 2.6% {3125ms} BTreeLeafNode(BTreeNode)>>keysAndValuesDo:
| | | | | 2.1% {2463ms} BTreeKeysArray(BTreeKeys)>>withIndexDo:
| | | | | 1.4% {1632ms} [] BTreeLeafNode(BTreeNode)>>keysAndValuesDo:
| | | | | 1.1% {1324ms} [[]] BTree>>from:to:keysAndValuesDo:
| | | | 1.4% {1626ms} BTreeKeysArray(BTreeKeys)>>findIndexForKey:
| | | | 1.3% {1573ms} BTreeKeysArray(BTreeKeys)>>withIndexDo:
| | |1.4% {1657ms} SmallInteger(Magnitude)>>min:

Now, the distribution of #sessionId and #userId is harder to predict, but the number of duplicates per sessionId should be far lower, and hence the lookup in the hash index not as taxing (?).

Since log entries represent a point in time, I was wondering whether I could try to "batch" log entries in timespans and use DateTree for storage (roughly sketched below), but there is one problem here: that class uses a dictionary for events - hence the readSets are going to be very large, and it is not easy to "stub out" unless I subclass DateTree and change the storage type somehow.
I would also lose the ability to use a MagmaCollectionReader on the whole dataset and find things such as specific users regardless of when the entry was logged.

The other idea is to change the model to create a tree based on the sessionId (which _might be missing_, unfortunately), but then I would lose the ability to quickly retrieve entries based on #requestStart regardless of sessionId.
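The batching idea, roughly sketched (names hypothetical, with a plain Dictionary standing in for DateTree):

  "Root object holds: entriesByDay := Dictionary new."
  (entriesByDay
      at: anEntry requestStart asDate
      ifAbsentPut: [OrderedCollection new])
          add: anEntry.

Each day's bucket would stay small enough to load whole, but at the cost of the whole-dataset queries mentioned above.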
>> Mutex>>critical:
>> MaNetworkServerLink>>submit:
>> [] in [] in [] in MagmaSession>>submit:to:
>> [] in [] in BlockClosure>>maOn:do:on:do:on:do:
>> BlockClosure>>on:do:
>
> If you have another debugger open inside the Mutex then, yes, the next
> one will wait.
>
> You should be able to close the other debugger, then proceed the one
> doing the disconnect.
>
> This can happen if an object changed while it was being serialized.
> This is not supported. Are you serializing in a separate Process?

I had kicked off the reading/parsing/committing in a background process, yes. I did not touch any of the entries that I am aware of, but, of course, it might be the case that I changed a participating class while it was mid-serialisation. Thanks for this - it helps a lot.

> When my image gets this wrecked and I can't abort it (e.g., exit image
> without saving), then I use clean up utility methods:
>
> MagmaSession cleanUp. "print it. It closes and cleans up all
> in memory MagmaSessions and their references, reports the number of
> instances before->after."
> MaInstaller chasePointerToAnyMagmaSession "Pick any
> MagmaSession instance and show the trail references to facilitate
> getting rid of it with another run of MagmaSession cleanUp."

Perfect - thanks a lot for this!

If you are further interested, I have attached a server profiling (taken when performance on the Magma db had degraded too much).
Oh - the obligatory server statistic charts that visualise the degradation are attached as well.

Attachments: MeanSizes.png, ProcessingCount.png, ProcessingMean.png, MagmaServerProblems.txt
Hi,
> Many thanks for the patience in helping me out!
> I have considered using Glorp instead of Magma, but Magma is simply fantastic and I want to see how far I can take it.
> I understand that I might need to say "it is not the correct tool" if the dataset is too large, of course, but I really want it to work.

I will help you take Magma as far as I can without ever pressuring you to stay with it if it's not working. In fact, I will say up front that, while MagmaCollections are intended to provide some SQL table-like capabilities for Magma, they don't compete with relational databases on that point. Magma gains a performance advantage only for complex object models that would otherwise require a relational system to perform a lot of joins. Both types of systems provide many avenues for optimization, so there's always some unavoidable commitment of labor of some kind, just different... :/ (sorry, rambling ^^^)

>>> 17.2% {2814ms} MaHashIndex>>writeRecord:
>>> 17.1% {2799ms} MaFileRecordBroker>>writeFrom:at:
>>
>> So, for example, if there are two indexes on a MagmaCollection, each
>> object added to that collection causes MaHashIndex>>#add:at: to be
>> called FIVE times (once for MC, plus twice for each index).
>
> Yes - I think this is my biggest problem.
> Some more details from my application - consider the following log entry (yes, I am aware that it is a bit of a data blurb):
>
> 1.2.3.4 127.0.0.1 connector-26 _2_1 [15/Mar/2018:23:59:59 +1100] "GET /branding/_1_1/unimelb-logo.svg HTTP/1.1" 200 52157 "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:58.0) Gecko/20100101 Firefox/58.0" "session_id=A3EC4123E563D3D2B6DE990D3C9923FF; s_session_id=EFD19A7163EF5C1B071C6D608997D3AB; web_client_cache_guid=9ddf5c31-6554-4439-bb31-2bb8d23c0575; JSESSIONID=8EF936432E3E4E8EED8BBA6AD9E7EAC6; __utmb=100687569.1.10.1521117448" 2 52157
>
> When I started "magmafying" the application, I decided that the most common things I need to access/query from the 2-3 million entry log file were the following:
>
> sessionId (A3EC4123E563D3D2B6DE990D3C9923FF)
> userId (_2_1)
> requestEnd (15/Mar/2018:23:59:59 +1100)
> requestStart (requestEnd - 2)
> requestTimeInMilliSeconds (2)
> status (200)
>
> Hence, I created a bunch of indexes on the MagmaCollection:
>
> MaDateAndTimeIndex attribute: #requestEnd.
> MaDateAndTimeIndex attribute: #requestStart.
> MaIntegerIndex attribute: #requestTimeInMilliSeconds.
> TKMaHTTPStatusCodeIndex attribute: #status.
> (MaSearchStringIndex attribute: #sessionId) beAscii; yourself.
> (MaSearchStringIndex attribute: #userId) beAscii; yourself.

This is too many. Most of these impose far more cost than the benefit they give back. In fact, even without knowing the details of your application requirements, I'd bet you could get away with as few as two, one, or even none of these...
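For instance, "as few as two" might look like this (a sketch only, anticipating the advice further down; 'entries' is hypothetical):

  entries addIndex: (MaDateAndTimeIndex attribute: #requestStart).
  entries addIndex:
      ((MaSearchStringIndex attribute: #sessionId) beAscii; yourself).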
> Since #status returns symbols corresponding to HTTP return codes (i.e. 200 = #OK, 500 = #InternalServerError), I created a custom index and implemented indexHashForIndexObject: to look up the numeric value for the symbol.
> A typical distribution in a random log looks like this (status code, count):
>
> 200  779878
> 206    2469
> 301      51
> 302   92769
> 304    4997
> 400       3
> 401     558
> 403       6
> 404     816
> 500    1932
>
> This ties into your comment here:
>
>> On top of this, MaHashIndexes are inefficient at adding duplicate keys
>> because, even though it is quick to locate the first of any key in the
>> file, it has to read sequentially the remainder to find the end where
>> to append the new duplicates. It may be possible to design a fix to
>> this.
>
> Hence, there are going to be a massive amount of duplicates for 200. This is why I see commit performance degrade here:
>
> | 84.9% {101572ms} [] MaHashIndex>>add:at:

You are not the first to want to do that but, yes, indeed, I'm sorry, but an index with tons of dups is not a good way to go (it will kill performance). When there are that few keys, it would be better to have a regular Dictionary keyed by those few keys, and valued by the [Magma]Collections of TKEntry's(?) which have those statuses (a sketch follows the list below). Then you could remove that index, and the same TKEntry's can still be put into the original (big) MagmaCollection, too. But let's see if we can make further trimmings... (continued)

> | 34.1% {40852ms} MaHashIndexRecord>>canAdd:
> | |34.0% {40710ms} MaHashIndexRecord>>eligibleByteArrayIndexesFor:do:
> | | 32.8% {39235ms} SmallInteger(Number)>>to:by:do:
> | | 29.8% {35690ms} primitives
> | | 3.0% {3545ms} [] MaHashIndexRecord>>eligibleByteArrayIndexesFor:do:
> | | 1.8% {2213ms} [] MaHashIndexRecord>>canAdd:
> ...
>
> Now, the distribution of #sessionId and #userId is harder to predict, but the number of duplicates per sessionId should be far lower, and hence the lookup in the hash index not as taxing (?).

I would avoid using a userId as an indexed attribute of a MagmaCollection. It all depends on what access demands are put on the application but, for example, it would be a lot faster to access those straight from your User object's 'log' of TKEntry's. A sessionId might be okay, since it will be unique each time even when the same user comes back later, right? Still, that is a lot of dups, and if there were already a user-specific Collection of TKEntry's (as mentioned above), then they could be further batched into sessions by session-id there...

> Since log entries represent a point in time, I was wondering whether I could try to "batch" log entries...

You're onto the right kind of thinking for optimizing your app away from "one giant indexed MagmaCollection": 1) use multiple smaller collections, and 2) use the fastest *type* of collection at each place, based on the access requirements. This includes even old-fashioned Smalltalk collections.

Let's open up the toolbox and spread out all the tools available to apply to your design. As far as raw performance goes, **in general**, you'd want to prefer these in the order listed, and fight tooth-and-nail against bringing the lower-listed ones into your app until you absolutely have to:

(original Smalltalk collections) :: "They load and commit whole, so very fast, but can't be too big (<1000 max), and they don't support any concurrency."

MagmaArray :: "Unlimited size, blazing fast to add to, but provides only integer-indexed access. Concurrent at:put:'s are allowed as long as they are at different index positions."

MagmaPreallocatedDictionary :: "Faster version of MagmaDictionary, but cannot enumerate all objects, and preallocates space on the server (so don't make a million of them)."

MagmaHashTable :: "Like a MagmaCollection, but faster, and only supports one index (though you could have more than one table with a different index but the same objects, of course)."

MagmaDictionary :: "Relatively fast single-key access, but the value can be any object (OrderedCollection, MagmaArray, etc.)."

MagmaCollection :: "Relatively slow, but theoretically scales at O(log n); allows concurrent adds by multiple sessions without conflict."

MagmaSet :: "A MagmaCollection that checks #includes: before adding. Not recommended."
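Going back to the Dictionary-keyed-by-status idea above, a minimal sketch (all names hypothetical):

  "One small, plain Dictionary at the root, keyed by the handful of
  status Symbols, valued by collections of entries."
  entriesByStatus := Dictionary new.

  "On add, the same entry instance goes into its per-status
  collection AND into the original (big) MagmaCollection."
  (entriesByStatus
      at: anEntry status
      ifAbsentPut: [MagmaCollection new])
          add: anEntry.
  allEntries add: anEntry.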
Note: MagmaHashTable and MagmaCollection are the ones that support "next higher key" access, so you don't need to know the exact key.

I also mentioned concurrency characteristics above because that is what can allow a design to employ multiple client images all submitting to a single server. I have an application that does this -- it spawns multiple images which each connect to the same server and perform massive amounts of work. The server is able to support many clients, resulting in a huge gain in throughput relative to a single client (even one with a local db connection). But you must either avoid or handle MagmaCommitConflicts (preferably avoid, since otherwise the client did some work only to have its commit rejected).

> ...in timespans and use DateTree for storage but there is one problem here:
> that class uses a dictionary for events - hence the readSets are going to be very large and it is not easy to "stub out" unless I subclass DateTree and change the storage type somehow.
> I would also lose the ability to use a MagmaCollectionReader on the whole dataset and find things such as specific users regardless of when the entry was logged.

I'm currently looking into whether Dictionary and Set can be made to work with WriteBarrier... stay tuned... However, I don't think you need to create your own indexed structure at this point; I'm pretty certain Magma has enough for what you're describing. :)

> The other idea is to change the model to create a tree based on the sessionId (which _might be missing_, unfortunately), but then I would lose the ability to quickly retrieve entries based on #requestStart regardless of sessionId.

You never "lose" such an ability, since the same exact object instance can be put into multiple collections where it would expect to be found in the future. You could keep your one MagmaCollection with one index on #requestStart (in which case it could be upgraded to a MagmaHashTable), and then do your userId / sessionId access from different (smaller) collections...
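For example, the add path could register the same instance in each structure (a sketch only; every name here is hypothetical):

  addEntry: anEntry
      "Put the one instance everywhere it should later be findable."
      allEntries add: anEntry.  "big collection, indexed on #requestStart"
      anEntry userId ifNotNil: [:uid |
          (self userWithId: uid) log add: anEntry].
      anEntry sessionId ifNotNil: [:sid |
          (sessionsById at: sid ifAbsentPut: [OrderedCollection new])
              add: anEntry]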
>>> Mutex>>critical:
>>> MaNetworkServerLink>>submit:
>>> [] in [] in [] in MagmaSession>>submit:to:
>>> [] in [] in BlockClosure>>maOn:do:on:do:on:do:
>>> BlockClosure>>on:do:
>>
>> If you have another debugger open inside the Mutex then, yes, the next
>> one will wait.
>>
>> You should be able to close the other debugger, then proceed the one
>> doing the disconnect.
>
> Well, actually this seems to be a leftover from the "MaObjectSerializationUserError: No createProxyBlock specified." dialogue that I abandoned. The process was suspended and I get an unwind exception on the containing MethodContext.

I know there's a glitch in the VM that, when debugging, sometimes requires me to press Command+[dot] to interrupt, and I see it stuck in Mutex>>critical:, at which point you may simply click Proceed and it will continue normally.

>> This can happen if an object changed while it was being serialized.
>> This is not supported. Are you serializing in a separate Process?
>
> I had kicked off the reading/parsing/committing in a background process, yes. I did not touch any of the entries that I am aware of, but, of course, it might be the case that I changed a participating class while it was mid-serialisation. Thanks for this - it helps a lot.

>> When my image gets this wrecked and I can't abort it (e.g., exit image
>> without saving), then I use clean up utility methods:
>>
>> MagmaSession cleanUp. "print it. It closes and cleans up all
>> in memory MagmaSessions and their references, reports the number of
>> instances before->after."
>> MaInstaller chasePointerToAnyMagmaSession "Pick any
>> MagmaSession instance and show the trail references to facilitate
>> getting rid of it with another run of MagmaSession cleanUp."
>
> Perfect - thanks a lot for this!
>
> If you are further interested, I have attached a server profiling (taken when performance on the Magma db had degraded too much).
> Oh - the obligatory server statistic charts that visualise the degradation:

Thanks for digging in! I'm glad I'm not the only one who ever used those tools! :)

I think you still have a lot of optimization opportunities. Please feel free to keep asking questions, I'm happy to help.

 - Chris