WriteBarrier etc

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

WriteBarrier etc

TimKack
Hi all,

First let me explain how my application works (and excuse this lengthy post):

It processes a log file and create TKEntry’s (i.e. smalltalk objects representing the log entry).
Since several entries contains the same values for several fields (example: paths, userIds etc) I have LRUCache infront of a MaSharedObject broker (Thanks Chris for this!).
The reasons are two fold:
1) The LRUCache makes it faster to lookup keys when I have a lot of log data on the heap
2) The MaSharedObject broker prevents having the same value stored many times and by that use up valuable heap usage.
Lets call this TKMaLRU<mumble>Cache - and there are several of these depending on their scope and content.
The <mumble> part is because they have some specialities to what they store (literals, dates, times etc).
These TKEntry’s are stored in an OrderedCollection but now when I am “magma:fying” the application, I have decided to use a MagmaCollection (because we talk about 4-5 million entries).
Let call this a TKLog.
The TKLog is stored in a TKProject, which in turn is stored in my root of my Magma database - in my grand TKProjects singleton object.

So, to summarise the architecture somewhat:

TKProjects contain TKProject’s, that contain TKLog’s, where each contains a collection of TKEntry’s.
The TKProjects contain some TKMaLRU<mumble>Caches that are global and shared between projects.
The TKProject contain some TKMaLRU<mumble>Caches that are shared across all TKLogs
The  TKLog contains some TKMaLRU<mumble>Caches that are only relevant for that specific log.

Here is how the commit mechanism work:
1) MagmaSession begin
2) Read and process log entry into a TKLog entry (i.e. also lookup or add values to above mentioned MaSharedObject brokered caches)
3) repeat step 2) X times
4) MagmaSession commitAndBegin.
5) Repeat 2 - 4 until all log has been processed.

Now, in step 3 - I have tested to commit every TKEntry, every 50 TKEntry and every 150 TKEntry - but the results are somewhat disappointing.
Looking at the client side report - it seems to suggest that "Time to Download Full Response” is taking longer and longer:

 Time to Download Full Response                                            

      End Time                              Name                                         Count         min         max         sum        mean        last
      --------                              ----                                         -----         ---         ---         ---        ----        ----
  2018-05-08T21:24:56.756581+02:00      Time to Download Full Response                  98      1439.0      4269.0    197938.0  2019.77551        1703
      2018-05-08T21:29:56.756864+02:00      Time to Download Full Response                  97      1575.0      3388.0    200137.0  2063.26804        1817
      2018-05-08T21:34:56.759649+02:00      Time to Download Full Response                  96      1325.0      3533.0    197440.0  2056.66666        2039
      2018-05-08T21:39:56.761206+02:00      Time to Download Full Response                  95      1134.0      3539.0    193899.0  2041.04210        2061
      2018-05-08T21:44:56.764136+02:00      Time to Download Full Response                  92      1537.0      4060.0    193165.0  2099.61956        1994
      2018-05-08T21:49:56.772738+02:00      Time to Download Full Response                  91      1672.0      4183.0    192310.0  2113.29670        1998
<…snip…>
      2018-05-09T07:29:57.823266+02:00      Time to Download Full Response                  71      2388.0      5202.0    213569.0  3008.01408        3175
      2018-05-09T07:34:57.825258+02:00      Time to Download Full Response                  61      2829.0      5700.0    211781.0  3471.81967        3197
      2018-05-09T07:39:57.825924+02:00      Time to Download Full Response                  55      2385.0      6392.0    193054.0  3510.07272        2965
      2018-05-09T07:44:57.828316+02:00      Time to Download Full Response                  56      2839.0      6606.0    201693.0  3601.66071        3249
      2018-05-09T07:49:57.830572+02:00      Time to Download Full Response                  62      1870.0      6230.0    207707.0  3350.11290        3543
      2018-05-09T07:54:57.832358+02:00      Time to Download Full Response                  58      2790.0      6422.0    199911.0  3446.74137        3215
      2018-05-09T07:59:57.834803+02:00      Time to Download Full Response                  61      2658.0      5832.0    208315.0      3415.0        3839
      2018-05-09T08:04:57.841518+02:00      Time to Download Full Response                  60      2266.0      4742.0    199554.0      3325.9        3122
      2018-05-09T08:09:57.842645+02:00      Time to Download Full Response                  57      2439.0      5669.0    190848.0  3348.21052        3782
      2018-05-09T08:14:57.843325+02:00      Time to Download Full Response                  56      2183.0      5850.0    191640.0  3422.14285        3040
      2018-05-09T08:19:57.845536+02:00      Time to Download Full Response                  56      2803.0      5767.0    196116.0  3502.07142        3225
      2018-05-09T08:24:57.846522+02:00      Time to Download Full Response                  61      2653.0      5717.0    211016.0  3459.27868        3406
      2018-05-09T08:29:57.9646+02:00        Time to Download Full Response                  54      2765.0      5772.0    186693.0  3457.27777        3277
      2018-05-09T08:34:57.967164+02:00      Time to Download Full Response                  54      2970.0      5636.0    193700.0  3587.03703        3401
      2018-05-09T08:39:57.981221+02:00      Time to Download Full Response                  55      2583.0      5411.0    197685.0  3594.27272        3607
      2018-05-09T08:44:57.981925+02:00      Time to Download Full Response                  56      2664.0      5661.0    196917.0    3516.375        3829
      2018-05-09T08:49:57.982564+02:00      Time to Download Full Response                  57      2854.0      6889.0    202831.0  3558.43859        3014
      2018-05-09T08:54:57.983894+02:00      Time to Download Full Response                  63      2761.0      6190.0    214199.0  3399.98412        3226
      2018-05-09T08:59:57.990957+02:00      Time to Download Full Response                  63      2796.0      5130.0    216577.0  3437.73015        3066
      2018-05-09T09:04:57.992464+02:00      Time to Download Full Response                  55      2894.0      6478.0    200221.0  3640.38181        3300
      2018-05-09T09:09:57.994927+02:00      Time to Download Full Response                  51      3173.0      5203.0    194755.0  3818.72549        3276
      2018-05-09T09:14:57.995991+02:00      Time to Download Full Response                  55      3167.0      6066.0    205586.0  3737.92727        3432
      2018-05-09T09:19:57.998695+02:00      Time to Download Full Response                  57      2840.0      6339.0    212726.0  3732.03508        3445
      2018-05-09T09:24:58.010041+02:00      Time to Download Full Response                  53      2737.0      5854.0    203695.0  3843.30188        3212
      2018-05-09T09:29:58.011472+02:00      Time to Download Full Response                  55      1893.0      5003.0    193155.0  3511.90909        3311
      2018-05-09T09:34:58.013342+02:00      Time to Download Full Response                  53      2573.0      6791.0    194739.0  3674.32075        3588
      2018-05-09T09:39:58.016082+02:00      Time to Download Full Response                  55      2810.0      5476.0    195984.0  3563.34545        3254
   
I also see a degradation over time on "Time waiting for MaCommitRequest requests"

      End Time                              Name                                         Count         min         max         sum        mean        last
      --------                              ----                                         -----         ---         ---         ---        ----        ----
      2018-05-08T21:24:56.756627+02:00      Time waiting for MaCommitRequest requ...        98      1443.0      4291.0    199387.0  2034.56122        1706
      2018-05-08T21:29:56.756915+02:00      Time waiting for MaCommitRequest requ...        97      1583.0      3399.0    201187.0  2074.09278        1821
      2018-05-08T21:34:56.759692+02:00      Time waiting for MaCommitRequest requ...        96      1333.0      3538.0    199918.0  2082.47916        2047
      2018-05-08T21:39:56.761261+02:00      Time waiting for MaCommitRequest requ...        95      1143.0      3547.0    196105.0  2064.26315        2070
      2018-05-08T21:44:56.764175+02:00      Time waiting for MaCommitRequest requ...        92      1540.0      4449.0    195004.0  2119.60869        1999
      2018-05-08T21:49:56.77278+02:00       Time waiting for MaCommitRequest requ...        91      1678.0      4190.0    194502.0  2137.38461        2008
      2018-05-08T21:54:56.774304+02:00      Time waiting for MaCommitRequest requ...        89      1389.0      4015.0    190014.0  2134.98876        1858
      2018-05-08T21:59:56.777215+02:00      Time waiting for MaCommitRequest requ...        88      1731.0      4576.0    191360.0  2174.54545        2020
      2018-05-08T22:04:56.779026+02:00      Time waiting for MaCommitRequest requ...        88      1705.0      3551.0    191334.0     2174.25        1859
<…snip…>
      2018-05-09T09:09:57.99499+02:00       Time waiting for MaCommitRequest requ...        51      3185.0      5210.0    195250.0  3828.43137        3286
      2018-05-09T09:14:57.996035+02:00      Time waiting for MaCommitRequest requ...        55      3174.0      6073.0    206588.0  3756.14545        3898
      2018-05-09T09:19:57.998772+02:00      Time waiting for MaCommitRequest requ...        57      2849.0      6346.0    213850.0  3751.75438        3455
      2018-05-09T09:24:58.010092+02:00      Time waiting for MaCommitRequest requ...        53      2740.0      5856.0    205350.0  3874.52830        3216
      2018-05-09T09:29:58.011514+02:00      Time waiting for MaCommitRequest requ...        55      1896.0      5259.0    196371.0  3570.38181        3315
      2018-05-09T09:34:58.013386+02:00      Time waiting for MaCommitRequest requ...        53      2582.0      6796.0    196473.0  3707.03773        3593
      2018-05-09T09:39:58.016132+02:00      Time waiting for MaCommitRequest requ...        55      2818.0      5485.0    196926.0  3580.47272        3268
      nil                                   Time waiting for MaCommitRequest requ...        10      3125.0      4748.0     37967.0      3796.7        3511


This has led me to implement isImmutableInMagma on the TKEntry to keep the readSet low, but now I have run into a problem:
The caches are based on Dictionaries and they can grow to 20000-30000 (or more) entries.

After all of this, my question is simple:
Should I use a WriteBarrier capable object instead (I could not find a WbDictionary unfortunately)?
In fact - how does WriteBarrier work in the context of Magma? I have set  allowWriteBarrier: true when connecting to the Magma DB.

Cheers,
Tim


_______________________________________________
Magma mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/magma
Reply | Threaded
Open this post in threaded view
|

Re: WriteBarrier etc

Chris Muller-3
Hi!

First, thanks for the great email, the detail and for your deep
exploration and usage of the tools provided by Magma in the first
place, I'm very impressed!

To cut to the chase about WriteBarrier:  yes.  WriteBarrier is meant
for stable bulk-load's you want to perform faster.  The trade-off of
using it is, if you need to do debugging, it's harder to access the
instances via, i.e., TKEntry allInstances select: [ : each | .... ],
because they're not instances of TKEntry, but of TKEntry*.

Is this database shared with other real-time users?  If not, I would
experiment committing at least 500 or 1000 or 2500 or 5000 or 10000
per commit.

There is a logarithmic rate of degradation to adding new objects
proportional to the number of unique objects already in a
MagmaCollection or MagmaCollectionIndex.  The rate of degradation of
duplicate objects is *linear*, but it doesn't sound like your model is
affected by that.

You were very smart to override #isImmutableInMagma on your TEEntry's,
given how many would be clogging up the readSet unnecessarily due to
their archival nature (unchanging).

Those big Dictionary's of the MaSharedObjectsBroker will still slow
down the changed-detection of your commits whenever you didn't add
something.  Because that means their size won't have changed, and nor
will their contents, but the only way Magma could know that is by
enumerating them.  And, if they DID change, then they have to be
committed.  So, there is a trade-off there, but this is something I
would like to think about further -- a few years ago a made an
advanced feature that could help with this, but I need to research it.

Besides the sizes of those Dictionary caches, you'd want to ensure
those objects IN them are not in the readSet, if possible.
WriteBarrier can definitely help with that for objects which are not
#isImmutableInMagma.  Some of the lower level objects don't can't take
advantage of the WriteBarrier (see #maAllowsWriteBarrier), but normal
objects like the ones from your domain can.

WbDictionary, et al. no longer apply after Closures were redone a few
years ago which disallowed Array's from being made uncompact,
something required for the WriteBarrier to be able to swap its class
via #primitiveChangeClassTo:.

I can help you optimize further if you include a MessageTally profile
dump next time.

Cheers!
  Chris


On Tue, May 22, 2018 at 8:28 AM, Tim Käck <[hidden email]> wrote:

> Hi all,
>
> First let me explain how my application works (and excuse this lengthy post):
>
> It processes a log file and create TKEntry’s (i.e. smalltalk objects representing the log entry).
> Since several entries contains the same values for several fields (example: paths, userIds etc) I have LRUCache infront of a MaSharedObject broker (Thanks Chris for this!).
> The reasons are two fold:
> 1) The LRUCache makes it faster to lookup keys when I have a lot of log data on the heap
> 2) The MaSharedObject broker prevents having the same value stored many times and by that use up valuable heap usage.
> Lets call this TKMaLRU<mumble>Cache - and there are several of these depending on their scope and content.
> The <mumble> part is because they have some specialities to what they store (literals, dates, times etc).
> These TKEntry’s are stored in an OrderedCollection but now when I am “magma:fying” the application, I have decided to use a MagmaCollection (because we talk about 4-5 million entries).
> Let call this a TKLog.
> The TKLog is stored in a TKProject, which in turn is stored in my root of my Magma database - in my grand TKProjects singleton object.
>
> So, to summarise the architecture somewhat:
>
> TKProjects contain TKProject’s, that contain TKLog’s, where each contains a collection of TKEntry’s.
> The TKProjects contain some TKMaLRU<mumble>Caches that are global and shared between projects.
> The TKProject contain some TKMaLRU<mumble>Caches that are shared across all TKLogs
> The  TKLog contains some TKMaLRU<mumble>Caches that are only relevant for that specific log.
>
> Here is how the commit mechanism work:
> 1) MagmaSession begin
> 2) Read and process log entry into a TKLog entry (i.e. also lookup or add values to above mentioned MaSharedObject brokered caches)
> 3) repeat step 2) X times
> 4) MagmaSession commitAndBegin.
> 5) Repeat 2 - 4 until all log has been processed.
>
> Now, in step 3 - I have tested to commit every TKEntry, every 50 TKEntry and every 150 TKEntry - but the results are somewhat disappointing.
> Looking at the client side report - it seems to suggest that "Time to Download Full Response” is taking longer and longer:
>
>  Time to Download Full Response
>
>       End Time                              Name                                         Count         min         max         sum        mean        last
>       --------                              ----                                         -----         ---         ---         ---        ----        ----
>   2018-05-08T21:24:56.756581+02:00      Time to Download Full Response                  98      1439.0      4269.0    197938.0  2019.77551        1703
>       2018-05-08T21:29:56.756864+02:00      Time to Download Full Response                  97      1575.0      3388.0    200137.0  2063.26804        1817
>       2018-05-08T21:34:56.759649+02:00      Time to Download Full Response                  96      1325.0      3533.0    197440.0  2056.66666        2039
>       2018-05-08T21:39:56.761206+02:00      Time to Download Full Response                  95      1134.0      3539.0    193899.0  2041.04210        2061
>       2018-05-08T21:44:56.764136+02:00      Time to Download Full Response                  92      1537.0      4060.0    193165.0  2099.61956        1994
>       2018-05-08T21:49:56.772738+02:00      Time to Download Full Response                  91      1672.0      4183.0    192310.0  2113.29670        1998
> <…snip…>
>       2018-05-09T07:29:57.823266+02:00      Time to Download Full Response                  71      2388.0      5202.0    213569.0  3008.01408        3175
>       2018-05-09T07:34:57.825258+02:00      Time to Download Full Response                  61      2829.0      5700.0    211781.0  3471.81967        3197
>       2018-05-09T07:39:57.825924+02:00      Time to Download Full Response                  55      2385.0      6392.0    193054.0  3510.07272        2965
>       2018-05-09T07:44:57.828316+02:00      Time to Download Full Response                  56      2839.0      6606.0    201693.0  3601.66071        3249
>       2018-05-09T07:49:57.830572+02:00      Time to Download Full Response                  62      1870.0      6230.0    207707.0  3350.11290        3543
>       2018-05-09T07:54:57.832358+02:00      Time to Download Full Response                  58      2790.0      6422.0    199911.0  3446.74137        3215
>       2018-05-09T07:59:57.834803+02:00      Time to Download Full Response                  61      2658.0      5832.0    208315.0      3415.0        3839
>       2018-05-09T08:04:57.841518+02:00      Time to Download Full Response                  60      2266.0      4742.0    199554.0      3325.9        3122
>       2018-05-09T08:09:57.842645+02:00      Time to Download Full Response                  57      2439.0      5669.0    190848.0  3348.21052        3782
>       2018-05-09T08:14:57.843325+02:00      Time to Download Full Response                  56      2183.0      5850.0    191640.0  3422.14285        3040
>       2018-05-09T08:19:57.845536+02:00      Time to Download Full Response                  56      2803.0      5767.0    196116.0  3502.07142        3225
>       2018-05-09T08:24:57.846522+02:00      Time to Download Full Response                  61      2653.0      5717.0    211016.0  3459.27868        3406
>       2018-05-09T08:29:57.9646+02:00        Time to Download Full Response                  54      2765.0      5772.0    186693.0  3457.27777        3277
>       2018-05-09T08:34:57.967164+02:00      Time to Download Full Response                  54      2970.0      5636.0    193700.0  3587.03703        3401
>       2018-05-09T08:39:57.981221+02:00      Time to Download Full Response                  55      2583.0      5411.0    197685.0  3594.27272        3607
>       2018-05-09T08:44:57.981925+02:00      Time to Download Full Response                  56      2664.0      5661.0    196917.0    3516.375        3829
>       2018-05-09T08:49:57.982564+02:00      Time to Download Full Response                  57      2854.0      6889.0    202831.0  3558.43859        3014
>       2018-05-09T08:54:57.983894+02:00      Time to Download Full Response                  63      2761.0      6190.0    214199.0  3399.98412        3226
>       2018-05-09T08:59:57.990957+02:00      Time to Download Full Response                  63      2796.0      5130.0    216577.0  3437.73015        3066
>       2018-05-09T09:04:57.992464+02:00      Time to Download Full Response                  55      2894.0      6478.0    200221.0  3640.38181        3300
>       2018-05-09T09:09:57.994927+02:00      Time to Download Full Response                  51      3173.0      5203.0    194755.0  3818.72549        3276
>       2018-05-09T09:14:57.995991+02:00      Time to Download Full Response                  55      3167.0      6066.0    205586.0  3737.92727        3432
>       2018-05-09T09:19:57.998695+02:00      Time to Download Full Response                  57      2840.0      6339.0    212726.0  3732.03508        3445
>       2018-05-09T09:24:58.010041+02:00      Time to Download Full Response                  53      2737.0      5854.0    203695.0  3843.30188        3212
>       2018-05-09T09:29:58.011472+02:00      Time to Download Full Response                  55      1893.0      5003.0    193155.0  3511.90909        3311
>       2018-05-09T09:34:58.013342+02:00      Time to Download Full Response                  53      2573.0      6791.0    194739.0  3674.32075        3588
>       2018-05-09T09:39:58.016082+02:00      Time to Download Full Response                  55      2810.0      5476.0    195984.0  3563.34545        3254
>
> I also see a degradation over time on "Time waiting for MaCommitRequest requests"
>
>       End Time                              Name                                         Count         min         max         sum        mean        last
>       --------                              ----                                         -----         ---         ---         ---        ----        ----
>       2018-05-08T21:24:56.756627+02:00      Time waiting for MaCommitRequest requ...        98      1443.0      4291.0    199387.0  2034.56122        1706
>       2018-05-08T21:29:56.756915+02:00      Time waiting for MaCommitRequest requ...        97      1583.0      3399.0    201187.0  2074.09278        1821
>       2018-05-08T21:34:56.759692+02:00      Time waiting for MaCommitRequest requ...        96      1333.0      3538.0    199918.0  2082.47916        2047
>       2018-05-08T21:39:56.761261+02:00      Time waiting for MaCommitRequest requ...        95      1143.0      3547.0    196105.0  2064.26315        2070
>       2018-05-08T21:44:56.764175+02:00      Time waiting for MaCommitRequest requ...        92      1540.0      4449.0    195004.0  2119.60869        1999
>       2018-05-08T21:49:56.77278+02:00       Time waiting for MaCommitRequest requ...        91      1678.0      4190.0    194502.0  2137.38461        2008
>       2018-05-08T21:54:56.774304+02:00      Time waiting for MaCommitRequest requ...        89      1389.0      4015.0    190014.0  2134.98876        1858
>       2018-05-08T21:59:56.777215+02:00      Time waiting for MaCommitRequest requ...        88      1731.0      4576.0    191360.0  2174.54545        2020
>       2018-05-08T22:04:56.779026+02:00      Time waiting for MaCommitRequest requ...        88      1705.0      3551.0    191334.0     2174.25        1859
> <…snip…>
>       2018-05-09T09:09:57.99499+02:00       Time waiting for MaCommitRequest requ...        51      3185.0      5210.0    195250.0  3828.43137        3286
>       2018-05-09T09:14:57.996035+02:00      Time waiting for MaCommitRequest requ...        55      3174.0      6073.0    206588.0  3756.14545        3898
>       2018-05-09T09:19:57.998772+02:00      Time waiting for MaCommitRequest requ...        57      2849.0      6346.0    213850.0  3751.75438        3455
>       2018-05-09T09:24:58.010092+02:00      Time waiting for MaCommitRequest requ...        53      2740.0      5856.0    205350.0  3874.52830        3216
>       2018-05-09T09:29:58.011514+02:00      Time waiting for MaCommitRequest requ...        55      1896.0      5259.0    196371.0  3570.38181        3315
>       2018-05-09T09:34:58.013386+02:00      Time waiting for MaCommitRequest requ...        53      2582.0      6796.0    196473.0  3707.03773        3593
>       2018-05-09T09:39:58.016132+02:00      Time waiting for MaCommitRequest requ...        55      2818.0      5485.0    196926.0  3580.47272        3268
>       nil                                   Time waiting for MaCommitRequest requ...        10      3125.0      4748.0     37967.0      3796.7        3511
>
>
> This has led me to implement isImmutableInMagma on the TKEntry to keep the readSet low, but now I have run into a problem:
> The caches are based on Dictionaries and they can grow to 20000-30000 (or more) entries.
>
> After all of this, my question is simple:
> Should I use a WriteBarrier capable object instead (I could not find a WbDictionary unfortunately)?
> In fact - how does WriteBarrier work in the context of Magma? I have set  allowWriteBarrier: true when connecting to the Magma DB.
>
> Cheers,
> Tim
>
>
> _______________________________________________
> Magma mailing list
> [hidden email]
> http://lists.squeakfoundation.org/mailman/listinfo/magma
_______________________________________________
Magma mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/magma