Hi all,
First, let me explain how my application works (and excuse this lengthy post): it processes a log file and creates TKEntry's, i.e. Smalltalk objects representing the log entries. Since several entries contain the same values for several fields (for example paths, userIds, etc.), I have an LRUCache in front of a MaSharedObject broker (thanks Chris for this!). The reasons are twofold:

1) The LRUCache makes it faster to look up keys when I have a lot of log data on the heap.
2) The MaSharedObject broker prevents the same value from being stored many times and thereby using up valuable heap space.

Let's call this TKMaLRU<mumble>Cache - there are several of these depending on their scope and content. The <mumble> part is because they each have some specialities in what they store (literals, dates, times, etc.).

These TKEntry's are stored in an OrderedCollection, but now that I am "magma:fying" the application I have decided to use a MagmaCollection (because we are talking about 4-5 million entries). Let's call this a TKLog. The TKLog is stored in a TKProject, which in turn is stored in the root of my Magma database - in my grand TKProjects singleton object.

So, to summarise the architecture somewhat:

TKProjects contains TKProject's, which contain TKLog's, each of which contains a collection of TKEntry's.
The TKProjects contains some TKMaLRU<mumble>Caches that are global and shared between projects.
The TKProject contains some TKMaLRU<mumble>Caches that are shared across all TKLogs.
The TKLog contains some TKMaLRU<mumble>Caches that are only relevant for that specific log.

Here is how the commit mechanism works:
1) MagmaSession begin
2) Read and process a log entry into a TKLog entry (i.e. also look up or add values to the above-mentioned MaSharedObject-brokered caches)
3) Repeat step 2) X times
4) MagmaSession commitAndBegin
5) Repeat 2-4 until the whole log has been processed.

Now, regarding step 3: I have tested committing after every TKEntry, every 50 TKEntry's and every 150 TKEntry's - but the results are somewhat disappointing.
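In code, the loop above is roughly the following (just a sketch - `session`, `tkLog`, `logLines`, `batchSize`, `#parseEntry:` and `#addEntry:` are illustrative names standing in for my real code; only #begin, #commitAndBegin and #commit are the actual MagmaSession messages):

```smalltalk
"Sketch of steps 1-5 above. session is the connected MagmaSession;
 batchSize is the X I have been varying (1, 50, 150)."
| count |
count := 0.
session begin.                                "step 1"
logLines do: [ :line |
    tkLog addEntry: (self parseEntry: line).  "step 2: also touches the caches"
    count := count + 1.
    count \\ batchSize = 0
        ifTrue: [ session commitAndBegin ] ]. "step 4"
session commit.                               "final partial batch"
```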
Looking at the client-side report, it seems to suggest that "Time to Download Full Response" is taking longer and longer:

Time to Download Full Response

End Time                          Name                            Count  min     max     sum       mean        last
--------                          ----                            -----  ---     ---     ---       ----        ----
2018-05-08T21:24:56.756581+02:00  Time to Download Full Response  98  1439.0  4269.0  197938.0  2019.77551  1703
2018-05-08T21:29:56.756864+02:00  Time to Download Full Response  97  1575.0  3388.0  200137.0  2063.26804  1817
2018-05-08T21:34:56.759649+02:00  Time to Download Full Response  96  1325.0  3533.0  197440.0  2056.66666  2039
2018-05-08T21:39:56.761206+02:00  Time to Download Full Response  95  1134.0  3539.0  193899.0  2041.04210  2061
2018-05-08T21:44:56.764136+02:00  Time to Download Full Response  92  1537.0  4060.0  193165.0  2099.61956  1994
2018-05-08T21:49:56.772738+02:00  Time to Download Full Response  91  1672.0  4183.0  192310.0  2113.29670  1998
<…snip…>
2018-05-09T07:29:57.823266+02:00  Time to Download Full Response  71  2388.0  5202.0  213569.0  3008.01408  3175
2018-05-09T07:34:57.825258+02:00  Time to Download Full Response  61  2829.0  5700.0  211781.0  3471.81967  3197
2018-05-09T07:39:57.825924+02:00  Time to Download Full Response  55  2385.0  6392.0  193054.0  3510.07272  2965
2018-05-09T07:44:57.828316+02:00  Time to Download Full Response  56  2839.0  6606.0  201693.0  3601.66071  3249
2018-05-09T07:49:57.830572+02:00  Time to Download Full Response  62  1870.0  6230.0  207707.0  3350.11290  3543
2018-05-09T07:54:57.832358+02:00  Time to Download Full Response  58  2790.0  6422.0  199911.0  3446.74137  3215
2018-05-09T07:59:57.834803+02:00  Time to Download Full Response  61  2658.0  5832.0  208315.0  3415.0      3839
2018-05-09T08:04:57.841518+02:00  Time to Download Full Response  60  2266.0  4742.0  199554.0  3325.9      3122
2018-05-09T08:09:57.842645+02:00  Time to Download Full Response  57  2439.0  5669.0  190848.0  3348.21052  3782
2018-05-09T08:14:57.843325+02:00  Time to Download Full Response  56  2183.0  5850.0  191640.0  3422.14285  3040
2018-05-09T08:19:57.845536+02:00  Time to Download Full Response  56  2803.0  5767.0  196116.0  3502.07142  3225
2018-05-09T08:24:57.846522+02:00  Time to Download Full Response  61  2653.0  5717.0  211016.0  3459.27868  3406
2018-05-09T08:29:57.9646+02:00    Time to Download Full Response  54  2765.0  5772.0  186693.0  3457.27777  3277
2018-05-09T08:34:57.967164+02:00  Time to Download Full Response  54  2970.0  5636.0  193700.0  3587.03703  3401
2018-05-09T08:39:57.981221+02:00  Time to Download Full Response  55  2583.0  5411.0  197685.0  3594.27272  3607
2018-05-09T08:44:57.981925+02:00  Time to Download Full Response  56  2664.0  5661.0  196917.0  3516.375    3829
2018-05-09T08:49:57.982564+02:00  Time to Download Full Response  57  2854.0  6889.0  202831.0  3558.43859  3014
2018-05-09T08:54:57.983894+02:00  Time to Download Full Response  63  2761.0  6190.0  214199.0  3399.98412  3226
2018-05-09T08:59:57.990957+02:00  Time to Download Full Response  63  2796.0  5130.0  216577.0  3437.73015  3066
2018-05-09T09:04:57.992464+02:00  Time to Download Full Response  55  2894.0  6478.0  200221.0  3640.38181  3300
2018-05-09T09:09:57.994927+02:00  Time to Download Full Response  51  3173.0  5203.0  194755.0  3818.72549  3276
2018-05-09T09:14:57.995991+02:00  Time to Download Full Response  55  3167.0  6066.0  205586.0  3737.92727  3432
2018-05-09T09:19:57.998695+02:00  Time to Download Full Response  57  2840.0  6339.0  212726.0  3732.03508  3445
2018-05-09T09:24:58.010041+02:00  Time to Download Full Response  53  2737.0  5854.0  203695.0  3843.30188  3212
2018-05-09T09:29:58.011472+02:00  Time to Download Full Response  55  1893.0  5003.0  193155.0  3511.90909  3311
2018-05-09T09:34:58.013342+02:00  Time to Download Full Response  53  2573.0  6791.0  194739.0  3674.32075  3588
2018-05-09T09:39:58.016082+02:00  Time to Download Full Response  55  2810.0  5476.0  195984.0  3563.34545  3254

I also see a degradation over time on "Time waiting for MaCommitRequest requests":

End Time                          Name                                      Count  min     max     sum       mean        last
--------                          ----                                      -----  ---     ---     ---       ----        ----
2018-05-08T21:24:56.756627+02:00  Time waiting for MaCommitRequest requ...  98  1443.0  4291.0  199387.0  2034.56122  1706
2018-05-08T21:29:56.756915+02:00  Time waiting for MaCommitRequest requ...  97  1583.0  3399.0  201187.0  2074.09278  1821
2018-05-08T21:34:56.759692+02:00  Time waiting for MaCommitRequest requ...  96  1333.0  3538.0  199918.0  2082.47916  2047
2018-05-08T21:39:56.761261+02:00  Time waiting for MaCommitRequest requ...  95  1143.0  3547.0  196105.0  2064.26315  2070
2018-05-08T21:44:56.764175+02:00  Time waiting for MaCommitRequest requ...  92  1540.0  4449.0  195004.0  2119.60869  1999
2018-05-08T21:49:56.77278+02:00   Time waiting for MaCommitRequest requ...  91  1678.0  4190.0  194502.0  2137.38461  2008
2018-05-08T21:54:56.774304+02:00  Time waiting for MaCommitRequest requ...  89  1389.0  4015.0  190014.0  2134.98876  1858
2018-05-08T21:59:56.777215+02:00  Time waiting for MaCommitRequest requ...  88  1731.0  4576.0  191360.0  2174.54545  2020
2018-05-08T22:04:56.779026+02:00  Time waiting for MaCommitRequest requ...  88  1705.0  3551.0  191334.0  2174.25     1859
<…snip…>
2018-05-09T09:09:57.99499+02:00   Time waiting for MaCommitRequest requ...  51  3185.0  5210.0  195250.0  3828.43137  3286
2018-05-09T09:14:57.996035+02:00  Time waiting for MaCommitRequest requ...  55  3174.0  6073.0  206588.0  3756.14545  3898
2018-05-09T09:19:57.998772+02:00  Time waiting for MaCommitRequest requ...  57  2849.0  6346.0  213850.0  3751.75438  3455
2018-05-09T09:24:58.010092+02:00  Time waiting for MaCommitRequest requ...  53  2740.0  5856.0  205350.0  3874.52830  3216
2018-05-09T09:29:58.011514+02:00  Time waiting for MaCommitRequest requ...  55  1896.0  5259.0  196371.0  3570.38181  3315
2018-05-09T09:34:58.013386+02:00  Time waiting for MaCommitRequest requ...  53  2582.0  6796.0  196473.0  3707.03773  3593
2018-05-09T09:39:58.016132+02:00  Time waiting for MaCommitRequest requ...  55  2818.0  5485.0  196926.0  3580.47272  3268
nil                               Time waiting for MaCommitRequest requ...  10  3125.0  4748.0  37967.0   3796.7      3511

This has led me to implement isImmutableInMagma on the TKEntry to keep the readSet low, but now I have run into a problem: the caches are based on Dictionaries, and they can grow to 20000-30000 (or more) entries.

After all of this, my question is simple: should I use a WriteBarrier-capable object instead? (I could not find a WbDictionary, unfortunately.) In fact, how does WriteBarrier work in the context of Magma? I have set allowWriteBarrier: true when connecting to the Magma DB.

Cheers,
Tim

_______________________________________________
Magma mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/magma
Hi!
First, thanks for the great email, the detail, and for your deep exploration and usage of the tools provided by Magma in the first place - I'm very impressed!

To cut to the chase about WriteBarrier: yes. WriteBarrier is meant for stable bulk loads that you want to perform faster. The trade-off of using it is that, if you need to do debugging, it's harder to access the instances via, e.g., TKEntry allInstances select: [ : each | .... ], because they're not instances of TKEntry, but of TKEntry*.

Is this database shared with other real-time users? If not, I would experiment with committing at least 500 or 1000 or 2500 or 5000 or 10000 per commit. There is a logarithmic rate of degradation when adding new objects, proportional to the number of unique objects already in a MagmaCollection or MagmaCollectionIndex. The rate of degradation for duplicate objects is *linear*, but it doesn't sound like your model is affected by that.

You were very smart to override #isImmutableInMagma on your TKEntry's, given how many would otherwise be clogging up the readSet unnecessarily due to their archival (unchanging) nature. Those big Dictionaries of the MaSharedObjectsBroker will still slow down the changed-detection of your commits even when you didn't add anything: their size won't have changed, and nor will their contents, but the only way Magma can know that is by enumerating them. And if they DID change, then they have to be committed. So there is a trade-off there, and this is something I would like to think about further -- a few years ago I made an advanced feature that could help with this, but I need to research it.

Besides the sizes of those Dictionary caches, you'd want to ensure the objects IN them are not in the readSet, if possible. WriteBarrier can definitely help with that for objects which are not #isImmutableInMagma.
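For anyone following along, the override being discussed is just an instance-side method on the domain class (a minimal sketch; TKEntry is Tim's class, and the answer should be true only for objects that truly never change after being committed):

```smalltalk
"Sketch: answering true tells Magma this object is archival, so it can be
 left out of the readSet instead of being re-checked on every commit."
TKEntry >> isImmutableInMagma
    ^ true
```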
Some of the lower-level objects can't take advantage of the WriteBarrier (see #maAllowsWriteBarrier), but normal objects like the ones from your domain can. WbDictionary, et al. no longer apply after Closures were redone a few years ago, which disallowed Arrays from being made uncompact - something required for the WriteBarrier to be able to swap its class via #primitiveChangeClassTo:.

I can help you optimize further if you include a MessageTally profile dump next time.

Cheers!
  Chris

On Tue, May 22, 2018 at 8:28 AM, Tim Käck <[hidden email]> wrote:
> <…snip…>