re: Next steps


re: Next steps

Chris Muller
Hey Göran, I don't have the context you have into your
domain, nor experience with Seaside.  Nevertheless, my
strong intuition suggests we should step back and
consider again having one Magma session per Seaside
session.

I am not sure whether you are trying to optimize for
speed or memory consumption, but I think that this 1:1
approach is good for both.

> > Still, it is probably good to try to keep the
> readSet
> > as small as possible.
>
> Well, I find this recommendation slightly odd *in
> general*. I understand
> how it makes each transaction faster - but on the
> other hand you lose
> the caching benefit. For example, in this app I want
> a significant part
> of the model to be cached at all times - the meta
> model. It will not be
> large (so I can afford to cache it, even in several
> sessions), but it
> will be heavily used so I don't want to end up
> reading it over and over.

It's ok.  Go ahead and cache your meta-model in each
session if it's not so big, but seriously let
everything else be read dynamically as-needed.  Let
every session have only a very small portion of the
domain cached and keep it small via #stubOut:.
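
In code, the shape I have in mind is roughly this (a sketch only -
the host/port and the #metaModel accessor are made-up illustrations,
not your actual model):

  | session root metaModel |
  session := MagmaSession
      hostAddress: (NetNameResolver addressForName: 'localhost')
      port: 51969.
  session connectAs: 'goran'.
  root := session root.
  "Keep a strong reference to the meta-model so it stays cached."
  metaModel := root at: #metaModel.
  "Everything else is read lazily; explicitly shrink big branches."
  session stubOut: (root at: #cases).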

Reads (proxy materializations) are one of the fastest
things Magma does.  You are supposed to *enjoy* the
transparency, not have to worry about such complex
ways to circumvent it.

ReadStrategies and #stubOut: are intended to optimize
read-performance and memory consumption, respectively.
 If these are not sufficient, and assuming the
uni-session approach (all Seaside sessions share one
MagmaSession and one copy of the domain) is not
either, *then* these other complex alternatives should
be considered.  It's not easy for me to say, but I have
to face the truth: if the intended transparency of
Magma cannot be enjoyed, then that opens up lots of
other options that are all equally less-transparent.

> As a reminder - the reason for my discussion on this
> topic is that I
> feel that the "simplistic approach" of simply using
> a single
> MagmaSession for each Seaside session doesn't scale
> that well. I am
> looking at a possible 100 concurrent users (in the
> end, not from the
> start) using an object model with at least say 50000
> cases - which of
> course each consists of a number of other objects.
> Sure, I can use some
> kind of multi-image clustering with round-robin
> Apache in front etc, but
> still.

Well, it may scale better than you think.  Peak
(single-object) read rate is 3149 per second on my
slow laptop; reading one thousand objects at a time
runs at 7.15 such reads per second (see
http://minnow.cc.gatech.edu/squeak/5606 or run your
own MagmaBenchmarker).
So if you have 1000 objects in a Case and 100 users
all requesting a Case at exactly the same time, the
longest delay would be ~10 seconds (assuming you're
not serving with my slow, circa 2004 laptop).
Optimizing the ReadStrategy for a Case would allow
better performance.
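
(Spelling out the arithmetic: at 7.15 thousand-object reads per
second, the 100th simultaneous request finishes after about
100 / 7.15 = ~14 seconds on that 2004 laptop, so ~10 seconds
assumes only modestly faster hardware.)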

Any single-image Seaside server where you want to
cache a whole bunch of stuff is going to have this
sort of scalability issue, no matter what DB is used,
right?  Remember, you could use the many:1 approach
(all Seaside sessions sharing one Magma session and a
single copy of the domain); how does this differ from
any other solution?

The 1:1 design, OTOH, is what makes multi-image
clustering possible, so from that aspect risk is
reduced.  That's the one I would try very hard to make
work before abandoning TSTTCPW.

> As a sidenote, GemStone has a "shared page cache" so
> that multiple
> sessions actually share a cache of objects in ram.

That's in the server-side GemStone-Smalltalk image
memory though, isn't it?  Magma doesn't do that.

> Could we possibly
> contemplate some way of having sessions share a
> cache? Yes, complex
> stuff I know. Btw, could you perhaps explain how the
> caching works
> today? Do you have some kind of low level cache on
> the file level for
> example?

I'm open to ideas.  The caching is very simple right
now; it just uses WeakIdentityDictionaries to hold read
objects.
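
Conceptually something like this (a sketch; the class Squeak
actually ships is WeakIdentityKeyDictionary, and the oid value is
illustrative):

  | readObjects anObject |
  readObjects := WeakIdentityKeyDictionary new.
  "Remember each materialized object by identity.  Keys are weak,
  so the entry disappears once nothing else references the object."
  readObjects at: anObject put: 12345.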

> > A commit is pretty cheap with a small readSet.
> With a
> > large readSet, WriteBarrier will definitely
> improve it
> > dramatically.
>
> I kinda guessed. Otherwise you keep an original
> duplicate of all cached
> objects, right? So WriteBarrier also improves on
> memory consumption I
> guess.

No to the first question, yes to the second (IIRC).
It doesn't keep an original "duplicate", just the
original buffer that was read.

> Very good. :) And also - do you have any clue on how
> the performance is
> affected by using the various security parts?

Authorizing every request seems to have imposed about
a 10% penalty.  #cryptInFiles is hard to measure since
writes occur in the background anyway.
#cryptOnNetwork definitely slows down network
transmissions considerably; only use it if you have
to.
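
Toggling them is the usual sort of preference setting (a sketch
only - treat these setter spellings as hypothetical, derived from
the preference names above):

  "Hypothetical configuration; the exact selectors may differ."
  repository cryptInFiles: true.     "encrypt the data files on disk"
  session cryptOnNetwork: false.     "skip the costly wire encryption"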

Regards,
  Chris

re: Next steps

Göran Krampe
Hi Chris!

First - thanks for taking time to answer. :)

Chris Muller <[hidden email]> wrote:
> Hey Göran, I don't have the context you have into your
> domain, nor experience with Seaside.  Nevertheless, my
> strong intuition suggests we should step back and
> consider again having one Magma session per Seaside
> session.

Ok, well, I can probably do that - I just need to be sure that I feel I
have "ways out" if it turns bad. Call it "precautionary investigations".
Since I am putting myself (and Magma/Seaside/Squeak) on the line here I
don't want to fail.

> I am not sure whether you are trying to optimize for
> speed or memory consumption, but I think that this 1:1
> approach is good for both.

Not optimizing at the moment - mainly "dabbling" in my head. But both
concerns are valid, even though memory consumption was my main worry.

> > > Still, it is probably good to try to keep the
> > readSet
> > > as small as possible.
> >
> > Well, I find this recommendation slightly odd *in
> > general*. I understand
> > how it makes each transaction faster - but on the
> > other hand you lose
> > the caching benefit. For example, in this app I want
> > a significant part
> > of the model to be cached at all times - the meta
> > model. It will not be
> > large (so I can afford to cache it, even in several
> > sessions), but it
> > will be heavily used so I don't want to end up
> > reading it over and over.
>
> It's ok.  Go ahead and cache your meta-model in each
> session if it's not so big, but seriously let
> everything else be read dynamically as-needed.  Let
> every session have only a very small portion of the
> domain cached and keep it small via #stubOut:.
>
> Reads (proxy materializations) are one of the fastest
> things Magma does.

Ok, I assume I might still be avoiding actual file access - given
the OS's file-level caching.

> You are supposed to *enjoy* the
> transparency, not have to worry about such complex
> ways to circumvent it.

I am enjoying it! You may recall I am an old GemStone dog - I know how
to enjoy that. :)
 
> ReadStrategies and #stubOut: are intended to optimize
> read-performance and memory consumption, respectively.

I understand them - the first is similar to GemStone, the second is not,
since it is automatic in GemStone - but whatever.

> If these are not sufficient, and assuming the
> uni-session approach (all Seaside sessions share one
> MagmaSession and one copy of the domain) is not
> either, *then* these other complex alternatives should
> be considered.  It's not easy for me to say, but I have
> to face the truth: if the intended transparency of
> Magma cannot be enjoyed, then that opens up lots of
> other options that are all equally less-transparent.

Ok. One huge benefit with using 1-1 instead of Cees' ConnectionPool is
that my Seaside components can hold onto the persistent objects.
Otherwise they can't, because the next request will end up using a
different session.

And I really wonder why I haven't realized that until now. ;) Sigh.
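
Concretely, a component can simply keep the domain object in an
instance variable across requests (a sketch - CaseView and its
accessors are merely illustrative names from my app):

  WAComponent subclass: #CaseView
      instanceVariableNames: 'case'
      classVariableNames: ''
      poolDictionaries: ''
      category: 'Q2-UI'.

  CaseView >> renderContentOn: html
      "case stays the very same persistent object from request to
      request, because the whole Seaside session sticks to one
      MagmaSession."
      html heading: case title.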
 

> > As a reminder - the reason for my discussion on this
> > topic is that I
> > feel that the "simplistic approach" of simply using
> > a single
> > MagmaSession for each Seaside session doesn't scale
> > that well. I am
> > looking at a possible 100 concurrent users (in the
> > end, not from the
> > start) using an object model with at least say 50000
> > cases - which of
> > course each consists of a number of other objects.
> > Sure, I can use some
> > kind of multi-image clustering with round-robin
> > Apache in front etc, but
> > still.
>
> Well, it may scale better than you think.  Peak
> (single-object) read rate is 3149 per second on my
> slow laptop;

Are we talking cold cache including actual file access? And how does the
size of the files on disk affect that?

> reading one thousand objects at a time runs at 7.15
> such reads per second (see
> http://minnow.cc.gatech.edu/squeak/5606 or run your
> own MagmaBenchmarker).

Not sure I grokked that sentence. :)

> So if you have 1000 objects in a Case and 100 users
> all requesting a Case at exactly the same time, the
> longest delay would be ~10 seconds (assuming you're
> not serving with my slow, circa 2004 laptop).

Mmm.

> Optimizing the ReadStrategy for a Case would allow
> better performance.

That I probably will do when the app settles.
 
> Any single-image Seaside server where you want to
> cache a whole bunch of stuff is going to have this
> sort of scalability issue, no matter what DB is used,
> right?  Remember, you could use the many:1 approach
> (all Seaside sessions sharing one Magma session and a
> single copy of the domain); how does this differ from
> any other solution?

Eh... not sure I follow the logic, but never mind. :)

> The 1:1 design, OTOH, is what makes multi-image
> clustering possible, so from that aspect risk is
> reduced.  That's the one I would try very hard to make
> work before abandoning TSTTCPW.

Good point.
 
> > As a sidenote, GemStone has a "shared page cache" so
> > that multiple
> > sessions actually share a cache of objects in ram.
>
> That's in the server-side GemStone-Smalltalk image
> memory though, isn't it?  Magma doesn't do that.

The "server side" GemStone image can run anywhere - so the closest
counterpart in Magma is actually the client image IMHO.

> > Could we possibly
> > contemplate some way of having sessions share a
> > cache? Yes, complex
> > stuff I know. Btw, could you perhaps explain how the
> > caching works
> > today? Do you have some kind of low level cache on
> > the file level for
> > example?
>
> I'm open to ideas.  The caching is very simple right
> now; it just uses WeakIdentityDictionaries to hold read
> objects.

And one per session I assume? No cache on any lower level, like on top
of the file code?

> > > A commit is pretty cheap with a small readSet.
> > With a
> > > large readSet, WriteBarrier will definitely
> > improve it
> > > dramatically.
> >
> > I kinda guessed. Otherwise you keep an original
> > duplicate of all cached
> > objects, right? So WriteBarrier also improves on
> > memory consumption I
> > guess.
>
> No to the first question, yes to the second (IIRC).
> It doesn't keep an original "duplicate", just the
> original buffer that was read.

Ah, ok. But you don't need that when using WriteBarrier, right?

> > Very good. :) And also - do you have any clue on how
> > the performance is
> > affected by using the various security parts?
>
> Authorizing every request seems to have imposed about
> a 10% penalty.  #cryptInFiles is hard to measure since
> writes occur in the background anyway.
> #cryptOnNetwork definitely slows down network
> transmissions considerably; only use it if you have
> to.
>
> Regards,
>   Chris

regards, Göran

Re: Next steps

Cees De Groot
On 1/13/06, [hidden email] <[hidden email]> wrote:
> Ok. One huge benefit with using 1-1 instead of Cees' ConnectionPool is
> that my Seaside components can hold onto the persistent objects.
> Otherwise they can't, because the next request will end up using a
> different session.
>
Be glad there's Magma - in OmniBase, you can't even transport your
objects from transaction to transaction...

> Are we talking cold cache including actual file access? And how does the
> size of the files on disk affect that?
>
Hey, run your own benchmarks. They're just as invalid as anyone else's... :-)

One premature optimization you could do is to build a sort of internal
service to access base data (hmm... I only know the German/Dutch word
here - 'Stammdaten', i.e. master data) instead of talking to Magma
directly. As long as performance is not an issue, do nothing. As soon as
performance becomes an issue, change that interface to use a single
separate Magma session (and maybe even a separate Magma image, whatever).
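
Something along these lines, maybe (a sketch - every name here is
made up):

  Object subclass: #BaseDataService
      instanceVariableNames: 'session'
      classVariableNames: ''
      poolDictionaries: ''
      category: 'Q2-Services'.

  BaseDataService >> metaModel
      "Today: delegate to whatever session the caller already uses.
      Later this one method can switch to a dedicated MagmaSession
      (or even a separate image) without touching any caller."
      ^ session root at: #metaModel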

That's probably all the premature optimization I would do at this moment.

How many users will use this app, anyway?

Re: Next steps

Göran Krampe
Hi!

Cees De Groot <[hidden email]> wrote:
> On 1/13/06, [hidden email] <[hidden email]> wrote:
> > Ok. One huge benefit with using 1-1 instead of Cees' ConnectionPool is
> > that my Seaside components can hold onto the persistent objects.
> > Otherwise they can't, because the next request will end up using a
> > different session.
>
> Be glad there's Magma - in OmniBase, you can't even transport your
> objects from transaction to transaction...

:)
 
> > Are we talking cold cache including actual file access? And how does the
> > size of the files on disk affect that?
>
> Hey, run your own benchmarks. They're just as invalid as anyone else's... :-)

Yeah, well - no time for that right now. I was just trying to pick
Chris' brain a bit.

> One premature optimization you could do is to build a sort of internal
> service to access base data (hmm... I only know the German/Dutch word
> here - 'Stammdaten', i.e. master data) instead of talking to Magma
> directly. As long as performance is not an issue, do nothing. As soon as
> performance becomes an issue, change that interface to use a single
> separate Magma session (and maybe even a separate Magma image, whatever).
>
> That's probably all the premature optimization I would do at this moment.

Might be an idea. But I am actually not going to do even that until
needed. :)

Anyway, I just readapted my Q2Session to use 1-1, but it still uses the
pool. At least that preserves the reuse of the cached model when people
log in and out. ;)
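
For the record, the shape of it (a sketch; the pool plumbing is
omitted):

  WASession subclass: #Q2Session
      instanceVariableNames: 'magmaSession'
      classVariableNames: ''
      poolDictionaries: ''
      category: 'Q2-Seaside'.

  Q2Session >> magmaSession
      "One MagmaSession per Seaside session - taken from the pool at
      login and returned at logout, so the cached model gets reused."
      ^ magmaSession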

> How many users will use this app, anyway?

350+ and perhaps up to 100 concurrently. I am busy building domain
objects and UIs right now; I just want to be "prepared", especially if
the issue turns up in discussions here.

regards, Göran

Re: Next steps

Brent Pinkney
In reply to this post by Göran Krampe
>
> Ok, well, I can probably do that - I just need to be sure that I feel I
> have "ways out" if it turns bad. Call it "precautionary investigations".
> Since I am putting myself (and Magma/Seaside/Squeak) on the line here I
> don't want to fail.
>

Hi, I am finally back from my holiday.

We used Magma + Seaside to demonstrate a configuration management application
for 'the biggest mobile operator in the world'.

We simulated a lot of users (>50) doing pretty complex things to graphs of
tens of thousands of objects. The MagmaCollections meant that only a small
subset was materialised at any one time.

We used 1:1 Seaside -> MagmaSession.

The slowest thing was allocating a new MagmaSession, as it has to check the
image for compatibility. I have some code somewhere which pre-allocates a
session - too trivial to submit.
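
From memory it was something like this (a sketch, so treat the
details loosely):

  | spares mySession |
  spares := SharedQueue new.
  "Background process keeps one freshly connected session on standby."
  [[ true ] whileTrue: [
      spares isEmpty ifTrue: [
          | s |
          s := MagmaSession
              hostAddress: (NetNameResolver addressForName: 'dbhost')
              port: 51969.
          s connectAs: 'seaside'.
          spares nextPut: s ].
      (Delay forSeconds: 1) wait ]] fork.

  "A new Seaside session then grabs the pre-warmed one without waiting:"
  mySession := spares next.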

As for Apache round-robin clustering, we had 5 computers reading from a
single Magma server on a separate machine (not Seaside). We never managed
to saturate the server.

Let me know if you need help. I have promised Chris I will finish Lava
this year!

Brent