>> What made you decide to use a session pool rather than have a magma
>> session per Seaside session? Is any of the stuff you've built to marry...

> Currently we have a compromise - we use one magma session per Seaside
> session, but it is allocated from a pool. This means we cut out the
> session creation time, but more importantly - the session is "hot" so we
> keep the cached objects of the session.
>
> But the problem with using one Magma session per Seaside session (which
> thus remains for us) is that if you have a large persistent domain model
> AND you wish to cache quite a bit of it - then you get:
>
>   numberOfObjectsInRAM = numberOfSeasideSessions * objectsCachedPerSession
>
> So if you have 100 concurrent sessions and 10000 objects cached per
> session you get a million objects. Ouch.

This is just a physical memory ouch, right? An internal structure of only
10000 objects cached per session will keep each session from getting bogged
down with a huge dictionary cache, not to mention being more supportive of
concurrent processing. I really think caching 100-thousand objects in one
shared session would hurt worse..

Let us clarify that this is a general web programming ouch or, more
generically, a "three-tier" ouch, rather than anything specific to using
Magma. No matter what DB is used, you have to choose to share the model
between sessions or let each session work on its own copy of the model (or
some hybrid of the two). Sharing the model requires thread-safety
throughout, not to mention throwing out commit-conflict detection (one
client's changes affect the db transactions of other clients) and
consistent db views.. Scary!

Has anyone tried the suggested approach to scaling with Magma: using
multiple images, CPUs, servers? This permits the simple 1:1 application
architecture, is ultimately more scalable and probably more economical in
the end, because cost of h/w < cost of complicated s/w architectures..

Programming to the simple one-session-per-web-session model permits your
app to scale by simply adding more hardware with absolutely _no changes_ to
the code. In fact, the only thing you have to do is go into each Seaside
configuration panel and point the "Magma DB location" to the remotely
hosted database rather than a locally-hosted one. This has been
demonstrated via the "Magma Seaside" demo.

And you don't necessarily need additional hardware to at least *try* this
approach; in fact, just multiple images on the same machine would prove its
viability and leverage any multi-core abilities of that machine in the
process. I hope you will at least try it out and report back how it went..

Regards,
  Chris
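A rough sketch of the pooled-session compromise described above - the pool
class, its instance variables and selectors are invented for illustration;
only MagmaSession and #abort are actual Magma names referenced in this
thread:

MagmaSessionPool >> checkOut
    "Answer a warm, already-connected MagmaSession, reusing its object
     cache; newConnectedSession (hypothetical) would connect a fresh one."
    ^ idleSessions isEmpty
        ifTrue: [self newConnectedSession]
        ifFalse: [idleSessions removeLast]

MagmaSessionPool >> checkIn: aMagmaSession
    "Called when the Seaside session expires; keep the Magma session hot
     for the next web session instead of disconnecting it."
    aMagmaSession abort.
    idleSessions addLast: aMagmaSession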
Hi!
Chris Muller <[hidden email]> wrote:
> This is just a physical memory ouch, right?

Yes.

> An internal structure of only 10000 objects cached per session will keep
> each session from getting bogged down with a huge dictionary cache, not
> to mention being more supportive of concurrent processing. I really think
> caching 100-thousand objects in one shared session would hurt worse..

Ok. But again, 10000 was just a number out of thin air. Perhaps I want
100000 - I don't know. But I do know that we will need to be able to
support about 100 concurrent users.

> Let us clarify that this is a general web programming ouch or, more
> generically, a "three-tier" ouch, rather than anything specific to using
> Magma. No matter what DB is used, you have to choose to share the model
> between sessions or let each session work on its own copy of the model
> (or some hybrid of the two).

Well, I presume you could *in theory* have some kind of copy-on-write
mechanism, thus sharing non-modified objects and still maintaining the
principle of different sessions keeping their own logical view. Doesn't
GemStone use shadow pages in some kind of copy-on-write scheme, for
example? Not sure, my memory may be wrong.

> Sharing the model requires thread-safety throughout, not to mention
> throwing out commit-conflict detection (one client's changes affect the
> db transactions of other clients) and consistent db views.. Scary!

Well, the idea was to *not* perform modifications in the shared Magma
session. They would be performed in separate allocated sessions. And AFAIK
thread safety is not an issue in a read-only model. And the same goes for
conflict detection etc - it should still work fine. But this approach was
more like a thought experiment in how we could make the RAM requirement,
say, 50 times lower. :)

> Has anyone tried the suggested approach to scaling with Magma: using
> multiple images, CPUs, servers? This permits the simple 1:1 application
> architecture, is ultimately more scalable and probably more economical in
> the end, because cost of h/w < cost of complicated s/w architectures..

Well, we probably will have to. But we are not there yet. ;)

> Programming to the simple one-session-per-web-session model permits your
> app to scale by simply adding more hardware with absolutely _no changes_
> to the code. In fact, the only thing you have to do is go into each
> Seaside configuration panel and point the "Magma DB location" to the
> remotely hosted database rather than a locally-hosted one. This has been
> demonstrated via the "Magma Seaside" demo.
> And you don't necessarily need additional hardware to at least *try* this
> approach; in fact, just multiple images on the same machine would prove
> its viability and leverage any multi-core abilities of that machine in
> the process. I hope you will at least try it out and report back how it
> went..

Sure, but we don't intend to make any experiments in session management at
this point. I was just "airing a concern" I have. :) We will see.

Btw, the approach of wrapping each request in a commit block (or as we do -
just an abort before performing the request, since we run modifications to
the model in separately started transactions) has a noticeable "feel"
penalty.

Seaside typically does a redirect, so each "click" will result in two http
requests - each doing an abort. And even if nothing at all has changed in
Magma (and we are still running Magma in the same image, so there is no
roundtrip involved) it gives a sluggish feeling. With a cheap trick (which
will not work if we go multi-image) - keeping a transaction counter in the
image - we can avoid making the aborts if we know that there have been no
transactions since the last abort. This improved the "feel" a LOT.

I did profile the aborts, trying to figure out why they take so long even
when there are no changes, but I can't recall right now what it was.

regards, Göran
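Expressed as code, the "cheap trick" Göran describes might look like the
sketch below - the counter, the classes holding it, and every selector
except #abort are made-up names for illustration only:

"Bump a global counter whenever any session commits (single image only!)."
MyApp class >> noteCommit
    TransactionCount := self transactionCount + 1

"Before handling a request, abort only if somebody has committed since this
 web session last refreshed its Magma session."
MyPersistenceHelper >> refreshIfNeeded
    lastSeenCount = MyApp transactionCount ifTrue: [^ self].
    magmaSession abort.
    lastSeenCount := MyApp transactionCount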
> Btw, the approach of wrapping each request in a commit block (or as we
> do - just an abort before performing the request, since we run
> modifications to the model in separately started transactions) has a
> noticeable "feel" penalty.
>
> Seaside typically does a redirect, so each "click" will result in two
> http requests - each doing an abort. And even if nothing at all has
> changed in Magma (and we are still running Magma in the same image, so
> there is no roundtrip involved) it gives a sluggish feeling. With a cheap
> trick (which will not work if we go multi-image) - keeping a transaction
> counter in the image - we can avoid making the aborts if we know that
> there have been no transactions since the last abort. This improved the
> "feel" a LOT.
>
> regards, Göran

I abandoned this approach as well; committing on every request seems to
have a huge penalty when using GOODS. It seems you can't hide transactions
completely. I just added a commit method on the session that delegates to
the db, and simply call "self session commit" whenever I feel it necessary.
Response times are far better and snappier than wrapping the entire request
in a commit.
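A guess at the shape of that, assuming a custom WASession subclass holding
the GOODS connection in a 'db' instance variable; the #commit sent to the
connection stands in for whatever the GOODS client actually calls:

MyGoodsSession >> commit
    "Delegate straight to the database; callers decide when a round trip is
     actually worth it."
    db commit

Then a component commits only at the points where it matters, with
"self session commit".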
In reply to this post by Göran Krampe
> Well, the idea was to *not* perform modifications in the shared Magma
> session. They would be performed in separate allocated sessions.

Sounds interesting. How would you "find" the object(s) to be modified in
the other (mutator) session? An oid lookup, perhaps?

> Btw, the approach of wrapping each request in a commit block (or as we
> do - just an abort before performing the request, since we run
> modifications to the model in separately started transactions) has a
> noticeable "feel" penalty.

Did you set refreshPersistentObjectsEvenWhenChangedOnlyByMe: true? That can
kill abort performance; I would try really hard to leave that off.

Did you know, when you leave that option off, you can:

 - make changes to the model, outside of a transaction
 - send #begin, which will refresh only the objects which were changed by
   others, not revert your own changes
 - immediately send #commit, which will commit your changes

This satisfies the "I forgot to begin" use-case, but may also be a useful
alternative to the performance-killing
refreshPersistentObjectsEvenWhenChangedOnlyByMe mode..

> I did profile the aborts, trying to figure out why they take so long even
> when there are no changes, but I can't recall right now what it was.

If the profile shows it's in MaTransaction>>#restore then
refreshPersistentObjectsEvenWhenChangedOnlyByMe was turned on - don't do
that. Otherwise, please post the profile to the Magma list and I'll look at
it promptly.

Regards,
  Chris
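The pattern Chris describes, as a workspace-style sketch. #begin, #commit
and the refresh option are the names he gives above; whether the option is
set directly on the session like this, and the domain call, are my
assumptions:

"Leave the expensive refresh mode off."
mySession refreshPersistentObjectsEvenWhenChangedOnlyByMe: false.

"Mutate the model outside any explicit transaction..."
anAccount recordPayment: aPayment.

"...then begin (refreshing only what *others* changed) and commit at once."
mySession begin.
mySession commit.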
Could, should Magma also use Amazon S3 (http://www.amazon.com/s3) as a
storage device?

I've not thought through what it would take to optimize for it, but it
might reduce a lot of data/code/image persistency headaches.

Cheers,
Darius
Very interesting. It looks like something that Squeak itself could benefit
from, wrapped in a Stream or Flow interface.

Whether an I/O intensive application like a DB server could benefit from
that is hard to say; those servers typically want to have close (read:
quick) access to the db files, and I'm pretty sure there would be
performance challenges with remote primitive access.

It might be good for backups though..
In reply to this post by Darius Clarke
Wow, it actually provides "key" access to ByteArrays.. I didn't see that at
first. Very interesting..!
In reply to this post by Chris Muller
You should realize that S3 provides availability over consistency. It is quite possible that you can put a chunk of data, ask for it back, and get the previous version due to propagation delays across the replicated store. Great for backups, not so hot for real time usage.
In reply to this post by Chris Muller
I should also point out that the ability to make torrents makes it a great publishing mechanism.
On Aug 1, 2006, at 11:03 AM, Blanchard, Todd wrote:
> I should also point out that the ability to make torrents makes it a
> great publishing mechanism.

In the Squeak world, a more appropriate use of S3 might be as a Monticello
repository (Colin suggested this to me recently). Its architecture and
permissions model is probably just about right for source control.

Avi
In reply to this post by Blanchard, Todd
On Aug 1, 2006, at 2:00 PM, Blanchard, Todd wrote:
> You should realize that S3 provides availability over consistency. It is
> quite possible that you can put a chunk of data, ask for it back, and get
> the previous version due to propagation delays across the replicated
> store. Great for backups, not so hot for real time usage.

Ah, but it's a very good match for Monticello usage patterns. In
Monticello, each version of a package is immutable and identified by a
UUID. This means that clients don't have to coordinate when creating new
versions, and we can't get "old" versions. A version is either available or
it isn't; its content can never change.
In reply to this post by Chris Muller
Another datapoint, for those interested...
I hacked up a quick implementation of some parsing code I have in several
other languages, using GLORP. The process is single-threaded, but forked
off as a background thread that is monitored from a Seaside "control
panel". There are several somewhat structured scrapings from web pages that
I want stored on disk; this data should be approximately 1GB when the
process finishes. I wrote a lightweight proxy for GLORP that makes session
access atomic, and everything works like a charm.

What I was amazed to find was that the Squeak image, with one process
running mind you, is CPU limited! (I have tried a variety of different
priorities for the forked process, including the IO priorities.) It's been
difficult for me to figure out exactly how to count the number of message
sends (looking at the Seaside profiler, I know it's quite possible), but
looking at the Process panel seems to point the finger at GLORP,
constructing a ton of queries on-the-fly. Opening the task manager and
watching bandwidth consumption agrees... brief periods of activity followed
by pauses as my program tries to figure out what to do with the data it
pulled. The running Postgres instance, too, is sitting there at 5% CPU
usage, not breaking a sweat.

GLORP is a dream to work with. It almost makes those spurious object-access
patterns look free. :-) But, if you don't want to store a whole table in
memory and you don't want to go twiddling down the whole B-tree every time
you do an object access, you want a cursor, and I haven't quite figured out
how to get that to work...

On a side note, I achieved 10-12x the throughput with my prototype program
(written in a different language and dumping the serialized representation
to disk), and I have moved on to yet another language to finish the job.
*Sigh* One day I'll be able to use Squeak.

Jeremy
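For reference, the forking Jeremy describes is roughly this in Squeak - the
importer object and its selectors are placeholders, while #forkAt: and the
scheduler priorities are standard Squeak:

"Kick off the single-threaded import as a background process..."
worker := [importer runCollectingInto: glorpSession]
    forkAt: Processor userBackgroundPriority.

"...and have the Seaside 'control panel' component poll it on each render:"
renderContentOn: html
    html text: importer progressDescription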
Jeremy Shute wrote:
> GLORP is a dream to work with. It almost makes those spurious
> object-access patterns look free. :-) But, if you don't want to store a
> whole table in memory and you don't want to go twiddling down the whole
> B-tree every time you do an object access, you want a cursor, and I
> haven't quite figured out how to get that to work...
>
> On a side note, I achieved 10-12x the throughput with my prototype
> program (written in a different language and dumping the serialized
> representation to disk), and I have moved on to yet another language to
> finish the job. *Sigh* One day I'll be able to use Squeak.

Do you have to have GLORP in the critical path? If you only have a few
tables, maybe coding the SQL directly is possible. Or, use GLORP for the
bulk of your model, but isolate the performance-critical portion of the
model in a separate subsystem and use custom SQL for that portion.

Maybe GLORP is not appropriate for your data set. Your use case does not
sound ideal for any O-R framework. Even in Java, the Hibernate O-R people
recommend you NOT use it for bulk data processing. But they do suggest a
workaround suitable for some cases, which is to use a "report" query. What
that does is bypass all the object instantiation and caching framework
needed for O-R (i.e. you don't need to create an actual object, you just
want the data values to push out a report).

Having said that, unless you use cursors, the postgres driver will pull the
entire result set into memory. This behaviour is an artifact of the
communication protocol between the postgres server and a client process.
However, the newer version 3 of this protocol does not pull in the entire
data set. I'd be interested to know whether you can in fact avoid pulling
in the entire data set by using cursors with the current postgres driver
(which implements the version 2 protocol).

Assuming you can get cursors working, I'd be surprised if you couldn't
match the 10-12x increase you got using another language. Basically, the
postgres driver just pulls bytes off the socket and makes arrays of
strings.
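If you do end up hand-coding it, a server-side cursor is plain SQL. A
sketch of the idea - here 'connection' and its #execute: selector are only
stand-ins for whatever query-sending method the Squeak postgres client
exposes, and #execute: is assumed to answer the rows as a collection:

| rows |
connection execute: 'BEGIN'.
connection execute: 'DECLARE scrape CURSOR FOR SELECT id, body FROM pages'.
[rows := connection execute: 'FETCH FORWARD 500 FROM scrape'.
 rows notEmpty] whileTrue: [rows do: [:row | self processRow: row]].
connection execute: 'CLOSE scrape'.
connection execute: 'COMMIT'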
Yanni Chiu wrote:
> Having said that, unless you use cursors, the postgres driver will pull
> the entire result set into memory. This behaviour is an artifact of the
> communication protocol between the postgres server and a client process.
> However, the newer version 3 of this protocol does not pull in the entire
> data set. I'd be interested to know whether you can in fact avoid pulling
> in the entire data set by using cursors with the current postgres driver
> (which implements the version 2 protocol).

I'm not sure this has changed in v3 of the protocol; PG has always returned
all the rows you request. I certainly can't find any mention of it here:
http://www.postgresql.org/docs/8.1/static/protocol-changes.html

As you say, cursors sound like the way to go.

--
  Richard Huxton
  Archonet Ltd
Richard Huxton wrote:
> I'm not sure this has changed in v3 of the protocol; PG has always
> returned all the rows you request. I certainly can't find any mention of
> it here:
> http://www.postgresql.org/docs/8.1/static/protocol-changes.html

I looked at implementing the v3 protocol when it was introduced (maybe 2 or
3 years ago). I recall that it didn't quite make sense to me that cursors
should already work with the v2 protocol, yet it seemed that the v3
protocol was needed to get partial result sets. After re-reading the spec,
I agree with you - PG does return the rows you request. So, like you said,
cursors are what you need to avoid filling up your memory with a large
result set, and this should already work with the existing driver.

Now, the part that got me confused was "Extended Query" at:
http://www.postgresql.org/docs/8.1/static/protocol-flow.html#AEN60506
where it says:

   Once a portal exists, it can be executed using an Execute message. The
   Execute message specifies the portal name (empty string denotes the
   unnamed portal) and a maximum result-row count (zero meaning "fetch all
   rows").

The Extended Query sub-protocol is new in the v3 protocol. That section,
and some other words around message synchronization, led me to conclude
that the protocol had changed a lot. Now it seems to me that it is probably
just a matter of adding the new message types and altering the state
machine. However, adding the changes to a single state machine may start to
get ugly (i.e. unmanageable).

Do you have any sense of when (or if) the v2 protocol support on the server
side will be discontinued?
Yanni Chiu wrote:
> The Extended Query sub-protocol is new in the v3 protocol. That section,
> and some other words around message synchronization, led me to conclude
> that the protocol had changed a lot. Now it seems to me that it is
> probably just a matter of adding the new message types and altering the
> state machine. However, adding the changes to a single state machine may
> start to get ugly (i.e. unmanageable).

I'd be surprised if it wasn't fairly straightforward to have the state
machine drop back from v3 to v2. The PG developers try to keep it simple to
connect between versions.

> Do you have any sense of when (or if) the v2 protocol support on the
> server side will be discontinued?

I don't think it's being dropped in the next release (8.2), so you're safe
for at least 18 months I'd say.

--
  Richard Huxton
  Archonet Ltd
In reply to this post by Yanni Chiu
On 8/2/06, Yanni Chiu <[hidden email]> wrote:
> Do you have to have GLORP in the critical path? If you only have a few
> tables, maybe coding the SQL directly is possible.

Yes, it's very much in the critical path. I'm sorry, but I'm still amazed
that it can't assemble and ship queries to Postgres as fast as I can get
data from a cable modem. That was a huge shocker -- I've got a weak cable
connection on one side and a disk on the other, and rifling through strings
and objects in RAM is the issue???

I agree that there are solutions which involve direct-SQL access and making
a mess of otherwise clean code (but a well isolated mess, of course). I
could also simply contribute to GLORP in order to make it better. I would
do this in a second if I thought it would get me from point A to point B -
GLORP is great software!

As an off-topic side-note, in order to GET from point A to point B, I
addressed the problem by developing a "new to me" paradigm for dealing with
data of this type. So far, I think I did the right thing. Like SQL (and
unlike serialization or the Prevayler approach), multiple programs can get
access to the same objects from out-of-core data structures. But unlike
SQL, the indices require ~1 disk seek to get at objects after a cache miss
(i.e. they are hash based), columns can be in a much more structured format
(think of something similar to memcpy for a DOM tree, for instance), etc.

> Having said that, unless you use cursors, the postgres driver will pull
> the entire result set into memory.

Sigh. I know. The options seem to be:

 * Get the whole result set, if it fits in memory.
 * Seek the same B-tree nodes over and over again if it doesn't (the root
   should be cached by the RDBMS, of course).

Cursors would definitely be the answer to this, but I recognize that I am
in the minority in my need for them. Really, I don't think the cursor would
fix my 10-12x problem. For me, it's a matter of bypassing caches and using
prepared statements. But I wanted to deal in objects, and found a fine way
to continue to do that without the overhead of O-R mapping.

> Assuming you can get cursors working, I'd be surprised if you couldn't
> match the 10-12x increase you got using another language.

I'm betting that Squeak is capable of that 10-12x with proper massage. But
the 10-12x will simply match the prototype implementation, which in turn
has not been massaged. (In fact, both implementations are really stupid in
that they are SERIAL.) I have figured out how to use a proxy object to get
the GLORP sessions to be thread-safe, but the next barrier will be lock
contention as the serial implementation becomes a simple producer/consumer
queue.

I would say that Squeak is currently state-of-the-art in terms of
programmer interface -- Seaside and GLORP are basically unrivalled in their
design and terseness. Avi didn't need any of the stuff I mentioned to make
a great piece of software. But the subject was "scalability", so I wanted
to offer myself as a data point.

Jeremy
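The thread-safety proxy Jeremy mentions could be as small as a
#doesNotUnderstand: forwarder; the sketch below is a guess at the idea, not
his actual code, and all the names are invented:

GlorpSessionProxy >> initializeOn: aGlorpSession
    realSession := aGlorpSession.
    guard := Semaphore forMutualExclusion

GlorpSessionProxy >> doesNotUnderstand: aMessage
    "Serialize every message sent to the shared session."
    ^ guard critical: [aMessage sendTo: realSession]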
On Aug 3, 2006, at 4:51 PM, Jeremy Shute wrote:
> I agree that there are solutions which involve direct-SQL access and
> making a mess of otherwise clean code (but a well isolated mess, of
> course). I could also simply contribute to GLORP in order to make it
> better. I would do this in a second if I thought it would get me from
> point A to point B - GLORP is great software!

Hi Jeremy,

You might want to have a look at ROE. It was a little experiment that Avi
did for creating SQL queries in a nice object-oriented way. See the url
below for more explanation. The code is in SqueakMap.

http://www.cincomsmalltalk.com/userblogs/avi/blogView?showComments=true&entry=3246121322

Colin
In reply to this post by Jeremy Shute
Jeremy
am I right in understanding that what you are saying is that Squeak is
simply not able to rip through strings as fast as, say, a perl regex? Does
this mean that there is some overhead in the handling of Strings in Squeak
that could use a look? I can imagine that ByteArrays may be more efficient,
if less useful, than Strings.

Just a thought, but as you talk about all this work with strings and so
forth I am wondering about object creation/deletion overhead. One example
for you: in a UI system that I once used, there were a lot of Rect objects
flying about. It turns out that in this case extreme performance
improvements could be had by simply reusing one Rect instance and passing
it into the routines that need it. Hundreds of calculations and operations
can all be performed without filling any memory up with instances that are
instantly thrown away and hang around for extensive garbage collection
later. So, for example:

drawSquare: size
    | w |
    w := Rect new.
    w width: size height: size.
    "do things with a rect here..."

becomes:

drawSquare: size on: aRect
    "note: no new object allocation"
    aRect width: size height: size.
    "do things with the rect here"

I have used this tactic/pattern on several occasions many years ago, and I
struggle to remember the details of specific instances, but I think one
such instance was in an import routine. I was importing a data table of
alarms that are raised by a piece of telecoms equipment. The input would
have been a raw text file, the output 6000 or so populated objects, with
some munging in between. Simply reusing the same object as a buffer saved a
lot of time.

best regards

Keith
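The caller then allocates the scratch object once and reuses it across the
whole loop, for instance (using the #drawSquare:on: method sketched above):

| scratch |
scratch := Rect new.
1 to: 1000 do: [:size | self drawSquare: size on: scratch]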
In reply to this post by Jeremy Shute
Jeremy Shute wrote:
> Yes, it's very much in the critical path. I'm sorry, but I'm still amazed
> that it can't assemble and ship queries to Postgres as fast as I can get
> data from a cable modem. That was a huge shocker -- I've got a weak cable
> connection on one side and a disk on the other, and rifling through
> strings and objects in RAM is the issue???
>
> I agree that there are solutions which involve direct-SQL access and
> making a mess of otherwise clean code (but a well isolated mess, of
> course). I could also simply contribute to GLORP in order to make it
> better. I would do this in a second if I thought it would get me from
> point A to point B - GLORP is great software!

Jeremy --

I plopped a note over on the Glorp mailing list about your cursor comment
(I hope you didn't mind) and got the following reply from Alan Knight about
what happens with cursors and Glorp (he wanted me to post this since he
wasn't able to post directly):

==========================================================================
If you can post, you might mention that Glorp actually does everything
internally in terms of cursors. If you want the result set returned only
part at a time, you can set the query collectionType: to
GlorpCursoredStream, which gives you a stream on the results. However, that
will then depend on the underlying driver's behaviour. I know that in VW,
I've seen complaints that the Postgresql driver doesn't do cursors very
effectively - it gets all the results before returning anything. Other
drivers, however, certainly do do cursors.
==========================================================================

-- Rick
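In code, what Alan describes would look roughly like this - #collectionType:
and GlorpCursoredStream are quoted from his note, while the query-building
selectors (Query read:, session execute:) are from memory and may differ
between Glorp versions; ScrapedPage is a placeholder class:

| query results |
query := Query read: ScrapedPage.
query collectionType: GlorpCursoredStream.
results := session execute: query.
"Consume the stream one object at a time instead of holding the whole set."
[results atEnd] whileFalse: [self processRow: results next]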