I have also had that idea for a while, and together with Michael Lucas-Smith
we started to work on something like that on VisualWorks. It's called
Prevayler, and you can find more about it on my wiki:
http://wiki.eranova.si/aida/Prevayler+persistency

I think a Prevayler is quite easily achievable, especially on VisualWorks,
because it supports so-called object immutability: you can set an object
read-only, and when someone tries to change it, an unhandled exception (UHE)
is raised. All a Prevayler needs to do is catch that exception and save the
changes. I don't know if Squeak supports immutability too?

Janko

Bert Freudenberg wrote:
> On 23.10.2006, at 21:30, Philippe Marschall wrote:
>
>> 2006/10/23, Cees de Groot <[hidden email]>:
>>> On 10/23/06, Philippe Marschall <[hidden email]> wrote:
>>> > > > So about 300 Euros?
>>>
>>> [...]
>>>
>>> > 64bit VM?
>>> >
>>> You pay the hosting bills for a new box? ;)
>>
>> I'm willing to pay 2 GB of RAM if that's what is needed to run Pier.
>> That Squeak can't handle this is a Squeak specific limitation that has
>> nothing to do with the point that memory is that cheap.
>> As pointed out numerous times on squeak-dev and disputed by none, all
>> VM related issues can be fixed easily by just fixing the VM. This is
>> no problem since the VM is open source.
>
> If we had a transactional virtual object memory that's continuously
> saved to disk (think OOZE/LOOM), that might be viable. Perhaps with
> Magma you could have almost the same semantics, just be careful what you
> touch. But not with the current object memory. No way. Not if you care
> about the data.
>
> It's not about RAM being cheap or not. It's about designing for the
> common and the worst case. Why you would want to bring in gigabytes of
> data if the working set is just a few megabytes is beyond me.
>
> - Bert -
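For illustration, a rough Smalltalk sketch of the trap-and-journal idea.
Recent Squeak/Pharo images expose immutability via #beReadOnlyObject and
#beWritableObject; the class names WikiPage, ModificationForbidden and
ChangeJournal below are assumptions made up for the example, not an
existing API:

    "Sketch only: protect a page, trap the attempted write, journal it, retry."
    | page |
    page := WikiPage new.
    page beReadOnlyObject.                    "mark the object read-only"
    [ page title: 'Renamed page' ]
        on: ModificationForbidden             "assumed exception raised on the write"
        do: [ :ex |
            ChangeJournal default record: ex. "assumed hook: append the change to a log"
            page beWritableObject.
            ex retry ]

The point is only that the persistence layer never has to be called
explicitly; the attempted write itself triggers the journaling.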
In reply to this post by Bert Freudenberg
2006/10/23, Bert Freudenberg <[hidden email]>:
> On 23.10.2006, at 21:30, Philippe Marschall wrote:
>
> > 2006/10/23, Cees de Groot <[hidden email]>:
> >> On 10/23/06, Philippe Marschall <[hidden email]> wrote:
> >> > > > So about 300 Euros?
> >>
> >> [...]
> >>
> >> > 64bit VM?
> >> >
> >> You pay the hosting bills for a new box? ;)
> >
> > I'm willing to pay 2 GB of RAM if that's what is needed to run Pier.
> > That Squeak can't handle this is a Squeak specific limitation that has
> > nothing to do with the point that memory is that cheap.
> > As pointed out numerous times on squeak-dev and disputed by none, all
> > VM related issues can be fixed easily by just fixing the VM. This is
> > no problem since the VM is open source.
>
> If we had a transactional virtual object memory that's continuously
> saved to disk (think OOZE/LOOM), that might be viable. Perhaps with
> Magma you could have almost the same semantics, just be careful what
> you touch. But not with the current object memory. No way. Not if you
> care about the data.
>
> It's not about RAM being cheap or not. It's about designing for the
> common and the worst case. Why you would want to bring in gigabytes
> of data if the working set is just a few megabytes is beyond me.

The point was just that holding the whole wiki in memory is no problem,
memory- or money-wise. That the VM, as in many other cases, is the real
problem (and I'm quite sure the Java VM would be up to it) is a completely
unrelated issue.

Philippe
In reply to this post by Michael Rueger-6
The process of porting minnow from swiki to pier is likely to happen.
It will just require a little patience, as the needed components are
road-tested and refined. As you can imagine, my own testing process has
been somewhat hindered by image freezes. Now that I have been informed
that Squeak VM 3.6.3 is actually stable, I have been able to create a
pier-wiki with 6200 or more pages for testing purposes.

Initial stats using Pier with PRNullPersistency (i.e. everything in memory):

6200 pages (generated one for each Squeak class from the Squeak sources)
235352 internal links
Adding a page: 100-500 ms
Removing a page: 215 seconds!! (many wikis don't support removing pages anyway)
Total memory = 77 MB

Compare this to minnow: the text data of its 5889 pages is about 30 MB. Of
course the swiki has full page history, and uploaded files too.

Pier-Magma should be able to handle this kind of load, but that remains to
be explicitly tested. Anticipated work to make things workable:

1. Some explicit caching of items that will slow pier-magma down with data
on disk rather than in memory. Removal of pages may be extremely slow
without this.
2. Explicit support for an indexed full-text search which avoids the need
to traverse the whole data tree for a simple search (see the sketch after
this post).
3. Some form of logging of user edits in addition to the default
persistency strategy.

J J wrote:
> I would say go to Pier. I think Keith released some software that you can
> point at the swiki and it will slurp it all up. Am I right Keith?

I haven't written any proper data slurper for minnow. I believe there is
already an importing tool. I have pointed wget at minnow to get the current
set of pages as a test data set. It's about 30 MB or so.

Which leads me to a question. How would Pier handle some random person
running this script?

#!/usr/bin/ruby
for i in 1..5889
  print `wget --user=squeak --password=viewpoints http://minnow.cc.gatech.edu/squeak/#{i}.edit`
end

This would probably create almost 5800 Seaside sessions in a matter of
minutes?

Keith
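On point 2 above, an indexed full-text search essentially means keeping an
inverted index next to the page tree. A rough, hypothetical sketch using
plain Squeak collections (not Pier's API; `pages` stands for whatever
collection of wiki pages is at hand, assumed to answer #title and #text):

    "Build word -> set-of-page-titles once, update it whenever a page is saved."
    | index |
    index := Dictionary new.
    pages do: [ :page |
        page text substrings do: [ :word |
            (index at: word asLowercase ifAbsentPut: [ Set new ])
                add: page title ] ].
    "A search then touches only the index instead of walking the whole tree:"
    index at: 'magma' ifAbsent: [ Set new ]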
In reply to this post by Philippe Marschall
On 23.10.2006, at 22:55, Philippe Marschall wrote:

> 2006/10/23, Bert Freudenberg <[hidden email]>:
>> On 23.10.2006, at 21:30, Philippe Marschall wrote:
>>
>>> 2006/10/23, Cees de Groot <[hidden email]>:
>>>> On 10/23/06, Philippe Marschall <[hidden email]> wrote:
>>>> > > > So about 300 Euros?
>>>>
>>>> [...]
>>>>
>>>> > 64bit VM?
>>>> >
>>>> You pay the hosting bills for a new box? ;)
>>>
>>> I'm willing to pay 2 GB of RAM if that's what is needed to run Pier.
>>> That Squeak can't handle this is a Squeak specific limitation that has
>>> nothing to do with the point that memory is that cheap.
>>> As pointed out numerous times on squeak-dev and disputed by none, all
>>> VM related issues can be fixed easily by just fixing the VM. This is
>>> no problem since the VM is open source.
>>
>> If we had a transactional virtual object memory that's continuously
>> saved to disk (think OOZE/LOOM), that might be viable. Perhaps with
>> Magma you could have almost the same semantics, just be careful what
>> you touch. But not with the current object memory. No way. Not if you
>> care about the data.
>>
>> It's not about RAM being cheap or not. It's about designing for the
>> common and the worst case. Why you would want to bring in gigabytes
>> of data if the working set is just a few megabytes is beyond me.
>
> The point was just that holding the whole wiki in the memory is no
> problem memory or money wise.

No, this was not the point at all. The point was that *even* if you
could have as many Gigabytes of RAM as you want, holding everything
in the image *without* being backed by some permanent storage does
not scale, and therefore is unsuited for real deployment.

> That the vm, like in many other cases too, is the real problem (and
> I'm quite sure the Java VM would be up to it) is a completely
> unrelated issue.

It's news to me that the Java VM supports an object image. Or that
any real-world system on Java would just load a snapshot of *all* its
data and save it *in whole* later - I sincerely doubt that.

- Bert -
> It's news to me that the Java VM supports an object image. Or that any
> real-world system on Java would just load a snapshot of *all* its data
> and save it *in whole* later - I sincerely doubt that.
>
> - Bert -

I wrote a system which held a substantial data set in image with no
problems (this was ST/X). The biggest problem I had was with stdio.h being
limited to 255 open file descriptors in certain situations on Solaris. As a
workaround I had to open 256 dummy file descriptors, then open the 700-1000
file descriptors that I wanted to use, then close the dummy file
descriptors, so as to leave some in the range 0-255 available for those
parts of the system that required them.

Another team tried a similar project in Perl, and another team followed
suit in Java. Last I heard they reimplemented from scratch in C++. The Java
system took a farm of machines to run it. Following this experience I have
no confidence in the Java VM, or associated technologies, being able to run
anything of any size or complexity.

Smalltalk can load a simulation of over 1000 interacting telecoms units,
with a full simulation of all their configurations, cards, alarms, etc.,
and have that simulation running in about 20 seconds - the time it takes to
load a 200-400 MB image, which is not long on a big expensive Sun server
machine. I can't imagine even attempting the same in Java without requiring
a database backend and all of the overhead that that would entail.

Overall Squeak's VM may not be as fast as ST/X, but I think it does a
pretty good job.

Keith
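The descriptor workaround Keith describes looks roughly like this as a
Squeak-flavoured sketch (the original system was ST/X; the file names and
the `fileNames` collection are placeholders, not anything from the post):

    "Burn the low 0-255 descriptors with throwaway files, open the real files
     above that range, then release the low descriptors for stdio-based code."
    | dummies real |
    dummies := (1 to: 256) collect: [ :i |
        FileStream forceNewFileNamed: '/tmp/fd-dummy-', i printString ].
    real := fileNames collect: [ :each | FileStream readOnlyFileNamed: each ].
    dummies do: [ :each | each close ]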
In reply to this post by Michael Rueger-6
2006/10/23, Michael Rueger <[hidden email]>:
> Philippe Marschall wrote:
> > 2006/10/23, Cees de Groot <[hidden email]>:
> >> On 10/23/06, Philippe Marschall <[hidden email]> wrote:
> >> > > > So about 300 Euros?
> >>
> >> [...]
> >>
> >> > 64bit VM?
> >> >
> >> You pay the hosting bills for a new box? ;)
> >
> > I'm willing to pay 2 GB of RAM if that's what is needed to run Pier.
> > That Squeak can't handle this is a Squeak specific limitation that has
> > nothing to do with the point that memory is that cheap.
> > As pointed out numerous times on squeak-dev and disputed by none, all
> > VM related issues can be fixed easily by just fixing the VM. This is
> > no problem since the VM is open source.
>
> So I'm assuming you just volunteered to fix these issues so we can
> switch to Pier in a few weeks?

That would be a waste of time because in ten years we will have a new,
fixed and cool VM.

Philippe
In reply to this post by Bert Freudenberg
2006/10/23, Bert Freudenberg <[hidden email]>:
> On 23.10.2006, at 22:55, Philippe Marschall wrote:
>
>> 2006/10/23, Bert Freudenberg <[hidden email]>:
>>> On 23.10.2006, at 21:30, Philippe Marschall wrote:
>>>
>>>> 2006/10/23, Cees de Groot <[hidden email]>:
>>>>> On 10/23/06, Philippe Marschall <[hidden email]> wrote:
>>>>> > > > So about 300 Euros?
>>>>>
>>>>> [...]
>>>>>
>>>>> > 64bit VM?
>>>>> >
>>>>> You pay the hosting bills for a new box? ;)
>>>>
>>>> I'm willing to pay 2 GB of RAM if that's what is needed to run Pier.
>>>> That Squeak can't handle this is a Squeak specific limitation that has
>>>> nothing to do with the point that memory is that cheap.
>>>> As pointed out numerous times on squeak-dev and disputed by none, all
>>>> VM related issues can be fixed easily by just fixing the VM. This is
>>>> no problem since the VM is open source.
>>>
>>> If we had a transactional virtual object memory that's continuously
>>> saved to disk (think OOZE/LOOM), that might be viable. Perhaps with
>>> Magma you could have almost the same semantics, just be careful what
>>> you touch. But not with the current object memory. No way. Not if you
>>> care about the data.
>>>
>>> It's not about RAM being cheap or not. It's about designing for the
>>> common and the worst case. Why you would want to bring in gigabytes
>>> of data if the working set is just a few megabytes is beyond me.
>>
>> The point was just that holding the whole wiki in the memory is no
>> problem memory or money wise.
>
> No, this was not the point at all. The point was that *even* if you
> could have as many Gigabytes of RAM as you want, holding everything
> in the image *without* being backed by some permanent storage does
> not scale, and therefore is unsuited for real deployment.

Let me quote Michael Rueger:

> IIRC this does not pull in the history. For the SmallWiki port Thomas
> back then wrote an importer that imports everything. The persistency
> also avoids having to keep everything in memory, which with the amount
> of content on Minnow is not practical anyways. I know memory is cheap,
> but not that cheap ;-)

Having no permanent storage (save image doesn't count) is just plain
stupid and therefore is unsuited for real deployment. But this has
nothing to do with holding the whole data in the image. You can save a
page to the filesystem when it was edited or created and still have it
in the image. Pier has hooks for this since before it was called Pier.

Having all the data in RAM scales the same way as having all the data
on disk. Linearly. IIRC Google can hold almost the entire web in RAM.
So there is virtually no limit to that. I know this is not clever. I
just say it is possible and the cost is not excessive (holding Minnow
in RAM, not the web).

>> That the vm, like in many other cases too, is the real problem (and
>> I'm quite sure the Java VM would be up to it) is a completely
>> unrelated issue.
>
> It's news to me that the Java VM supports an object image. Or that
> any real-world system on Java would just load a snapshot of *all* its
> data and save it *in whole* later - I sincerely doubt that.

I was talking about dealing with 2 GB of RAM.

Philippe
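The save-on-edit idea Philippe mentions amounts to a write-through hook. A
minimal sketch, assuming a page answers #title and #text; this is only an
illustration, not Pier's actual persistency hook, and `editedPage` is a
placeholder:

    "Keep the page in the image, but mirror every edit to a file as it happens."
    | savePage |
    savePage := [ :page | | stream |
        stream := FileStream forceNewFileNamed: 'pages/', page title, '.txt'.
        [ stream nextPutAll: page text ] ensure: [ stream close ] ].
    savePage value: editedPage   "editedPage: whatever page was just changed"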
On 24.10.2006, at 21:04, Philippe Marschall wrote:
> Having no permanent storage (save image doesn't count) is just plain
> stupid and therefore is unsuited for real deployment. But this has
> nothing to do with holding the whole data in the image. You can save a
> page to the filesystem when it was edited or created and still have it
> in the image. Pier has hooks for this since before it was called Pier.
>
> Having all the data in RAM scales the same way as having all the data
> on disk. Linearly. IIRC Google can hold almost the entire web in RAM.
> So there is virtually no limit to that. I know this is not clever. I
> just say it is possible and the cost is not excessive (holding Minnow
> in RAM, not the web).

I thought we were having a serious discussion, and not just pointing
fingers at RAM prices. Or pointing to non-existent VM technology, as
you did in another thread.

I stand by my assessment that holding *everything* including all
versions of all pages and also all uploaded files in RAM is just
plain stupid.

- Bert -
> > Having all the data in RAM scales the same way as having all the data
> > on disk. Linearly. IIRC Google can hold almost the entire web in RAM.
> > So there is virtually no limit to that. I know this is not clever. I
> > just say it is possible and the cost is not excessive (holding Minnow
> > in RAM, not the web).
>
> I thought we were having a serious discussion, and not just pointing
> fingers at RAM prices. Or pointing to non-existent VM technology, as
> you did in another thread.

I strongly second Philippe.

The Squeak VM technology will simply die if it is unable to efficiently
address more than 2 GB of data and remains limited to doing its
calculations on only 1 CPU. There are technologies like memory-mapped
files that transparently give an unlimited amount of RAM (if the GC
was a bit smarter ...).

> I stand by my assessment that holding *everything* including all
> versions of all pages and also all uploaded files in RAM is just
> plain stupid.

We are used to being called ridiculous and stupid. No problem.

And yes, we do not keep files in RAM. We store them on the file-system so
that Apache can serve them quickly: reading the file into the image and
pushing it into a socket is way too slow anyway. And yes, Apache caches
often-requested files in RAM.

Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch
On 24.10.2006, at 22:28, Lukas Renggli wrote:
>> > Having all the data in RAM scales the same way as having all the data
>> > on disk. Linearly. IIRC Google can hold almost the entire web in RAM.
>> > So there is virtually no limit to that. I know this is not clever. I
>> > just say it is possible and the cost is not excessive (holding Minnow
>> > in RAM, not the web).
>>
>> I thought we were having a serious discussion, and not just pointing
>> fingers at RAM prices. Or pointing to non-existent VM technology, as
>> you did in another thread.
>
> I strongly second Philippe.
>
> The Squeak VM technology will simply die, if it is unable to
> efficiently address more than 2 GB of data and process its
> calculations on only 1 CPU. There are technologies like memory-mapped
> files that transparently give an unlimited amount of RAM (if the GC
> was a bit smarter ...)

Sure. That's just irrelevant to the discussion at hand.

>> I stand by my assessment that holding *everything* including all
>> versions of all pages and also all uploaded files in RAM is just
>> plain stupid.
>
> We are used to be called ridiculous and stupid. No problem.

Come on, I wasn't calling you stupid. Below you actually say that you
are not doing what I described - so why are you upset?

And wouldn't you agree that *if* someone would hold, for example, all
uploaded files of a large Wiki in the Squeak image running on the
current VM, that this would be highly unreasonable? I can imagine
systems that allow that, I pointed out ideas for such systems in fact,
but for our immediate problem we need to stick to what we have.

> And yes, we do not keep files in RAM. We store them on the file-system
> so that Apache can serve them quickly: reading the file into the image
> and pushing it into a socket way too slow anyway. And yes, Apache
> caches often requested files in the RAM.

So we are not in disagreement after all.

- Bert -
In reply to this post by Lukas Renggli
On 24-Oct-06, at 1:28 PM, Lukas Renggli wrote:

> The Squeak VM technology will simply die, if it is unable to
> efficiently address more than 2 GB of data and process its
> calculations on only 1 CPU. There are technologies like memory-mapped
> files that transparently give an unlimited amount of RAM (if the GC
> was a bit smarter ...)

Generate your VM with the '64 bit' flag turned on in VMMaker.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: CPM: Change Programmer's Mind
In reply to this post by Bert Freudenberg
2006/10/24, Bert Freudenberg <[hidden email]>:
> On 24.10.2006, at 21:04, Philippe Marschall wrote:
>
> > Having no permanent storage (save image doesn't count) is just plain
> > stupid and therefore is unsuited for real deployment. But this has
> > nothing to do with holding the whole data in the image. You can save a
> > page to the filesystem when it was edited or created and still have it
> > in the image. Pier has hooks for this since before it was called Pier.
> >
> > Having all the data in RAM scales the same way as having all the data
> > on disk. Linearly. IIRC Google can hold almost the entire web in RAM.
> > So there is virtually no limit to that. I know this is not clever. I
> > just say it is possible and the cost is not excessive (holding Minnow
> > in RAM, not the web).
>
> I thought we were having a serious discussion, and not just pointing
> fingers at RAM prices. Or pointing to non-existent VM technology, as
> you did in another thread.

No, that is not the case. Sorry for the misunderstanding.

> I stand by my assessment that holding *everything* including all
> versions of all pages and also all uploaded files in RAM is just
> plain stupid.

I never questioned that one.

Philippe
In reply to this post by keith1y
In my experience, trying to fit such a large DB in RAM can be a problem.
The Squeak VM has proven to be not so kind when the image size grows, even
if we will no longer have the 2 GB limit with the 64-bit VM.

I used Magma years ago, and it was quite slow once the data started to
grow. Even using the collection classes provided for large data sets, the
result was not so impressive. I think Magma has improved, but what about
using a simple relational database for the pages? Squeak has a MySQL
driver... and also one for Postgres...

On 10/23/06, Keith Hodges <[hidden email]> wrote:
> The process of porting minnow from swiki to pier, is likely to happen.
> It will just require a little bit of patience, as the needed components
> are road-tested and refined.

--
"Just Design It"
GG
Software Architect
http://www.objectsroot.com/
In reply to this post by Bert Freudenberg
Bert Freudenberg <[hidden email]> writes:
> I thought we were having a serious discussion, and not just pointing
> fingers at RAM prices. Or pointing to non-existent VM technology, as
> you did in another thread.
>
> I stand by my assessment that holding *everything* including all
> versions of all pages and also all uploaded files in RAM is just
> plain stupid.

I am not sure about just plain stupid, but it's at least a very risky thing
to do. Dealing with such a large image is almost certainly harder than
rewriting it not to need so much memory.

The challenge to the hardware is just the beginning. Squeak's VM is not at
all made for such big images. I saw "funny" GC behavior with my 75 MB
images for Chuck. I'd want to run some experiments before entrusting it to
a 2 GB image. Probably you'd have to code your software carefully w.r.t.
memory management.

The other problem leaping out to me is managing the data over time,
especially when corruption inevitably occurs. Are you ready to open a
Squeak just to debug the data? Are you aware that Squeak can lose images in
some cases? [1] With the real data in simple files or in a database, these
issues are much less risky.

Overall, I am not opposed to jumping ship from ComSwiki. Indeed, it would
be excellent to use wiki software that is maintained by someone highly
motivated. Even given that, however, should we not wait until we have
something that is *already* better than ComSwiki?

-Lex

[1] http://lists.squeakfoundation.org/pipermail/squeak-dev/2001-January/009731.html