Hi guys
In the past, Marco did a bridge to databases for metadescribed models, but since FAMIX was not regular we could not get FAMIX models into databases. So I would be interested to know whether there is an ongoing effort to do that, because this would be a real plus for FameDescribed models. Maybe using Glorp? I was also thinking that it would be good to get all the source code of Pharo into a DB using Ring. And get Torch there too.

Stef
_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
Hi,
Indeed, this is an important project. Marco's project was MetaDB, and it was done in VW for Meta using Glorp.

Alberto and I expressed interest in working on FameDB because he needs it for his experiments. The project is still at the beginning: we have only discussed the intention. It would be great if we could get someone else involved, especially someone with DB know-how.

Cheers,
Doru

--
www.tudorgirba.com

"Beauty is where we see it."
On Jan 19, 2011, at 9:53 AM, Tudor Girba wrote:
> Alberto and me expressed the interest of working on FameDB because he needs it for his experiments. The project is still at the beginning - we only discussed the intention. It would be great if we would get someone else around this, especially with some DB know-how.

I was thinking that we could allocate a few months of Cyrille's time to that task. And Mariano is a DB expert, so we could ask him for feedback and advice.

Stef
That would be so great! We also said that Cyrille would work on some Morphs, but I guess it would be better to focus on the DB.
Cheers,
Doru

--
www.tudorgirba.com

"Reasonable is what we are accustomed with."
On Jan 19, 2011, at 10:13 AM, Tudor Girba wrote:
> That would be so great! We also said that Cyrille would work on some Morphs, but I guess it would be better to focus on the DB.

Yes. Right now he is focusing on RPackage integration. We should give him feedback.

Stef
In reply to this post by Stéphane Ducasse
What do you hope to achieve by having these models in a relational database?

Stephan Eggermont
A database would provide scalability.
Of course, it does not have to be a relational database. I would also like to experiment with an OO one, but I know even less about these :)

Cheers,
Doru

--
www.tudorgirba.com

"In a world where everything is moving ever faster, one might have better chances to win by moving slower."
In reply to this post by Stephan Eggermont-3
> What do you hope to achieve by having these models in a relational database?

Query all the versions of a class.
Query all the differences between two changesets over a stream of changes.
We do not care whether it is relational or not. But we should start somewhere.

Stef
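To make the two queries above concrete, here is a minimal sketch, assuming a purely hypothetical relational schema (the table `method_version` and its columns are invented for illustration and are not taken from MetaDB, FameDB, Ring, or Glorp). It uses Python's built-in `sqlite3` module only as a convenient stand-in for whatever store ends up being chosen.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical method-version table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE method_version ("
        " class_name TEXT, selector TEXT, release_tag TEXT, source TEXT)"
    )
    rows = [
        ("Point", "x", "1.0", "x ^ x"),
        ("Point", "x", "1.1", "x ^ x"),
        ("Point", "dist", "1.0", "dist ^ 0"),
        ("Point", "dist", "1.1", "dist ^ (x*x + y*y) sqrt"),
        ("Point", "y", "1.1", "y ^ y"),
    ]
    db.executemany("INSERT INTO method_version VALUES (?, ?, ?, ?)", rows)
    return db


def versions_of_class(db, class_name):
    """Return every release tag in which the class has at least one method."""
    cur = db.execute(
        "SELECT DISTINCT release_tag FROM method_version WHERE class_name = ?",
        (class_name,),
    )
    return sorted(row[0] for row in cur)


def changed_selectors(db, class_name, old_tag, new_tag):
    """Return selectors that are new or whose source differs in new_tag."""
    cur = db.execute(
        "SELECT n.selector FROM method_version AS n "
        "LEFT JOIN method_version AS o "
        "  ON o.class_name = n.class_name "
        " AND o.selector = n.selector "
        " AND o.release_tag = ? "
        "WHERE n.class_name = ? AND n.release_tag = ? "
        "  AND (o.source IS NULL OR o.source <> n.source)",
        (old_tag, class_name, new_tag),
    )
    return sorted(row[0] for row in cur)
```

Both functions select a small slice of the stored history, which is the kind of access a database handles well; whole-model analyses are a different matter.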
Hi,
I strongly recommend NOT using a relational database. It is a very bad fit for OO, for several reasons; the best analogy I have ever read is: "it is like disassembling your car every night before going to sleep, and reassembling it every morning before going to work". I have worked with relational databases a lot and I can assure you that it is a pain in the a**. Yes, with Mariano we did SqueakDBX, but that was only because in lots of jobs the database is not a choice: you must use a relational DB (usually Oracle). It was not because we think it is good for programming.
Of course, GemStone can be a better solution... but if we want scalability for free, a NoSQL solution can be a good choice here (and there are some implementations in Pharo to choose from).

By the way, a document-oriented database (like MongoDB, already implemented for Pharo) is a good approach for non-regular structures.

My 2c.

Best,
Esteban
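As an illustration of why a document-oriented store suits non-regular structures, here is a minimal sketch in plain Python. It does not use MongoDB or any Pharo driver: the list of dictionaries stands in for a document collection, and the helper names `to_document` and `find` as well as the property names are assumptions made up for this example.

```python
import json


def to_document(kind, **properties):
    """Build one schema-less document; each entity may carry different keys."""
    return {"kind": kind, **properties}


def find(documents, **criteria):
    """Return documents whose fields match all criteria; absent keys never match."""
    return [
        doc
        for doc in documents
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


# Entities with different property sets live side by side in one collection.
documents = [
    to_document("Class", name="Point", isAbstract=False),
    to_document("Method", name="x", parent="Point", numberOfLines=1),
    to_document("Annotation", name="deprecated"),
]

# Round-trip through JSON to mimic what a document store would persist.
stored = [json.loads(json.dumps(doc)) for doc in documents]
```

No table definition has to change when a new kind of entity or a new property appears, which is the point made above about irregular models.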
thanks for the point.
Stef
In reply to this post by EstebanLM
On 20 Jan 2011, at 12:33, Esteban Lorenzano wrote:
> btw...a document oriented database (like MongoDB, already implemented for Pharo) is a good approach for non-regular structures.

Thanks for the tip Esteban, that's interesting news.

FYI
https://twitter.com/#!/renggli/status/17317173076
#moose got it right from the beginning: "For truly deep program analyses, relational database still don't work." Oege de Moor at #tools2010
:)

--
Simon Denier
In reply to this post by Stéphane Ducasse
Stef,

are there any specific requirements on the DB? Is this for long-term storage of a lot of things, or just to extend the main memory for a definite amount of time?
What will be the main actions? Searching for symbols/methods/class names, or traversing stuff?

In case you want to add a lot of code over time and want general search capabilities that can be stored in a central location, it might be a good idea to ask GemStone to sponsor a full license and install this somewhere at INRIA. Or am I missing the point?

Norbert
Hi Norbert,
The goal of Moose is to help us analyze data. This means: modeling, mining, measuring, querying, visualizing, browsing etc. To do this, the prerequisite is being able to manipulate the data. Right now, we have all objects in memory. To be able to scale we need database support.

So, all in all, it's not for a specific use case, but for any model that is described by Fame.

Cheers,
Doru

--
www.tudorgirba.com

"Beauty is where we see it."
But I believe the main goal is storing a lot of data and being able to query it fast.
So querying and scalability are the main requirements. Other requirements, like lots of updates or top-level security (redundancy, ...), are less important in this case.

nicolas

----- Original message -----
> From: "Tudor Girba" <[hidden email]>
> To: "Moose-related development" <[hidden email]>
> Sent: Thursday 20 January 2011 21:32:40
> Subject: [Moose-dev] Re: what is the status of FameDB
In reply to this post by Tudor Girba
On 20 jan 2011, at 21:32, Tudor Girba wrote:
> The goal of Moose is to help us analyze data. This means: modeling, mining, measuring, querying, visualizing, browsing etc. To do this, the prerequisite is being able to manipulate the data. Right now, we have all objects in memory. To be able to scale we need database support.

Currently, the models have to fit into a 32-bit address space. Modern machines support much more than that (data points: 16 GB @ 160 Euro for my current machine; standard workstations support 192 GB). Do you have many models that wouldn't fit in 192 GB?

The kinds of analysis Moose does are not supported efficiently by standard relational databases at all. They are optimized for a very different access scheme: selecting a very small subset of data and changing that. That means that they are only able to provide reasonable results for datasets that (nearly) fit into memory. In short: they allow you to avoid using a 64-bit Pharo image, and are able to use more cores. The price is having to copy data to and from the database and having to generate queries that don't fit the object model well. They are unlikely to provide better performance than a straightforward 64-bit Pharo image would, but can provide a short-term solution.

Data-warehouse-style databases (Vertical) and object-oriented databases (GemStone) are probably able to do better: data-warehouse databases by pregenerating all kinds of cross sections and projections of the data, and OODBs by navigating instead of joining (and GemStone by being able to use all memory). But even there, many of the Moose analyses seem to touch a large part of the model, and the interactivity needed means that disk-based models will never become popular.

Scaling of Moose is more likely to come from going 64-bit and distributing the model over multiple VMs.

Stephan
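The difference between "navigating" and "joining" can be sketched on a toy model. The example below answers the same question twice (which classes are called from the methods of a given class): once by following object references in memory, once with a relational join. Everything here is an assumption for illustration: the classes `Klass` and `Method`, the tables `method` and `invocation`, and the use of Python's `sqlite3` module are not part of FAMIX, Moose, or any of the databases mentioned in this thread.

```python
import sqlite3


class Klass:
    """Toy stand-in for a class entity in an in-memory model."""

    def __init__(self, name):
        self.name = name
        self.methods = []


class Method:
    """Toy stand-in for a method entity holding direct references to callees."""

    def __init__(self, name, owner):
        self.name = name
        self.owner = owner
        self.callees = []
        owner.methods.append(self)


def called_classes_by_navigation(klass):
    """Follow references: class -> methods -> callees -> owning class."""
    return sorted({callee.owner.name for m in klass.methods for callee in m.callees})


def called_classes_by_join(db, class_name):
    """Answer the same question with two joins over flat tables."""
    cur = db.execute(
        "SELECT DISTINCT callee.owner FROM method AS caller "
        "JOIN invocation ON invocation.caller_id = caller.id "
        "JOIN method AS callee ON callee.id = invocation.callee_id "
        "WHERE caller.owner = ?",
        (class_name,),
    )
    return sorted(row[0] for row in cur)


def build_example():
    """Build the same three-class model as objects and as relational rows."""
    a, b, c = Klass("A"), Klass("B"), Klass("C")
    a1, b1, c1 = Method("a1", a), Method("b1", b), Method("c1", c)
    a1.callees += [b1, c1]

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE method (id INTEGER PRIMARY KEY, name TEXT, owner TEXT)")
    db.execute("CREATE TABLE invocation (caller_id INTEGER, callee_id INTEGER)")
    db.executemany(
        "INSERT INTO method VALUES (?, ?, ?)",
        [(1, "a1", "A"), (2, "b1", "B"), (3, "c1", "C")],
    )
    db.executemany("INSERT INTO invocation VALUES (?, ?)", [(1, 2), (1, 3)])
    return a, db
```

Each additional hop in the object graph is one more attribute access in the navigation version, but one more join in the relational version, which is why analyses that traverse large parts of a model map poorly onto joins.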
Thanks for this very sensible analysis.
I would definitely give up any database support in exchange for an image that can scale to use the entire promise of 64 bits and of multiple VMs. However, it looks like it will take a while until we get that. In the meantime, it is more practical to use what exists.

Even if relational databases are not at all my preferred option, we know that for Glorp there was an implementation in VW that served the purpose of storing the objects and enabling mining algorithms (even at the expense of high interaction). So, it would be great to salvage this effort and have similar support in Pharo.

In any case, I definitely would like to start an effort of looking into GemStone and into other object-oriented databases. Anybody interested in joining the effort?

Just a question: would the new solid-state disks not alleviate the problem of disk speed?

Cheers,
Doru

--
www.tudorgirba.com

"We cannot reach the flow of things unless we let go."
In reply to this post by NorbertHartl
> are there any specific requirements on the db? Is this for long time storage of a lot of things

For Pharo, I would like to have all the versions of all releases so that we can do and decent versions of and compute related changes and others.

> or just to extend the main memory for a definite amount of time?
> What will be the main actions to do? Searching for symbols/methods/class names or traversing stuff?

For Pharo, yes.

> In case you want to add over time a lot of code and want to have general search capabilities that can be stored in a central location it might be a good idea to ask gemstone to sponsor a full license and install this somewhere in inria.

This would be a good possibility, even if I have no idea about GemStone.
In reply to this post by Stephan Eggermont-3
Thanks, Stephan.
Interesting.
In reply to this post by Tudor Girba
I would like to learn more about Gemstone. Now I'm a bit full.

Stef

On Jan 21, 2011, at 5:12 PM, Tudor Girba wrote:

> Thanks for this very sensible analysis.
>
> I would definitely give up any database support for an image that can scale to use the entire promise of 64 bits and of multiple VMs. However, it looks like it will take a while until we get that. In the meantime, it is more practical to use what exists.
>
> Even if relational databases are not at all my preferred option, we know that for Glorp there was an implementation in VW that served the purpose of storing the objects and enabling mining algorithms (even at the expense of high interaction). So, it would be great to salvage this effort and have similar support in Pharo.
>
> In any case, I definitely would like to start an effort of looking into Gemstone and into other object-oriented databases. Anybody interested in joining the effort?
>
> Just a question: Would the new solid-state disks not alleviate the problem of the disk speed?
>
> Cheers,
> Doru
>
> --
> www.tudorgirba.com
>
> "We cannot reach the flow of things unless we let go."
In reply to this post by Tudor Girba
On Fri, Jan 21, 2011 at 5:12 PM, Tudor Girba <[hidden email]> wrote:

> Thanks for this very sensible analysis.

For Gemstone, Hernan Wilkinson did some tests 1 or 2 years ago, and the difference was... mmmm, I don't remember, but I think 40x faster. All the Gemstone migrations, and I don't remember what more, were done with that HDD. So, yes, at least in Gemstone it changes a lot.