Hi all
Over the last months, Tristan and I have been working on the Fuel project, an object binary serialization tool. The idea is that objects are loaded many more times than they are stored, so it is worth spending time while storing in order to get faster loading and a better user experience. We present an implementation of a pickle format that is based on clustering similar objects.

There is a summary of the project below, but more complete information is available here: http://rmod.lille.inria.fr/web/pier/software/Fuel

The implementation still needs a lot of work to be really useful, and optimizations should be done, but we'll be glad to get feedback from the community.


= Pickle format =

The pickle format and the main idea of the serialization algorithm are explained in these slides:

http://www.slideshare.net/tinchodias/fuel-serialization-in-an-example


= Current features =

- Class shape changing (when a variable has been added, or removed, or its index changed)
- Serialize most of the basic objects.
- Serialize (almost) any CompiledMethod
- Detection of global or class variables
- Support for cyclic object graphs
- Tests


= Next steps =

- Improve version checking.
- Optimize performance.
- Serialize more kinds of objects:
-- Class with its complete description.
-- Method contexts
-- Active block closures
-- Continuation
- Some improvements for the user:
-- pre and post actions to be executed.
-- easily say 'this object is singleton'.
- Partial loading of a stored graph.
- Fast statistics/brief info extraction of a stored graph.
- ConfigurationOfFuel.
- Be able to deploy materialization behavior only (independent from the serialization behavior)


= Download =

In a Pharo 1.1 or 1.1.1 image, evaluate:

    Gofer new
        squeaksource: 'Fuel';
        version: 'Fuel-MartinDias.74';
        version: 'FuelBenchmarks-MartinDias.4';
        load.


= Benchmarks =

You can run the benchmarks by executing this line (results appear in the Transcript):

    FLBenchmarks newBasic run.


Thank you!
Martin Dias
Thanks martin.
Stef
In reply to this post by tinchodias
It looks cool!
Is this a better ImageSegment?

Alexandre

--
Alexandre Bergel http://www.bergel.eu
On Dec 8, 2010, at 8:42 PM, Alexandre Bergel wrote:
> It look cool!
> Is this a better ImageSegment?

There is no VM code involved, and you do not get all the problems of ImageSegment.
You should rather ask whether it is better than SmartRefStreams.
Yes.
That said, the code has not been optimized yet.
> there is no VM code implied and you do not get all the problems of imageSegment.
> You should ask is it better than SmartRefStreams.
> Yes
> Now the code has not been optimized

Ah ok! So the difference from ImageSegment is how stubs are dealt with.

It looks exciting!

Alexandre
Alex, this is a good candidate for your profiler :)
Stef
In reply to this post by Alexandre Bergel
On Wed, Dec 8, 2010 at 9:05 PM, Alexandre Bergel wrote:
> Ah ok! How to deal with stubs is the difference with ImageSegment then.

The difference is that in Fuel there is no management of "shared objects". Fuel is not for swapping (there are no stubs/proxies and no becomes). The idea is that you write to a file ALL the objects reachable from user-defined roots, not only those objects that are accessible ONLY from the roots, as in ImageSegment.

Fuel is similar to ReferenceStream and its subclasses, to an XML serializer, or to any other serializer, but MOSTLY to Parcels. It is very similar to VisualWorks parcels.

Another important thing is that it is binary. The idea is that maybe in the future we will have a Monticello that doesn't need an mcz (code) + Compiler, but instead just loads binary code (Fuel files). This would be muuuch faster than the current Monticello and, even more, for minimal images we wouldn't need a Compiler.

Finally, Fuel is designed (like Parcels) to be very fast at loading time :)

Cheers

Mariano
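The notion of "all the objects reachable from user-defined roots" can be illustrated with a plain workspace sketch. This is not Fuel's code: the variable names (node, pair, roots, seen, pending) are made up for the illustration, and the traversal simply follows instance variables and indexed pointer slots. It only shows that the whole transitive closure is collected, shared and cyclic references included, which is the set a serializer of this kind would have to write.

```smalltalk
"Hypothetical sketch, not Fuel's implementation:
 collect every object reachable from a set of roots."
| node pair roots seen pending |
node := OrderedCollection new.
pair := Array with: node with: 'fuel'.
node add: pair.                      "creates a cycle: pair -> node -> pair"
roots := Array with: pair.

seen := IdentitySet new.             "identity-based, so cycles terminate"
pending := OrderedCollection withAll: roots.
[ pending isEmpty ] whileFalse: [
    | object |
    object := pending removeFirst.
    (object isNil or: [ seen includes: object ]) ifFalse: [
        seen add: object.
        "follow named instance variables"
        1 to: object class instSize do: [ :index |
            pending add: (object instVarAt: index) ].
        "follow indexed slots, but only when they hold pointers (not bytes/words)"
        object class isBits ifFalse: [
            1 to: object basicSize do: [ :index |
                pending add: (object basicAt: index) ] ] ] ].
seen
```

An ImageSegment, as described above, would keep only the objects that are reachable exclusively through the roots and treat the rest as outPointers; the sketch makes no such distinction.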
On Wed, 8 Dec 2010, Mariano Martinez Peck wrote:
> Another important thing is that it is binary. The idea is that maybe in a
> future we have a Monticello that doesn't need a mcz (code) + Compiler, but
> instead, just load binary code (fuel files). This would be muuuch faster
> that current Monticello, and even more, for minimal images, we wouldn't need
> a Compiler.
> Finally, Fuel is designed (like Parcels) to be very fast at loading time :)

Well, I doubt that speed is really important, since you're loading everything at most once. And the current tools are really fast IMHO. The following numbers are from Squeak (times in milliseconds):

    [ Compiler recompileAll ] timeToRun. 29083.
    CompiledMethod allInstances size. 60701.
    [ CompiledMethod allInstancesDo: [ :each | each getSource ] ] timeToRun. 1133.

So the compiler compiles 2087 methods per second on average, and you can load 53575 methods per second from a file on average. If the file is zipped, then it may be a bit slower, say by a factor of 2-3x. So you can still load and compile more than 1800 methods per second. I guess that's fast enough.

Even if Fuel can be 10x faster, it doesn't really make a difference IMHO.

Levente
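Levente's throughput figures can be re-derived from his three measurements. The workspace sketch below assumes that the timeToRun results are milliseconds and takes the pessimistic end of his estimate (a 3x slowdown for zipped sources); the variable names are only illustrative.

```smalltalk
| methods compileMs readMs compileRate readRate zippedReadRate combinedRate |
methods := 60701.      "CompiledMethod allInstances size"
compileMs := 29083.    "[ Compiler recompileAll ] timeToRun"
readMs := 1133.        "time to fetch the source of every method"

compileRate := methods * 1000 / compileMs.   "methods compiled per second"
readRate := methods * 1000 / readMs.         "method sources read per second"

"Assumption: reading zipped source is 3x slower than reading plain source."
zippedReadRate := readRate / 3.

"Reading and compiling happen one after the other,
 so the times per method add up."
combinedRate := 1 / ((1 / compileRate) + (1 / zippedReadRate)).

Transcript
    show: 'compile: ', compileRate rounded printString; cr;
    show: 'read: ', readRate rounded printString; cr;
    show: 'read zipped + compile: ', combinedRate rounded printString; cr.
```

With these inputs the Transcript shows 2087, 53575 and 1869 methods per second, which is consistent with the "more than 1800 methods per second" quoted in the message.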
In reply to this post by tinchodias
Hi Martin,
I took an application for which we use image segments and tested Fuel with it:

- With Fuel, serializing and writing to disk took 330s. The file size is 16.1MB.
- With image segments, saving takes 4s and the file size is 2.4MB.
- When loading with Fuel, I got a low space warning because the primitive newMethod:header: failed.
- Loading with image segments takes 11 seconds.

I wonder what your rationale is for "objects are much more times loaded than stored". In our case it's exactly the other way round. We store very often (like every couple of minutes if there are changes), but only load when we restart an image (which may be after weeks).

HTH,
Adrian
In reply to this post by Levente Uzonyi-2
On Dec 8, 2010, at 22:33 , Levente Uzonyi wrote:
> Even if Fuel can be 10x faster, it doesn't really make a difference IMHO.

It would be interesting to thoroughly profile MC to figure out where it spends all its time (with large projects it gets very, very slow, like several minutes just to show the merge diffs between two branches).

Adrian
In reply to this post by Adrian Lienhard
On Wed, Dec 8, 2010 at 10:33 PM, Adrian Lienhard wrote:
> - With Fuel serializing and writing to disk took 330s. File size is 16.1MB
> - With image segments saving takes 4s and the file size is 2.4MB

But how are you using ImageSegment? Just the primitive? Because in order to compare it to Fuel, you should write all objects, including "outPointers". So you should use #writeForExportOn: or similar... Funny that it takes only 4s with ImageSegment, since it has to do a #become, a full GC mark phase, etc...

> I wonder what your rational is for "objects are much more times loaded than stored".

That's Fuel's purpose. And it is useful for versioning systems, where you may commit once (a specific version, for example) but load hundreds of times.

> In our case its exactly the other way round. We store very often (like every couple of minutes if there are changes), but only load when we restart an image (this may be weeks).

So I guess Fuel is not the best approach for you :)
In reply to this post by Adrian Lienhard
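On Wed, Dec 8, 2010 at 10:38 PM, Adrian Lienhard wrote:
> It would be interesting to thoroughly profile MC to figure where it spends all its time (with large projects it gets very very slow, like several minutes to just show the merge diffs between two branches).

Exactly. That's why I was suggesting that maybe with a fast binary serializer this could be much faster.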
In reply to this post by Adrian Lienhard
On Wed, 8 Dec 2010, Adrian Lienhard wrote:
> It would be interesting to thoroughly profile MC to figure where it spends all its time (with large projects it gets very very slow, like several minutes to just show the merge diffs between two branches).

I guess the days when MC spent minutes doing this are over; it's at most a few seconds for large packages. The 1.5MB Morphic package of Squeak can be compared to another, really old version (changes) in 3 seconds. According to MessageTally, 50% of the time is spent getting the timeStamp of the methods.

Levente
In reply to this post by Levente Uzonyi-2
> Even if Fuel can be 10x faster, it doesn't really make a difference IMHO.

It does!
It seems that you did not work with VW 2.5 and 3.0: when parcels arrived, loading was realllllly a big difference. I do not see why it would not be the same with Fuel.

Stef
In reply to this post by Levente Uzonyi-2
BTW
when giving feedback, consider that the guy doing this is spending a lot of time on it, that this is for his master's, that the code is not yet optimized, and that there is no dedicated primitive in play. So we will see at the end. I was thinking that our little community would be much more positive, but we will continue because we believe that there is some value in this. Stef |
In reply to this post by Mariano Martinez Peck
On Dec 8, 2010, at 22:49 , Mariano Martinez Peck wrote:

> On Wed, Dec 8, 2010 at 10:33 PM, Adrian Lienhard <[hidden email]> wrote:
>
>> Hi Martin,
>>
>> I took some application for which we use image segments to test Fuel
>>
>> - With Fuel serializing and writing to disk took 330s. File size is 16.1MB
>> - With image segments saving takes 4s and the file size is 2.4MB
>
> but how are you using ImageSegment? just the primitive? because in order
> to compare it to Fuel, you should write all objects, including
> "outPointers". So you should use #writeForExportOn: or similar...

Yes, this includes outPointers serialized with a reference stream and writing to disk.

Adrian

BTW, I was just providing the numbers that I gathered when looking at Fuel (to see whether it could be interesting for our use case, to replace image segments). This was not to say that Fuel is not on the right track or anything, but I thought the numbers would be interesting for Martin because they show a real-world use case with a large number of objects. I know that Fuel is in an early stage of development and that it doesn't (yet?) have a primitive/plugin to speed things up. |
In reply to this post by tinchodias
Hi Martin,
Nice project. I noticed that you have a package FuelFameExtension. Is this done for the Fame meta engine? If yes, I would be interested in testing it, especially that in the context of Moose we do load the objects significantly more often than we store them :). Cheers, Doru -- www.tudorgirba.com "Sometimes the best solution is not the best solution." |
In reply to this post by Stéphane Ducasse
On Thu, 9 Dec 2010, Stéphane Ducasse wrote:
> It does!
> It seems that you did not work in VW2.5 and 3.0 and when parcels arrived loading was realllllly a big difference
> I do not see why this would not the same with Fuel.

No, I didn't, but the version number of VW is around 7.x now, so I guess the CPUs and VMs are now several times faster. Does it really matter if it takes 200ms or 20ms to load a package?

Levente

>
> Stef
>
>>> The difference is that in Fuel there is no management for "shared objects".
>>> Fuel is not for swapping (there are no stubs/proxies and becomes), and the
>>> idea is that you write on a file, ALL (not only those objects that are ONLY
>>> accessible by the roots, like in ImageSegment) the objects reachable from
>>> user defined roots.
>>>
>>> Fuel is similar to ReferenceStream and subclasses, a XML serializer, or any
>>> serializers, etc...but MOSTLY, to Parcels. It is very similar to VisualWorks
>>> parcel.
>>>
>>> Another important thing is that it is binary. The idea is that maybe in a
>>> future we have a Monticello that doesn't need a mcz (code) + Compiler, but
>>> instead, just load binary code (fuel files). This would be muuuch faster
>>> that current Monticello, and even more, for minimal images, we wouldn't need
>>> a Compiler.
>>> Finally, Fuel is designed (like Parcels) to be very fast at loading time :)
>>
>> Well, I doubt if speed is really important, since you're loading everything at most once. And the current tools are really fast IMHO. The following numbers are from Squeak:
>> [ Compiler recompileAll ] timeToRun. 29083.
>> CompiledMethod allInstances size. 60701.
>> [ CompiledMethod allInstancesDo: [ :each | each getSource ] ] timeToRun. 1133.
>> So the compiler is compiling 2087 methods per second on average. You can load 53575 methods per second from a file on average. If it's zipped, then it may be a bit slower, say a factor of 2-3x slowdown. So you can still load and compile more than 1800 methods per second. I guess that's fast enough.
>>
>> Even if Fuel can be 10x faster, it doesn't really make a difference IMHO.
>>
>> Levente
>>
>>> Cheers
>>>
>>> Mariano
>>>
>>>> It looks exciting!
>>>>
>>>> Alexandre
|
In reply to this post by Stéphane Ducasse
On Thu, 9 Dec 2010, Stéphane Ducasse wrote:
> BTW
> when giving feedback consider that the guy doing that is spending a lot of time and this will be his master
> and that the code was not optimize and that there is no dedicated primitive in play.
>
> So we will see at the end and I was thinking that our little community would be much more positive but
> we will continue because we believe that there is some value in that.

Don't get me wrong, I'm not saying that Fuel is not useful. I'm saying that improving code loading performance is not that important.

Levente

>
> Stef
|
2010/12/9 Levente Uzonyi <[hidden email]>:
> On Thu, 9 Dec 2010, Stéphane Ducasse wrote:
>
>> BTW
>> when giving feedback consider that the guy doing that is spending a lot of
>> time and this will be his master
>> and that the code was not optimize and that there is no dedicated
>> primitive in play.
>>
>> So we will see at the end and I was thinking that our little community
>> would be much more positive but
>> we will continue because we believe that there is some value in that.
>
> Don't get me wrong, I'm not saying that Fuel is not useful. I'm saying that
> improving code loading performance is not that important.

Sort of. But what is most important, I think, is that you can exchange objects between images. MC really only allows you to exchange source code, while with Fuel, I think you could put any object/data into a binary package and not bother with inventing perverse ways to recreate complex (or big) data structures from array literals :)

Another interesting aspect of a binary format is that you can give binaries to people without disclosing the source code (waving to the corporate world ;)

> Levente
>
>> Stef

--
Best regards,
Igor Stasenko AKA sig. |