ESUG SummerTalk - Fuel, binary object serializer

tinchodias
Hi folks. I am really happy to announce that ESUG is sponsoring me for Fuel development through the ESUG SummerTalk. I am Martin Dias, a student in Buenos Aires, Argentina. The idea behind this SummerTalk is to implement Fuel, a fast, general-purpose binary object graph serializer in Pharo. It is based on ideas from VisualWorks' Parcels.

Actually, the project started several months ago. Tristan Bourgois and I began it during an internship with RMoD, INRIA. A couple of months ago, Mariano Martinez Peck joined the team, and he is now the official mentor for the SummerTalk.

ESUG website for SummerTalk: http://www.esug.org/wiki/pier/Promotion/SummerTalk/SummerTalk2011

The website with all the necessary information is here: http://rmod.lille.inria.fr/web/pier/software/Fuel
It even includes slides explaining the algorithm. In addition, a paper is in progress.

For the moment, Fuel already provides the following features:

- Fast pickle format: materializing is much faster than serializing.
- Correct support for class reshape (when the class of serialized objects has changed).
- Serialization of ANY kind of object. For the moment, there is no object that we know of that we cannot serialize and materialize.
- Complete serialization of classes and traits (not just a global name).
- Support for cycles, without duplicating objects in the graph.
- Integration with Moose, through an extension to export and import its models.
- Detection of globals: for example, if you serialize Transcript, it is not duplicated but managed as a global reference.
- Solutions to common problems such as Set rehash.
- Buffered writing: we use a buffered write stream for the serialization part (thanks Sven!).
- No need for special support from the VM.
- An effort towards good object-oriented design.
- Well tested (about 120 tests, for the moment).
- Large set of benchmarks (including benchmarks for the Moose extension).
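
To make the "cycles", "no duplicates" and "globals" items concrete, here is a rough sketch in Python. It is only an illustration of the general technique (an identity map plus references by name), not Fuel's Pharo code or its pickle format; `Node`, `GLOBALS`, `serialize` and `materialize` are names made up for this example.

```python
# Illustrative only: a toy graph serializer, not Fuel's implementation.
GLOBALS = {"Transcript": object()}  # hypothetical stand-in for Smalltalk globals


class Node:
    """A tiny object holding named references to other objects."""

    def __init__(self, label):
        self.label = label
        self.refs = {}


def serialize(root):
    """Flatten the graph reachable from root into a list of records."""
    global_names = {id(obj): name for name, obj in GLOBALS.items()}
    index = {}    # id(object) -> position in records (the identity map)
    records = []

    def visit(obj):
        if id(obj) in global_names:
            # Globals are written as a name, never copied.
            return ("global", global_names[id(obj)])
        if id(obj) in index:
            # Already seen: emit a back-reference instead of a duplicate.
            return ("ref", index[id(obj)])
        position = len(records)
        index[id(obj)] = position
        records.append(None)  # reserve the slot first so cycles terminate
        fields = {name: visit(child) for name, child in obj.refs.items()}
        records[position] = {"label": obj.label, "refs": fields}
        return ("ref", position)

    return {"root": visit(root), "records": records}


def materialize(data):
    """Rebuild the graph: create all nodes first, then wire the references."""
    nodes = [Node(record["label"]) for record in data["records"]]

    def resolve(pointer):
        kind, value = pointer
        return GLOBALS[value] if kind == "global" else nodes[value]

    for node, record in zip(nodes, data["records"]):
        node.refs = {name: resolve(p) for name, p in record["refs"].items()}
    return resolve(data["root"])


# Two nodes pointing at each other, both pointing at the same global.
a, b = Node("a"), Node("b")
a.refs["next"], b.refs["next"] = b, a
a.refs["log"] = b.refs["log"] = GLOBALS["Transcript"]
copy_of_a = materialize(serialize(a))
```

In this sketch the cycle survives the round trip, each node is written once, and the global is looked up by name on the materializing side instead of being copied.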

And of course, there are a lot of features planned for the future. You can see some of them on the website and some in the issue tracker: http://code.google.com/p/fuel/issues/list

We really appreciate all kinds of feedback and comments. If you want to try it, the website explains how. It is extremely easy.

Once again, I want to thank ESUG very much for sponsoring the project. I plan to create a "news" section on the website with an RSS feed. I will keep you informed.

Best regards,
Martin


Re: [Seaside] ESUG SummerTalk - Fuel, binary object serializer

Tudor Girba-2
Congratulations, Martin!

This is great news. I am looking forward to integrating it even more tightly with Moose :)

Cheers,
Doru


On 24 May 2011, at 22:39, Martin Dias wrote:


--
www.tudorgirba.com

"Being happy is a matter of choice."





Re: [Seaside] Re: ESUG SummerTalk - Fuel, binary object serializer

Mariano Martinez Peck


On Wed, May 25, 2011 at 5:14 PM, Yanni Chiu <[hidden email]> wrote:
> On 24/05/11 4:39 PM, Martin Dias wrote:
>
>> We really appreciate all kind of feedback and comments. If you want to
>> try it, check in the website how to do it. It is extremely easy.
>
> I had a brief look and will look some more. I may try to use it to serialize a Pier kernel.

Hehe. We would LOVE that. In fact, I told Martin a few months ago to do EXACTLY that.
If you give it a try or need help, please let us know.

> In another use case, I'd like to serialize from one image, and deserialize in another image - under end user control. The issue here is that "nasty" code could be introduced: e.g. capture the Fuel output, deserialize, add nasty code, re-serialize, then send onward for import to image. Would it be possible to have some sort of "virus" filter? Maybe something like the Star Trek transporter that can filter out nasty stuff before re-materializing. :) For a start, maybe an inclusion list and/or an exclusion list of classes and globals would be useful.

I guess this should be easy to do. For the moment:

- global objects are hardcoded in #globalNames
- global behaviors (classes and traits) are handled, by default, with a kind of "light" serialization: we serialize only the global name, which means that the class has to be present in Smalltalk globals in the image where you want to materialize.

You can change the default behavior and completely serialize a class/trait. But this is much more complicated and it is still a work in progress (ClassBuilder is not your best friend).
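
A small sketch of what "light" serialization amounts to, written in Python purely as an illustration (it is not Fuel code). The dictionary `REGISTRY` is a hypothetical stand-in for Smalltalk globals, and `serialize_light` / `materialize_light` are invented names.

```python
# Illustrative only: classes travel by name, never by definition.
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


REGISTRY = {"Point": Point}  # hypothetical stand-in for "Smalltalk globals"


def serialize_light(obj):
    """Write only the class name plus the instance state."""
    return {"class": type(obj).__name__, "state": dict(vars(obj))}


def materialize_light(data, registry=REGISTRY):
    """Look the class up by name in the target 'image' and restore the state."""
    name = data["class"]
    if name not in registry:
        raise LookupError(f"class {name!r} is not present in the target image")
    cls = registry[name]
    obj = cls.__new__(cls)  # allocate without running __init__
    vars(obj).update(data["state"])
    return obj


point_copy = materialize_light(serialize_light(Point(1, 2)))
```

The lookup fails when the target side does not already know the class, which is the constraint described above: no code is shipped, so nothing new can be installed by the stream.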

Cheers


--
Mariano
http://marianopeck.wordpress.com


Re: [Seaside] Re: ESUG SummerTalk - Fuel, binary object serializer

Yanni Chiu
On 25/05/11 11:22 AM, Mariano Martinez Peck wrote:
>
> - global behaviors (classes and traits) are handled, by default, with a
> kind of "light" serialization: we serialize only the global name, which
> means that the class has to be present in Smalltalk globals in the
> image where you want to materialize.
>
> You can change the default behavior and completely serialize
> a class/trait. But this is much more complicated and it is still a work in
> progress (ClassBuilder is not your best friend).

Okay. I think all that is needed is a "safe mode" option when
*de-serializing*, which would discard any non-light classes or global
objects that appear in the stream.



Re: [Seaside] Re: ESUG SummerTalk - Fuel, binary object serializer

Mariano Martinez Peck


On Wed, May 25, 2011 at 5:31 PM, Yanni Chiu <[hidden email]> wrote:
> On 25/05/11 11:22 AM, Mariano Martinez Peck wrote:
>
>> - globals behaviors (classes and traits) are managed (by default) in
>> kind of "light" serialization. Where we only serialize the global name
>> which means that the class has to be present in Smalltalk globals in the
>> image that you want to materialize.
>>
>> You can change the default behavior and be able to completely serialize
>> a class/trait. But this is much more complicated and it is still work on
>> process (ClassBuilder is not your best friend).
>
> Okay. I think all that is needed is a "safe mode" option when *de-serializing*, which would discard any non-light classes or global objects that appear in the stream.



Sorry Yanni, I didn't follow. Could you please explain a bit more? What do you want to serialize? Do you want to be able to choose some classes as light and some as non-light? Where do you want to materialize: in the same image or in another one? When you say discard, what would you do with the instances of those non-light classes, for example? You don't materialize them? And what happens to the objects that were pointing to them? What would the scenario be useful for? Security?

Thanks

--
Mariano
http://marianopeck.wordpress.com


Re: [Seaside] Re: Re: ESUG SummerTalk - Fuel, binary object serializer

tinchodias
Hi Yanni,

On Wed, May 25, 2011 at 1:10 PM, Yanni Chiu <[hidden email]> wrote:
> On 25/05/11 11:53 AM, Mariano Martinez Peck wrote:
>
>> Sorry Yanni, I didn't follow. Could you please explain a bit more? what
>> do you want to serialize? do you want to be able to choose some classes
>> as light and some as non-light? where do you want to materialize ? in
>> the same image or in another one ? When you said discard....what would
>> you do with the instances of those non-light classes for example? you
>> don't materialize them? and what happens to the objects that were
>> pointing to them ? why would be the scenario useful for ? security ?
>
> ====
> Yes, security. Here's my first post again, with different formatting:
>
> In another use case, I'd like to serialize from one image, and deserialize in another image - *under end user control*. [e.g. web app]
>
> The issue here is that "nasty" code could be introduced:
> - capture the Fuel output
> - deserialize, add nasty code, re-serialize
> - then send onward for import to image.
>
> Would it be possible to have some sort of "virus" filter?
> ====
>
> So a simple "safe-mode" option on de-serialization would probably be sufficient.

It is a good point.

For the moment, when you deserialize a full class into the image, its methods are created and their bytecodes are copied from the stream without any validation check.

Still, you could deserialize the class, run your own validations, and only then install the class (I mean, add the class to Smalltalk globals, do the class initialization, run announcements).
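
The "validate first, install afterwards" idea, combined with the inclusion list suggested earlier in the thread, could look roughly like the Python sketch below. Everything in it is hypothetical: the record layout (`kind`, `class`, `name`), `ALLOWED_CLASSES`, `find_violations` and `load_in_safe_mode` are invented for illustration and do not describe Fuel's stream format or API.

```python
# Illustrative only. The record layout below is hypothetical, not Fuel's format.
ALLOWED_CLASSES = {"Point", "OrderedCollection"}  # inclusion list chosen by the receiver


def find_violations(records, allowed_classes=ALLOWED_CLASSES):
    """Return a description of every record that safe mode would refuse."""
    violations = []
    for position, record in enumerate(records):
        kind = record.get("kind")
        if kind == "full_class":
            # A full class carries methods and bytecodes: refuse it outright.
            violations.append(f"record {position}: full class {record.get('name')!r}")
        elif kind == "instance":
            if record.get("class") not in allowed_classes:
                violations.append(
                    f"record {position}: class {record.get('class')!r} not allowed"
                )
        else:
            violations.append(f"record {position}: unknown kind {kind!r}")
    return violations


def load_in_safe_mode(records, allowed_classes=ALLOWED_CLASSES):
    """Refuse the whole stream if any record violates the policy."""
    violations = find_violations(records, allowed_classes)
    if violations:
        raise ValueError("unsafe stream: " + "; ".join(violations))
    return records  # a real loader would materialize the records at this point
```

This sketch rejects the whole stream instead of discarding single records. That is a design choice of the example: it avoids having to decide what happens to objects that point at a discarded instance, the question raised earlier in the thread.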

 

