image persistence

David Paola

image persistence

Hi lively kernel folks,

I've spent the past month or so digging around in several language VMs -- CPython, Rubinius, Topaz, PyPy, etc. -- in an attempt to add the equivalent of the original Smalltalk "snapshot" VM primitive. Obviously I have been naive.

I've learned a lot, above all else that I'm not giving up. I have a decent, academic understanding of compilers, interpreters, VMs (and a foggy understanding of JITs), and was curious if anyone could clarify how the lively kernel serializes the world into JSON. Was this hairy? What were the hardest parts?

I realize everyone has a full time job and can't hand-hold a newbie, so any direction at all would be appreciated. I tried to pick apart the Squeak source code but without a background in the Squeak architecture, it was fruitless.

Thanks so much for your energy on lively kernel, I'm looking forward to hearing more and possibly contributing in the future.

-dave

More info:

I realize that the "high level" idea of snapshotting a running VM basically involves serializing the object memory, bytecode, and instruction pointer, and then deserializing that on "resume". Most of the issues I'm encountering lead me to believe I have an incomplete understanding.
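
To make the "high level" idea concrete, here is a minimal sketch of a toy stack machine whose entire state (code, stack, instruction pointer) is plain data, so a snapshot is a single JSON string. The opcodes, the `snapshot`/`resume` names, and the image format are all invented for illustration; a real VM also holds state outside its object memory (native handles, the host stack), which a sketch like this would not capture.

```javascript
// Toy stack machine; opcodes and snapshot format are hypothetical.
function step(vm) {
  const [op, arg] = vm.code[vm.ip++];
  switch (op) {
    case "push": vm.stack.push(arg); break;
    case "add": vm.stack.push(vm.stack.pop() + vm.stack.pop()); break;
    case "mul": vm.stack.push(vm.stack.pop() * vm.stack.pop()); break;
    default: throw new Error("unknown opcode: " + op);
  }
}

// Run until the instruction pointer passes the last instruction.
function run(vm) {
  while (vm.ip < vm.code.length) step(vm);
  return vm.stack[vm.stack.length - 1];
}

// The whole machine state is plain data, so the image is one JSON string.
function snapshot(vm) {
  return JSON.stringify({ code: vm.code, stack: vm.stack, ip: vm.ip });
}

// Resuming is just parsing the image back into a machine state.
function resume(image) {
  return JSON.parse(image);
}

const program = [["push", 2], ["push", 3], ["add"], ["push", 4], ["mul"]];
const vm = { code: program, stack: [], ip: 0 };

step(vm); step(vm); step(vm);      // execute the first three instructions
const image = snapshot(vm);        // freeze the machine mid-computation
const resumed = resume(image);     // this could happen in another process
const result = run(resumed);       // finishes (2 + 3) * 4, i.e. 20
```

The resumed machine continues from instruction 3 with `[5]` on its stack, independent of the original `vm` object.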

--
Dave Paola

_______________________________________________
lively-kernel mailing list
[hidden email]
http://lists.hpi.uni-potsdam.de/listinfo/lively-kernel

Robert Krahn-4

Re: image persistence

Hi, Dave --

Thanks for the question, this is actually a really fascinating topic :)

First, we wrote up some general information about it (which lets you try things out interactively ;) here:

http://lively-kernel.org/repository/webwerkstatt/documentation/Serialization-Overview.xhtml

and here:

http://lively-kernel.org/repository/webwerkstatt/documentation/Serialization-Interface.xhtml

I guess the "hairy" part was/is how to deal with "native" objects. JS browser environments introduce functions and state that are not implemented / represented in the JS context but hidden. The DOM and DOM nodes are an example of that -- you cannot get or modify all the state that would be necessary to capture or reestablish a document / world.

The solution that we came up with, and that works very well, is to implement a general JS serializer that walks an object graph starting from root objects. When certain objects are encountered - e.g. DOM nodes - we make an exception (this is what the serialization plugins mentioned in the worlds linked above are for) and store not their full object representation but just "what we need to know".

The creation of objects from a serialization works accordingly: create/instantiate objects + run custom init code for the "exceptions".
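
The approach described above can be sketched roughly as follows. This is not Lively's actual serializer or plugin API: `NativeNode`, `nativeNodePlugin`, the `handles`/`serialize`/`deserialize` hooks, and the registry format are all hypothetical, and `NativeNode` merely stands in for a DOM node so the example runs outside a browser. The sketch walks the graph from a root, gives every object an id so shared and cyclic references survive, and lets a plugin store only "what we need to know" for the exceptions.

```javascript
// Hypothetical stand-in for a "native" object such as a DOM node.
class NativeNode {
  constructor(tag) {
    this.tag = tag;
    this.hidden = Symbol("engine state"); // JSON.stringify omits symbol values
  }
}

// Hypothetical plugin: stores only what is needed to rebuild the object.
const nativeNodePlugin = {
  name: "NativeNode",
  handles: (obj) => obj instanceof NativeNode,
  serialize: (node) => ({ tag: node.tag }),
  deserialize: (data) => new NativeNode(data.tag), // the custom init code
};

function serialize(root, plugins) {
  const ids = new Map(); // object -> id
  const registry = {};   // id -> flat record
  function visit(value) {
    // Primitives are stored inline (functions are later dropped by JSON).
    if (value === null || typeof value !== "object") return value;
    if (ids.has(value)) return { ref: ids.get(value) };
    const id = ids.size;
    ids.set(value, id);
    const plugin = plugins.find((p) => p.handles(value));
    if (plugin) {
      registry[id] = { plugin: plugin.name, data: plugin.serialize(value) };
    } else {
      const props = {};
      registry[id] = { array: Array.isArray(value), props };
      for (const key of Object.keys(value)) props[key] = visit(value[key]);
    }
    return { ref: id };
  }
  return JSON.stringify({ root: visit(root), registry });
}

function deserialize(json, plugins) {
  const { root, registry } = JSON.parse(json);
  const objects = {};
  // Pass 1: instantiate every object; plugins rebuild the "exceptions".
  for (const [id, rec] of Object.entries(registry)) {
    if (rec.plugin) {
      const plugin = plugins.find((p) => p.name === rec.plugin);
      objects[id] = plugin.deserialize(rec.data);
    } else {
      objects[id] = rec.array ? [] : {};
    }
  }
  // Pass 2: patch references now that every target exists.
  const resolve = (v) =>
    v !== null && typeof v === "object" ? objects[v.ref] : v;
  for (const [id, rec] of Object.entries(registry)) {
    if (rec.plugin) continue;
    for (const [key, v] of Object.entries(rec.props)) {
      objects[id][key] = resolve(v);
    }
  }
  return resolve(root);
}

// Sample graph with a shared native object and a cycle.
const node = new NativeNode("div");
const world = { name: "world", morphs: [{ shape: node }, { shape: node }] };
world.self = world;

const plugins = [nativeNodePlugin];
const copy = deserialize(serialize(world, plugins), plugins);
```

After the round trip, `copy.self` is `copy` again and both `shape` properties point at one rebuilt `NativeNode`. As a simplification, plain objects lose their prototypes here; a fuller version would need to record class information as well.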

The shortcomings of this approach are the following:

- On the application development level you still need to be a bit careful about what objects you reference. Direct pointers to DOM nodes, for example, won't break the serialization, but when you deserialize you need custom init logic to make things work as expected again.

- The stored representations become big (x-xxx MBs) really quickly. Implementing optimizations using the plugin approach is possible but requires additional work.

This deals with the "state" of a JS application / Lively world. Another point that you mention is capturing running computations. From a certain level of abstraction this is actually the same thing, but since JS has incomplete metaprogramming capabilities (you are not able to reflect on closures, e.g.) the "hidden state" problem comes up again. For Lively this has little practical impact since in the "reactive" browser environment Lively doesn't have to implement a "main" function. Anyway, we dealt with the problem and came up with a solution. I will describe that in an upcoming post.
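
The closure limitation can be seen in a few lines of plain JavaScript. This only illustrates the problem, not the solution referred to above; `makeCounter` and the other names are made up for the example.

```javascript
function makeCounter() {
  let count = 0; // lives only in the closure, invisible from outside
  return { increment: () => ++count };
}

const counter = makeCounter();
counter.increment();
counter.increment();

// JSON.stringify skips function-valued properties entirely.
const asJson = JSON.stringify(counter); // "{}"

// The source text is available, but the captured value of `count` is not.
const source = counter.increment.toString();

// Re-creating the function from its source loses the binding for `count`.
// `new Function` evaluates in the global scope, where `count` is undefined.
const revived = new Function("return (" + source + ")")();
let error = null;
try {
  revived();
} catch (e) {
  error = e; // a ReferenceError: `count` is not defined
}
```

The function's code can be stored as text, but the environment it closed over (here, `count === 2`) cannot be read through standard reflection, so a generic graph walker has nothing to serialize for it.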

Please let me know if you have questions or want a more technical answer.

Best,

Robert

(In reply to David Paola's message of Mon, Mar 11, 2013 at 5:51 PM.)


Robert Krahn-4

Re: image persistence

Btw., the most influential work for Lively's persistence mechanism comes from Self; see the excellent paper "Annotating Objects for Transport to Other Worlds".

(In reply to Robert Krahn's message of Tue, Mar 12, 2013 at 5:34 PM.)


David Paola

Re: image persistence

Thanks, both of you, for your answers :-) That Self paper is *exactly* what I am looking for.

Happy hacking!

Dave Paola

(In reply to Robert Krahn's message of Mar 12, 2013, at 6:47 PM.)


Casey Ransberger-2

Re: image persistence

Actually that paper answered a lot of questions I had about stuff that I want to be able to do in Squeak eventually. I thought I'd read all of the Self papers, but it looks like I missed this one. Thanks for the question, Dave, and thanks for the answer, Robert!

(In reply to David Paola's message of Wed, Mar 13, 2013 at 3:45 PM.)


David Paola

Re: image persistence

Indeed -- what other Self papers would you recommend?

Dave Paola

(In reply to Casey Ransberger's message of Mar 13, 2013, at 6:52 PM.)