Monticello and Fuel

Eliot Miranda-2

Re: Monticello and Fuel

Hi Max,

On Sat, Dec 6, 2014 at 1:43 AM, Max Leske <[hidden email]> wrote:

On 06 Dec 2014, at 04:37, Eliot Miranda <[hidden email]> wrote:

On Thu, Dec 4, 2014 at 11:35 PM, Max Leske <[hidden email]> wrote:

> On 05 Dec 2014, at 08:02, Norbert Hartl <[hidden email]> wrote:
>
>
>
>
>> On 04.12.2014 at 23:31, Thierry Goubier <[hidden email]> wrote:
>>
>> Hi all,
>>
>> I'm just wondering.
>>
>> Would it work to have a package format based on Fuel?
>>
> I doubt it would work cross-platform. I don't know how Fuel serializes WideString, LargePositiveInteger, BoxedFloat64. These differ between Smalltalk platforms. Source as a string solves that: strings are written as Unicode strings, numbers in a defined number format, etc.

True. Cross-dialect loading isn’t something we encourage with Fuel.

@Thierry
You said something about partial loading: that will never be possible with Fuel because of its pickle format. To select any partial graph you first have to read the entire file and rebuild the graph (OK, you don’t strictly have to build the graph, but you still need to read the entire file).

Hi Max, you misunderstand what partial loading means in this context. I designed and implemented partial loading for the Parcel system in VW, and Fuel is essentially a clean reimplementation of parcels. The idea is that one does indeed read the entire object graph, but then only the parts of the graph that mate with the current class hierarchy are installed, and the bits that don't fit are stored for later. Why? So that a component can define extensions on classes in components that may not be present. Why? So that one can maintain a single logical component in a single package instead of decomposing it into independently loadable fragments.
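
A minimal sketch of the partial installation idea described above, written in Python rather than Smalltalk. Everything here is hypothetical (the `Image` and `Extension` types, the `serializeOn:` selector, the class names); it is not how Parcels or Fuel are implemented. It only illustrates the flow: read every extension, install the ones whose target class exists, park the rest, and install the parked ones when their class arrives.

```python
from dataclasses import dataclass, field


@dataclass
class Extension:
    """One extension method shipped by a component (hypothetical type)."""
    target_class: str  # name of the class the method is defined on
    selector: str      # method name
    source: str        # method source, treated as opaque text here


@dataclass
class Image:
    """Hypothetical stand-in for a running image."""
    classes: dict = field(default_factory=dict)  # class name -> {selector: source}
    pending: list = field(default_factory=list)  # extensions waiting for their class

    def load_component(self, extensions):
        """Read every extension; install those that fit, park the others."""
        for ext in extensions:
            if ext.target_class in self.classes:
                self.classes[ext.target_class][ext.selector] = ext.source
            else:
                self.pending.append(ext)

    def add_class(self, name):
        """Define a class, then install any parked extensions that now fit."""
        self.classes[name] = {}
        still_pending = []
        for ext in self.pending:
            if ext.target_class == name:
                self.classes[name][ext.selector] = ext.source
            else:
                still_pending.append(ext)
        self.pending = still_pending


image = Image()
image.add_class("Object")  # headless base: no GUI classes yet

serializer_component = [
    Extension("Object", "serializeOn:", "..."),
    Extension("Morph", "serializeOn:", "..."),  # GUI class, absent for now
]
image.load_component(serializer_component)
pending_before_gui = [ext.target_class for ext in image.pending]

image.add_class("Morph")  # loading the GUI later installs the parked extension
```

The component is loaded once and stays a single unit; only the installation of the GUI-related extension is deferred.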

Makes sense. It’s just that I often get asked if it is possible to read / update only parts of a Fuel file, so I wanted to make it clear that that can’t work.

Right, good point. But is there a use case? For example, in effect is there a difference between being able to patch a part of a Fuel file and loading that file into an image (possibly automatically), patching the object graph and saving a new version?

Let's take an example like Fuel itself. This may have specialised marshalling and unmarshalling extensions defined on many classes, some of which may be to do with the GUI. If we have partial loading we can load Fuel into a headless image. The extensions on the GUI classes will simply not be installed, *until* we load the GUI. Hence Fuel does not have to be decomposed every time we factor the system into subcomponents. Without partial loading Fuel must be decomposed into a series of fragments so that it can load the part that fits into the headless base, and we have to manage the dependency to ensure the fragment that loads against the GUI is loaded when the GUI is loaded. Worse still, if we cut the GUI into two (e.g. development vs deployment tools) we have to visit the Fuel GUI component (and potentially many other components) and decompose it into two pieces. In practice this is extremely costly to maintain, verging on chaos. With partial loading things are simple.

Perhaps I should have called it partial installation, but you get the idea.

Thanks for the explanation.

>
> Norbert
>
>> Would that make loading faster?
>>
>> Does it already exist?
>>
>> Thanks,
>>
>> Thierry
>

--
best,
Eliot


Max Leske

Re: Monticello and Fuel

On 07 Dec 2014, at 04:19, Eliot Miranda <[hidden email]> wrote:

Hi Max,

Hi Eliot,

Right, good point. But is there a use case? For example, in effect is there a difference between being able to patch a part of a Fuel file and loading that file into an image (possibly automatically), patching the object graph and saving a new version?

Yes, performance seems to be the main use case. Fuel files of hundreds of MB take a long time to load and save, even if you only change a single string. One instance where this is relevant is the Moose model. They currently use .mse files, which use a textual representation (if I’m not mistaken). Usman asked me if Fuel were an option, especially for updating only parts of the model.


Eliot Miranda-2

Re: Monticello and Fuel

Hi Max,

On Dec 7, 2014, at 5:14 AM, Max Leske <[hidden email]> wrote:

Yes, performance seems to be the main use case. Fuel files of hundreds of MB take a long time to load and save, even if you only change a single string. One instance where this is relevant is the Moose model. They currently use .mse files, which use a textual representation (if I’m not mistaken). Usman asked me if Fuel were an option, especially for updating only parts of the model.

OK, but how could patching the file be much faster? The file has to be parsed into something; why not the objects themselves? To me, loading, editing and re-saving is patching, and it is done as fast as Fuel can do things.

Depending on syntax you may be able to scan for a string and replace it using e.g. the sed stream editor. But that'll only work for simple data like strings.
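
A small sketch of the two approaches, using Python's `pickle` module purely as a stand-in for a binary object serializer. Fuel's actual format is different, and the `model` dictionary is invented for illustration. The first half patches by loading, editing and re-saving; the second half does a sed-style byte replacement, which is only safe here because the new string has exactly the same length as the old one.

```python
import pickle

# Hypothetical model; stands in for a large serialized object graph.
model = {"name": "Moose", "entities": ["ClassA", "ClassB"]}
blob = pickle.dumps(model, protocol=2)

# Approach 1: load, edit the object graph, save again.
loaded = pickle.loads(blob)
loaded["entities"][0] = "ClassRenamed"
patched_blob = pickle.dumps(loaded, protocol=2)
patched_model = pickle.loads(patched_blob)

# Approach 2: sed-style replacement on the raw bytes.
# Binary formats usually store a length prefix before each string, so this
# only stays valid when the replacement has the same byte length.
raw_patched = blob.replace(b"ClassA", b"ClassX")
raw_model = pickle.loads(raw_patched)
```

Replacing `ClassA` with a longer or shorter name in the raw bytes would leave the stored length prefix out of step with the data, which is why byte-level patching does not generalise beyond simple cases.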

If you really need this to be faster have you considered

- decomposing the Moose model into smaller individual components?

- keeping the Moose model to be saved in an image (possibly running live in a server), patching the model in the image and saving it

- using Spur [ ;-) ]. The Pharo tests now run and it should be significantly faster, at least at loading

--
best,
Eliot

Eliot Miranda-2

Re: Monticello and Fuel

In reply to this post by Max Leske

Hi Max,

oops, I was being dense...

On Dec 7, 2014, at 5:14 AM, Max Leske <[hidden email]> wrote:

Yes, performance seems to be the main use case. Fuel files of hundreds of MB take a long time to load and save, even if you only change a single string. One instance where this is relevant is the Moose model. They currently use .mse files, which use a textual representation (if I’m not mistaken). Usman asked me if Fuel were an option, especially for updating only parts of the model.

Ah ok. Then yes, I suspect Fuel will be faster, but that patching is more complex because it can't be done in a text editor. How do they edit the .mse file? IME text editors suck on large files. What are the relative frequencies of writing, reading and editing these files?

--
best,
Eliot

Max Leske

Re: Monticello and Fuel

On 07 Dec 2014, at 19:24, Eliot Miranda <[hidden email]> wrote:

Hi Max,

oops, I was being dense...

Ah ok. Then yes, I suspect Fuel will be faster, but that patching is more complex because it can't be done in a text editor. How do they edit the .mse file? IME text editors suck on large files. What are the relative frequencies of writing, reading and editing these files?

I don’t think they can patch the .mse files. That’s exactly why they are looking for alternatives.

I have no clue about the read / write frequencies they need. I’m CC’ing this to Usman; maybe he can comment on that.


Chris Muller-3

Re: Monticello and Fuel

In reply to this post by Eliot Miranda-2

> Yes, performance seems to be the main use case. Fuel files of hundreds of MB
> take a long time to load and save, even if you only change a single string.
> One instance where this is relevant is the Moose model. They currently use
> .mse files, which use a textual representation (if I’m not mistaken). Usman
> asked me if Fuel were an option, especially for updating only parts of the
> model.
>
>
> Ok, but how could patching the file be much faster? The file has to be
> parsed into something; why not the objects themselves? To me loading,
> editing and re-saving is patching and done as fast as Fuel can do things.
>
> Depending on syntax you may be able to scan for a string and replace using
> e.g. the sed stream editor. But that'll only work for simple data like
> strings.
>
> If you really need this to be faster have you considered
>
> - decomposing the Moose model into smaller individual components?
>
> - keeping the Moose model to be saved in an image (possibly running live in
> a server) and patching the model in the image and saving it

This is a classic problem; as models start to get large they get
harder to update. If there were some way to update the file with just
the parts in memory that changed since the last save, it would perform
a lot faster for the end user.

Of course, whatever makes models large also tends to make them
more interesting to more users, but shareability moves inversely to
size. Some kind of server is needed to serve up the Fuel file
and use the above to dish out whatever portions of the file are needed by
the user for the part of the model they're accessing. When they save,
only the parts of the model they changed, nothing more, are sent back
to the server, which performs a random-access update to the Fuel file
in a controlled, synchronous manner.
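
A rough sketch of what such a random-access update could look like, assuming (and this is a large assumption) that the file is laid out as fixed-size record slots. Real object graphs have variable-size objects that point at each other, which is exactly what makes this hard for a pickle-style format, so the layout, `RECORD_SIZE` and the helper names below are hypothetical.

```python
import io

RECORD_SIZE = 32  # bytes per slot; a fixed size keeps every offset computable


def write_record(stream, index, text):
    """Overwrite one slot in place without touching the others."""
    data = text.encode("utf-8")
    if len(data) > RECORD_SIZE:
        raise ValueError("record too large for its slot")
    stream.seek(index * RECORD_SIZE)
    stream.write(data.ljust(RECORD_SIZE, b"\x00"))


def read_record(stream, index):
    """Read one slot and strip its padding."""
    stream.seek(index * RECORD_SIZE)
    return stream.read(RECORD_SIZE).rstrip(b"\x00").decode("utf-8")


def apply_changes(stream, changes):
    """Apply only the changed slots; returns the number of bytes written."""
    for index, text in changes.items():
        write_record(stream, index, text)
    return len(changes) * RECORD_SIZE


# An in-memory stream stands in for the file held by the server.
store = io.BytesIO()
for slot, name in enumerate(["ClassA", "ClassB", "ClassC"]):
    write_record(store, slot, name)

# A client sends back only the one record it changed.
bytes_written = apply_changes(store, {1: "ClassRenamed"})
```

Only one slot is rewritten, so the cost of a save depends on how much changed rather than on the size of the whole file.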

Although, even with that, I guess users could still overlay other users'
work very easily. Some kind of concurrency control mechanism should
look at the changes being made and signal a warning if that happens.
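
One common way to get such a warning is optimistic concurrency control with per-record version numbers. The sketch below is a minimal illustration under that assumption; `VersionedStore` and its methods are hypothetical and not part of Fuel or Moose.

```python
import warnings


class VersionedStore:
    """Hypothetical store that rejects saves based on a stale version."""

    def __init__(self):
        self._records = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); version 0 means the key does not exist yet."""
        return self._records.get(key, (0, None))

    def save(self, key, base_version, value):
        """Save only if nobody else saved since base_version was read."""
        current_version, _ = self._records.get(key, (0, None))
        if base_version != current_version:
            warnings.warn(
                f"{key!r} changed since version {base_version}; save rejected"
            )
            return False
        self._records[key] = (current_version + 1, value)
        return True


store = VersionedStore()
store.save("ClassA", 0, "initial")

# Two users read the same version of the record.
alice_version, _ = store.read("ClassA")
bob_version, _ = store.read("ClassA")

alice_ok = store.save("ClassA", alice_version, "alice's edit")

# The second save is based on a version that is now out of date.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    bob_ok = store.save("ClassA", bob_version, "bob's edit")
```

The second user is warned instead of silently replacing the first user's change, and can re-read the record and retry.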