
Re: large images


Eliot Miranda-2

Re: large images

Hi Both,

let me try again :-/

> On Oct 8, 2016, at 10:49 AM, Tudor Girba <[hidden email]> wrote:
>
> Nice!
>
> I put Eliot in CC :).
>
> @Eliot: John was playing with some large images and I asked him if he could save/load to see what happens. The report is below. Nice job :).
>
> Doru
>
>
>> On Oct 8, 2016, at 7:48 PM, John Brant <[hidden email]> wrote:
>>
>> I loaded my model (6.8GiB on linux), saved the image (7.2GB), and started the image. It all worked. It took a few minutes to load the image, but it worked. It takes ~15 seconds to quit an image that large. I'm not sure what quit is doing, but it appears to be dependent on the image size.

Regrettably the squeak vm does a full gc on snapshot, and then, unavoidably, it does a scan of all contexts in the heap, changing any with machine code pcs to have bytecode pcs so that the image can be restarted on a different version/platform. It then writes the heap segments to the file.

I guess the scan could be folded into the gc. This is a long time to wait!

The gc makes sense only as a way of voiding new space; image loading is simplified by only having to load old space segments. It does /not/ make sense semantically because any finalization actions it triggers cannot be responded to until after the snapshot.

Instead, it makes much more sense for the image to invoke a full gc immediately prior to snapshot, and drain the finalization queue (something I guess will happen implicitly due to finalization process priority). This means that expected finalization activities such as flushing and closing output files will actually take place. The existing architecture effectively throws these actions away.

Both squeak and pharo communities could do well to discuss this and agree on an approach.

>>
>>
>> John Brant

David T. Lewis

Re: [squeak-dev] Re: large images

Hi,

Let's measure where the time is going before we fix it. I have saved V3 images
of that size and more, and it takes a very long time to write the image file
independent of any GC or finalization actions that need to happen. We should
measure this on a large Spur image save, but I suspect that the dominant factor
will turn out to be the time that it takes to flush all those gigabytes out
to storage. Garbage collection and finalization actions might be just round-off
error.

Dave

On Fri, Oct 14, 2016 at 08:29:13AM -0700, Eliot Miranda wrote:

> Hi Both,
>
> let me try again :-/
>
> Regrettably the squeak vm does a full gc on snapshot, and then, unavoidably, it does a scan of all contexts in the heap, changing any with machine code pcs to have bytecode pcs so that the image can be restarted on a different version/platform. It then writes the heap segments to the file.
>
> I guess the scan could be folded into the gc. This is a long time to wait!
>
> The gc makes sense only as a way of voiding new space; image loading is simplified by only having to load old space segments. It does /not/ make sense semantically because any finalization actions it triggers cannot be responded to until after the snapshot.
>
> Instead, it makes much more sense for the image to invoke a full gc immediately prior to snapshot, and drain the finalization queue (something I guess will happen implicitly due to finalization process priority). This means that expected finalization activities such as flushing and closing output files will actually take place. The existing architecture effectively throws these actions away.
>
> Both squeak and pharo communities could do well to discuss this and agree on an approach.

John Brant-2

Re: [squeak-dev] Re: large images

The time I was talking about was the time to quit the image without
saving. This time seems to be dependent on the size of the image. For
example, a 100MB image quits almost immediately, but a couple GB one
takes 5 or so seconds and my 7GB one was taking ~15 seconds.

John Brant

On 10/14/2016 05:34 PM, David T. Lewis wrote:

> Hi,
>
> Let's measure where the time is going before we fix it. I have saved V3 images
> of that size and more, and it takes a very long time to write the image file
> independent of any GC or finalization actions that need to happen. We should
> measure this on a large Spur image save, but I suspect that the dominant factor
> will turn out to be the time that it takes to flush all those gigabytes out
> to storage. Garbage collection and finalization actions might be just round-off
> error.
>
> Dave

David T. Lewis

Re: [squeak-dev] Re: large images

Ah, right - sorry for my misunderstanding.

Certainly it would make sense to eliminate the GC if we are going to quit
the image anyway. Processing the shutdown list and finalization queue probably
makes sense as a matter of policy, even if in practice these actions are
rarely needed. For cases in which, for example, database sessions are active,
it might be very important to ensure clean shutdown when quitting the image.

Dave

On Fri, Oct 14, 2016 at 11:08:56PM -0500, John Brant wrote:

> The time I was talking about was the time to quit the image without
> saving. This time seems to be dependent on the size of the image. For
> example, a 100MB image quits almost immediately, but a couple GB one
> takes 5 or so seconds and my 7GB one was taking ~15 seconds.
>
>
> John Brant