Smalltalk › Squeak › Squeak - Dev

Files-ul.154 seems dangerous, is it needed?

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

9 messages Options

David T. Lewis

Files-ul.154 seems dangerous, is it needed?

We did this change recently:

Name: Files-ul.154
Time: 15 May 2016, 11:17:08.015805 pm

- use #adoptInstance: to convert between binary and ascii mode instead of creating new collections

This assumes that we can convert a ByteString to a ByteArray using
Behavior>>adoptInstance. If the conversion does not work, as it the case
with a V3 image and Cog or interpreter VM, then we end up with a corrupt
image after trying to save the image as new version (and maybe other cases
as well).

It does not seem that #ascii and #binary are called frequently, even when
doing Monticello updates, so I suspect that the optimization is not very
important.

Do we need the #ascii and #binary optimizations in Files-ul.154?

Dave

Levente Uzonyi

Re: Files-ul.154 seems dangerous, is it needed?

Hi Dave,

It's an optimization, so it's not mandatory. Of course it's nice to have,
and #adoptInstance: should work (as it does in Spur).
So the question is: why doesn't it work?

The answer is that in V3 ByteString is compact, but ByteArray is not.
Since the compact classes array has plenty of space, I suggest ByteArray
should be compact as well.
If you evaluate [ByteArray becomeCompact] in a V3 image, then you'll find
that the change will work in those images, too.

I've mentioned it earlier that Spur made some optimizations possible, that
are not V3 compatible. Another widely used example is using #== for
Character comparison. These will of course not work in V3 images, and
detecting the problems due to these changes is extremely hard.

Levente

On Mon, 30 May 2016, David T. Lewis wrote:

> We did this change recently:
>
> Name: Files-ul.154
> Time: 15 May 2016, 11:17:08.015805 pm
>
> - use #adoptInstance: to convert between binary and ascii mode instead of creating new collections
>
> This assumes that we can convert a ByteString to a ByteArray using
> Behavior>>adoptInstance. If the conversion does not work, as it the case
> with a V3 image and Cog or interpreter VM, then we end up with a corrupt
> image after trying to save the image as new version (and maybe other cases
> as well).
>
> It does not seem that #ascii and #binary are called frequently, even when
> doing Monticello updates, so I suspect that the optimization is not very
> important.
>
> Do we need the #ascii and #binary optimizations in Files-ul.154?
>
> Dave
>
>

David T. Lewis

Re: Files-ul.154 seems dangerous, is it needed?

On Tue, May 31, 2016 at 12:35:12AM +0200, Levente Uzonyi wrote:

> Hi Dave,
>
> It's an optimization, so it's not mandatory. Of course it's nice to have,
> and #adoptInstance: should work (as it does in Spur).
> So the question is: why doesn't it work?
>
> The answer is that in V3 ByteString is compact, but ByteArray is not.
> Since the compact classes array has plenty of space, I suggest ByteArray
> should be compact as well.
> If you evaluate [ByteArray becomeCompact] in a V3 image, then you'll find
> that the change will work in those images, too.

Brilliant! Yes, [ByteArray becomeCompact] makes it work.

It still fails on interpreter VM (VM crash), but I think there is a bug
in the interpreter primitiveAdoptInstance so that is not directly related,
so that would be a separate issue.

>
> I've mentioned it earlier that Spur made some optimizations possible, that
> are not V3 compatible. Another widely used example is using #== for
> Character comparison. These will of course not work in V3 images, and
> detecting the problems due to these changes is extremely hard.
>
> Levente

Not really so extremely hard, I noticed it because of failures on
http://build.squeak.org/job/FollowTrunkOnOldV3Image/

Overall, I am quite happy to see that the Spur/V3 differences remain
isolated to 4 packages for trunk. The one glitch that I noticed here
affects the Files package, but this is the first such case that I have
seen since the Spur conversion. So this seems really good to me.

Dave

>
> On Mon, 30 May 2016, David T. Lewis wrote:
>
> >We did this change recently:
> >
> > Name: Files-ul.154
> > Time: 15 May 2016, 11:17:08.015805 pm
> >
> > - use #adoptInstance: to convert between binary and ascii mode instead of
> > creating new collections
> >
> >This assumes that we can convert a ByteString to a ByteArray using
> >Behavior>>adoptInstance. If the conversion does not work, as it the case
> >with a V3 image and Cog or interpreter VM, then we end up with a corrupt
> >image after trying to save the image as new version (and maybe other cases
> >as well).
> >
> >It does not seem that #ascii and #binary are called frequently, even when
> >doing Monticello updates, so I suspect that the optimization is not very
> >important.
> >
> >Do we need the #ascii and #binary optimizations in Files-ul.154?
> >
> >Dave
> >
> >

Chris Muller-3

Re: Files-ul.154 seems dangerous, is it needed?

In reply to this post by Levente Uzonyi

> It's an optimization, so it's not mandatory. Of course it's nice to have,

An optimization of what? Execution speed? Is it such an optimization
that any human could ever possibly notice? Setting the mode of a
stream, really?

I like most of your optimizations but, IMO, this one is so
infinitesmal, I don't think it exceeds the cost of the decreased
legibility / complexity..

Levente Uzonyi

Re: Files-ul.154 seems dangerous, is it needed?

On Tue, 31 May 2016, Chris Muller wrote:

>> It's an optimization, so it's not mandatory. Of course it's nice to have,
>
> An optimization of what? Execution speed? Is it such an optimization
> that any human could ever possibly notice? Setting the mode of a
> stream, really?

I suspect that you expect one to set mode at most once per stream,
which is not always the case.
I could easily give you an artificial benchmark showing 30x speedup, but
that would make no sense.
It's a library method and as such, why not make it quicker when possible?

>
> I like most of your optimizations but, IMO, this one is so
> infinitesmal, I don't think it exceeds the cost of the decreased
> legibility / complexity..

Please explain the legibility / complexity part.

Levente

>
>

Chris Muller-4

Re: Files-ul.154 seems dangerous, is it needed?

You're right, I've never imagined a system that would need to be
flipping the mode back and forth. In that case, adoptInstance is
quicker, but since we've had to special-case it, its certainly more
"complex" and the fact that swapping out an objects class is not a
common thing to do in Smalltalk, it seems fair to say its less
legible. That's not an objection, though. :)

On Tue, May 31, 2016 at 11:12 AM, Levente Uzonyi <[hidden email]> wrote:

> On Tue, 31 May 2016, Chris Muller wrote:
>
>>> It's an optimization, so it's not mandatory. Of course it's nice to have,
>>
>>
>> An optimization of what? Execution speed? Is it such an optimization
>> that any human could ever possibly notice? Setting the mode of a
>> stream, really?
>
>
> I suspect that you expect one to set mode at most once per stream, which is
> not always the case.
> I could easily give you an artificial benchmark showing 30x speedup, but
> that would make no sense.
> It's a library method and as such, why not make it quicker when possible?
>
>>
>> I like most of your optimizations but, IMO, this one is so
>> infinitesmal, I don't think it exceeds the cost of the decreased
>> legibility / complexity..
>
>
> Please explain the legibility / complexity part.
>
> Levente
>
>>
>>
>

David T. Lewis

Re: Files-ul.154 seems dangerous, is it needed?

In reply to this post by Levente Uzonyi

On Tue, May 31, 2016 at 06:12:53PM +0200, Levente Uzonyi wrote:

> On Tue, 31 May 2016, Chris Muller wrote:
>
> >>It's an optimization, so it's not mandatory. Of course it's nice to have,
> >
> >An optimization of what? Execution speed? Is it such an optimization
> >that any human could ever possibly notice? Setting the mode of a
> >stream, really?
>
> I suspect that you expect one to set mode at most once per stream,
> which is not always the case.
> I could easily give you an artificial benchmark showing 30x speedup, but
> that would make no sense.
> It's a library method and as such, why not make it quicker when possible?
>
> >
> >I like most of your optimizations but, IMO, this one is so
> >infinitesmal, I don't think it exceeds the cost of the decreased
> >legibility / complexity..
>
> Please explain the legibility / complexity part.
>

I don't think there is any problem with complexity or legibility here.
The original use of "collection asByteArray" is very readable. The faster
"ByteArray adoptInstance: collection" may require a little more thought,
but both are clear enough in this context.

That said, in the case of setting a stream to #binary or #ascii, the
#adoptInstance: approach does seem fragile compared to #asByteArray.
If Behavior>>adoptInstance: fails, the fallback code attempts to use
Object>>primitiveChangeClassTo: which also fails, this time with no fallback
path. Thus if the primitives for changing the class of an object fail,
there is no fallback and it may no longer be possible to write a new binary
file (such as new image and changes files).

Another way to look at it - if the optimization is worthwhile, maybe it
belongs in the Collections package, which already needs to be different
for Spur versus V3.

For Spur, shouldn't it just be this:

ByteString>>asByteArray
ByteArray adoptInstance: self

I did not really test this, but it should give us the same performance
optimization in a more general way.

Dave

Levente Uzonyi

Re: Files-ul.154 seems dangerous, is it needed?

Hi Dave

On Tue, 31 May 2016, David T. Lewis wrote:

> That said, in the case of setting a stream to #binary or #ascii, the
> #adoptInstance: approach does seem fragile compared to #asByteArray.
> If Behavior>>adoptInstance: fails, the fallback code attempts to use
> Object>>primitiveChangeClassTo: which also fails, this time with no fallback
> path. Thus if the primitives for changing the class of an object fail,
> there is no fallback and it may no longer be possible to write a new binary
> file (such as new image and changes files).

You would still get a debugger in such case where you would have a chance
to fix/work around the problem, wouldn't you?

>
> Another way to look at it - if the optimization is worthwhile, maybe it
> belongs in the Collections package, which already needs to be different
> for Spur versus V3.
>
> For Spur, shouldn't it just be this:
>
> ByteString>>asByteArray
> ByteArray adoptInstance: self
>
> I did not really test this, but it should give us the same performance
> optimization in a more general way.

This would significantly change the behavior, because currently you know
that a copy will be returned.
We might as well use a different selector (#toByteArray or #beByteArray),
but I don't know if that would help V3 images.

Levente

>
> Dave
>
>
>

David T. Lewis

Re: Files-ul.154 seems dangerous, is it needed?

Hi Levente,

On Wed, Jun 01, 2016 at 10:40:09AM +0200, Levente Uzonyi wrote:

> Hi Dave
>
> On Tue, 31 May 2016, David T. Lewis wrote:
>
> >That said, in the case of setting a stream to #binary or #ascii, the
> >#adoptInstance: approach does seem fragile compared to #asByteArray.
> >If Behavior>>adoptInstance: fails, the fallback code attempts to use
> >Object>>primitiveChangeClassTo: which also fails, this time with no
> >fallback
> >path. Thus if the primitives for changing the class of an object fail,
> >there is no fallback and it may no longer be possible to write a new binary
> >file (such as new image and changes files).
>
> You would still get a debugger in such case where you would have a chance
> to fix/work around the problem, wouldn't you?

Theoretically yes, but in practice I have abandoned several images after
trying to play around with it. I think what probably is happening is the
changes file gets damaged as a result of not being able to reopen it in
binary mode. I'm not certain that this is the cause, but I seem to end
up with methods that have no source, so that would be the likely reason.

Cog seems to recover more gracefully than interpreter VM in this case,
so it may be that there are contributing issues in the interpreter VM.
I am not really sure, but it does seem like having a #binary method that
could fail without working fallback code is rather dangerous.

>
> >
> >Another way to look at it - if the optimization is worthwhile, maybe it
> >belongs in the Collections package, which already needs to be different
> >for Spur versus V3.
> >
> >For Spur, shouldn't it just be this:
> >
> >ByteString>>asByteArray
> > ByteArray adoptInstance: self
> >
> >I did not really test this, but it should give us the same performance
> >optimization in a more general way.
>
> This would significantly change the behavior, because currently you know
> that a copy will be returned.
> We might as well use a different selector (#toByteArray or #beByteArray),
> but I don't know if that would help V3 images.
>

D'oh! Yes of course you are right.

Dave