Smalltalk › Squeak › Squeak - Dev

References to a removed and reloaded class

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

15 messages Options

Levente Uzonyi-2

References to a removed and reloaded class

Hi,

I found a bug recently which has the following symptoms:
- If you remove a class from the system, which is referenced from a
method, then the method will keep a reference for the class, but the
class will be obsolete.
- If you create a class with the same name, then the old references will
still point to the old, obsolete class instead of the new one.
So there will be two bindings in the system using the same class name, but
pointing to different classes.

In order to fix this bug, I came up with two ideas:
1) Recompile all methods which point to the old class when the class is
removed. This helps, because the class will be undeclared at that point,
so a new binding will be stored in Undeclared and the same binding will be
used when the class is added again. The drawback of this solution is that
it only fixes references in methods and it's slow, since it has to scan
all methods in the system.
2) Create a new object (a datastructure) which points to the binding
weakly and provides fast lookup by class name. Store the binding in this
object when the class is removed and when a class is added, then check
this object for existing bindings (besides Undeclared). This solution is
fast, but more complicated than the previous one. But the same kind of
datastructure can be used to replace Undeclared, so it will not hold onto
bindings unnecessarily and it won't have to be cleared manually.

I uploaded a few days ago System-ul.497.mcz to the Inbox which provides
solution #1. There are ways to improve the performance of SystemNavigation
>> #allCallsOn: by ~20%, so it will take only 20-40ms more to remove a
class from the system, which might be good enough for now.

In 4.5 I'd like to implement solution #2 and replace Undeclared to work
this way too, but I think for 4.4 we can live with solution #1. What do
you think?

Cheers,
Levente

Colin Putney-3

Re: References to a removed and reloaded class

On Thu, Sep 20, 2012 at 11:28 AM, Levente Uzonyi <[hidden email]> wrote:

> In order to fix this bug, I came up with two ideas:
> 1) Recompile all methods which point to the old class when the class is
> removed. This helps, because the class will be undeclared at that point, so
> a new binding will be stored in Undeclared and the same binding will be used
> when the class is added again. The drawback of this solution is that it only
> fixes references in methods and it's slow, since it has to scan all methods
> in the system.

Couldn't we just move the existing binding to Undeclared? That might
be unnecessary if there are no references, but I don't think it would
hurt.

> 2) Create a new object (a datastructure) which points to the binding weakly
> and provides fast lookup by class name. Store the binding in this object
> when the class is removed and when a class is added, then check this object
> for existing bindings (besides Undeclared). This solution is fast, but more
> complicated than the previous one. But the same kind of
> datastructure can be used to replace Undeclared, so it will not hold onto
> bindings unnecessarily and it won't have to be cleared manually.
>
> I uploaded a few days ago System-ul.497.mcz to the Inbox which provides
> solution #1. There are ways to improve the performance of SystemNavigation
>>>
>>> #allCallsOn: by ~20%, so it will take only 20-40ms more to remove a
>
> class from the system, which might be good enough for now.
>
> In 4.5 I'd like to implement solution #2 and replace Undeclared to work this
> way too, but I think for 4.4 we can live with solution #1. What do you
> think?

I'm not sure I follow solution #2, but we'll be addressing this stuff
anyway in 4.5 with Environments. We can certainly make sure this
doesn't crop up again.

Colin

Levente Uzonyi-2

Re: References to a removed and reloaded class

On Thu, 20 Sep 2012, Colin Putney wrote:

> On Thu, Sep 20, 2012 at 11:28 AM, Levente Uzonyi <[hidden email]> wrote:
>
>> In order to fix this bug, I came up with two ideas:
>> 1) Recompile all methods which point to the old class when the class is
>> removed. This helps, because the class will be undeclared at that point, so
>> a new binding will be stored in Undeclared and the same binding will be used
>> when the class is added again. The drawback of this solution is that it only
>> fixes references in methods and it's slow, since it has to scan all methods
>> in the system.
>
> Couldn't we just move the existing binding to Undeclared? That might
> be unnecessary if there are no references, but I don't think it would
> hurt.

I tried that, but ran into another issue related to Undeclared. Class >>
#rename: checks if the new class name is included in Undeclared. And if it
is, it'll #inform: about it. The problem is that Undeclared holds a
strong reference to its bindings, so if we remove a class which has no
references and we move it to Undeclared, then it will linger there, until
someone calls #cleanOutUndeclared. This is the weakness I described in
solution #2.

To work around this we can remove the unreferenced keys from Undeclared
in Class >> #rename:, but it means, that all methods in the system will
be scanned for each renamed class.

>
>> 2) Create a new object (a datastructure) which points to the binding weakly
>> and provides fast lookup by class name. Store the binding in this object
>> when the class is removed and when a class is added, then check this object
>> for existing bindings (besides Undeclared). This solution is fast, but more
>> complicated than the previous one. But the same kind of
>> datastructure can be used to replace Undeclared, so it will not hold onto
>> bindings unnecessarily and it won't have to be cleared manually.
>>
>> I uploaded a few days ago System-ul.497.mcz to the Inbox which provides
>> solution #1. There are ways to improve the performance of SystemNavigation
>>>>
>>>> #allCallsOn: by ~20%, so it will take only 20-40ms more to remove a
>>
>> class from the system, which might be good enough for now.
>>
>> In 4.5 I'd like to implement solution #2 and replace Undeclared to work this
>> way too, but I think for 4.4 we can live with solution #1. What do you
>> think?
>
> I'm not sure I follow solution #2, but we'll be addressing this stuff

The idea is to store className -> binding pairs in a modified
WeakValueDictionary which cleans itself automatically as its values get
garbage collected or in a new kind of weak dictionary which has weak
references to both its keys and values.

> anyway in 4.5 with Environments. We can certainly make sure this
> doesn't crop up again.

I guess there will be a per environment "Undeclared", but it shouldn't
work like the current one (see above why).

I also saw, that Environment uses an IdentityDictionary to hold the
classes instead of a SystemDictionary. It means that it doesn't use
Undeclared at all and it doesn't use the hash advantage implemented in
SystemDictionary, which uses #== and #hash (instead of #identityHash).
Using #hash works, because Symbols are unique. Since calculating the
hash is supported by a primitive, therefore it's pretty fast and the use
of #hash itself reduces the chance for collisions.

Levente

>
> Colin
>
>

Eliot Miranda-2

Re: References to a removed and reloaded class

On Thu, Sep 20, 2012 at 1:45 PM, Levente Uzonyi <[hidden email]> wrote:

On Thu, 20 Sep 2012, Colin Putney wrote:

On Thu, Sep 20, 2012 at 11:28 AM, Levente Uzonyi <[hidden email]> wrote:

In order to fix this bug, I came up with two ideas:
1) Recompile all methods which point to the old class when the class is
removed. This helps, because the class will be undeclared at that point, so
a new binding will be stored in Undeclared and the same binding will be used
when the class is added again. The drawback of this solution is that it only
fixes references in methods and it's slow, since it has to scan all methods
in the system.

Couldn't we just move the existing binding to Undeclared? That might
be unnecessary if there are no references, but I don't think it would
hurt.

I tried that, but ran into another issue related to Undeclared. Class >> #rename: checks if the new class name is included in Undeclared. And if it is, it'll #inform: about it. The problem is that Undeclared holds a strong reference to its bindings, so if we remove a class which has no references and we move it to Undeclared, then it will linger there, until someone calls #cleanOutUndeclared. This is the weakness I described in solution #2.

To work around this we can remove the unreferenced keys from Undeclared in Class >> #rename:, but it means, that all methods in the system will be scanned for each renamed class.

That's a price well worth paying. Traversing the class hierarchy to fid methods is quite quick (much faster than allInstances) and class rename is rare (i.e. a devlopment time activity, not a typical runtime activity). So IMO this is the way to go.

2) Create a new object (a datastructure) which points to the binding weakly
and provides fast lookup by class name. Store the binding in this object
when the class is removed and when a class is added, then check this object
for existing bindings (besides Undeclared). This solution is fast, but more
complicated than the previous one. But the same kind of
datastructure can be used to replace Undeclared, so it will not hold onto
bindings unnecessarily and it won't have to be cleared manually.

I uploaded a few days ago System-ul.497.mcz to the Inbox which provides
solution #1. There are ways to improve the performance of SystemNavigation

#allCallsOn: by ~20%, so it will take only 20-40ms more to remove a

class from the system, which might be good enough for now.

In 4.5 I'd like to implement solution #2 and replace Undeclared to work this
way too, but I think for 4.4 we can live with solution #1. What do you
think?

I'm not sure I follow solution #2, but we'll be addressing this stuff

The idea is to store className -> binding pairs in a modified WeakValueDictionary which cleans itself automatically as its values get garbage collected or in a new kind of weak dictionary which has weak references to both its keys and values.

anyway in 4.5 with Environments. We can certainly make sure this
doesn't crop up again.

I guess there will be a per environment "Undeclared", but it shouldn't work like the current one (see above why).

I also saw, that Environment uses an IdentityDictionary to hold the classes instead of a SystemDictionary. It means that it doesn't use Undeclared at all and it doesn't use the hash advantage implemented in SystemDictionary, which uses #== and #hash (instead of #identityHash). Using #hash works, because Symbols are unique. Since calculating the hash is supported by a primitive, therefore it's pretty fast and the use of #hash itself reduces the chance for collisions.

Levente

Colin

--
best,

Eliot

Levente Uzonyi-2

Re: References to a removed and reloaded class

Thu, 20 Sep 2012, Eliot Miranda wrote:

>
>
> On Thu, Sep 20, 2012 at 1:45 PM, Levente Uzonyi <[hidden email]> wrote:
> On Thu, 20 Sep 2012, Colin Putney wrote:
>
> On Thu, Sep 20, 2012 at 11:28 AM, Levente Uzonyi <[hidden email]> wrote:
>
> In order to fix this bug, I came up with two ideas:
> 1) Recompile all methods which point to the old class when the class is
> removed. This helps, because the class will be undeclared at that point, so
> a new binding will be stored in Undeclared and the same binding will be used
> when the class is added again. The drawback of this solution is that it only
> fixes references in methods and it's slow, since it has to scan all methods
> in the system.
>
>
> Couldn't we just move the existing binding to Undeclared? That might
> be unnecessary if there are no references, but I don't think it would
> hurt.
>
>
> I tried that, but ran into another issue related to Undeclared. Class >> #rename: checks if the new class name is included in Undeclared. And if it is, it'll #inform: about it. The problem is that Undeclared holds a strong
> reference to its bindings, so if we remove a class which has no references and we move it to Undeclared, then it will linger there, until someone calls #cleanOutUndeclared. This is the weakness I described in solution #2.
>
> To work around this we can remove the unreferenced keys from Undeclared in Class >> #rename:, but it means, that all methods in the system will be scanned for each renamed class.
>
>
> That's a price well worth paying. Traversing the class hierarchy to fid methods is quite quick (much faster than allInstances) and class rename is rare (i.e. a devlopment time activity, not a typical runtime activity). So IMO this
> is the way to go.

I guess I wasn't clear, because #allInstances is not used by the solution
I suggested. It does two things:
When a class is removed, then
1) traverse the class hierarchy and search for methods which refer to the
binding
2) recompile the methods found in step 1

The alternative is to
- nil out the value of the binding and move it to Undeclared when the
class is removed
- when Undeclared is accessed from Class >> #rename: (and who knows which
other methods) remove the unreferenced keys from it by traversing the
class hierarchy searching for methods that refer to the binding. If no
such method was found then the binding is removed from Undeclared.

The reason why I don't like the alternative is that we don't know if there
are other methods which use Undeclared in a similar way and therefore must
be rewritten. The only reason I know about Class >> #rename: is because a
test was failing when I moved the binding to Undeclared. It might not be
that hard to find the methods, but it's still extra work to be done. I'll
check it out though, because class renaming is definitely more rare than
class removal.

Levente

Levente Uzonyi-2

Re: References to a removed and reloaded class

On Fri, 21 Sep 2012, Levente Uzonyi wrote:

> Thu, 20 Sep 2012, Eliot Miranda wrote:
>
>>
>>
>> On Thu, Sep 20, 2012 at 1:45 PM, Levente Uzonyi <[hidden email]> wrote:
>> On Thu, 20 Sep 2012, Colin Putney wrote:
>>
>> On Thu, Sep 20, 2012 at 11:28 AM, Levente Uzonyi
>> <[hidden email]> wrote:
>>
>> In order to fix this bug, I came up with two ideas:
>> 1) Recompile all methods which point to the old class
>> when the class is
>> removed. This helps, because the class will be undeclared
>> at that point, so
>> a new binding will be stored in Undeclared and the same
>> binding will be used
>> when the class is added again. The drawback of this
>> solution is that it only
>> fixes references in methods and it's slow, since it has
>> to scan all methods
>> in the system.
>>
>>
>> Couldn't we just move the existing binding to Undeclared? That
>> might
>> be unnecessary if there are no references, but I don't think it
>> would
>> hurt.
>>
>>
>> I tried that, but ran into another issue related to Undeclared. Class >>
>> #rename: checks if the new class name is included in Undeclared. And if it
>> is, it'll #inform: about it. The problem is that Undeclared holds a strong
>> reference to its bindings, so if we remove a class which has no references
>> and we move it to Undeclared, then it will linger there, until someone
>> calls #cleanOutUndeclared. This is the weakness I described in solution #2.
>>
>> To work around this we can remove the unreferenced keys from Undeclared in
>> Class >> #rename:, but it means, that all methods in the system will be
>> scanned for each renamed class.
>>
>>
>> That's a price well worth paying. Traversing the class hierarchy to fid
>> methods is quite quick (much faster than allInstances) and class rename is
>> rare (i.e. a devlopment time activity, not a typical runtime activity). So
>> IMO this
>> is the way to go.
>
> I guess I wasn't clear, because #allInstances is not used by the solution I
> suggested. It does two things:
> When a class is removed, then
> 1) traverse the class hierarchy and search for methods which refer to the
> binding
> 2) recompile the methods found in step 1
>
> The alternative is to - nil out the value of the binding and move it to
> Undeclared when the class is removed

There's a test which expects that a removed class's method's class is not
nil, so setting the value of the binding to nil doesn't work.

Levente

> - when Undeclared is accessed from Class >> #rename: (and who knows which
> other methods) remove the unreferenced keys from it by traversing the class
> hierarchy searching for methods that refer to the binding. If no such method
> was found then the binding is removed from Undeclared.
>
> The reason why I don't like the alternative is that we don't know if there
> are other methods which use Undeclared in a similar way and therefore must be
> rewritten. The only reason I know about Class >> #rename: is because a test
> was failing when I moved the binding to Undeclared. It might not be that hard
> to find the methods, but it's still extra work to be done. I'll check it out
> though, because class renaming is definitely more rare than class removal.
>
>
> Levente

Eliot Miranda-2

Re: References to a removed and reloaded class

In reply to this post by Levente Uzonyi-2

On Thu, Sep 20, 2012 at 7:37 PM, Levente Uzonyi <[hidden email]> wrote:

Thu, 20 Sep 2012, Eliot Miranda wrote:

On Thu, Sep 20, 2012 at 1:45 PM, Levente Uzonyi <[hidden email]> wrote:
On Thu, 20 Sep 2012, Colin Putney wrote:

On Thu, Sep 20, 2012 at 11:28 AM, Levente Uzonyi <[hidden email]> wrote:

In order to fix this bug, I came up with two ideas:
1) Recompile all methods which point to the old class when the class is
removed. This helps, because the class will be undeclared at that point, so
a new binding will be stored in Undeclared and the same binding will be used
when the class is added again. The drawback of this solution is that it only
fixes references in methods and it's slow, since it has to scan all methods
in the system.

Couldn't we just move the existing binding to Undeclared? That might
be unnecessary if there are no references, but I don't think it would
hurt.

I tried that, but ran into another issue related to Undeclared. Class >> #rename: checks if the new class name is included in Undeclared. And if it is, it'll #inform: about it. The problem is that Undeclared holds a strong
reference to its bindings, so if we remove a class which has no references and we move it to Undeclared, then it will linger there, until someone calls #cleanOutUndeclared. This is the weakness I described in solution #2.

To work around this we can remove the unreferenced keys from Undeclared in Class >> #rename:, but it means, that all methods in the system will be scanned for each renamed class.

That's a price well worth paying. Traversing the class hierarchy to fid methods is quite quick (much faster than allInstances) and class rename is rare (i.e. a devlopment time activity, not a typical runtime activity). So IMO this
is the way to go.

I guess I wasn't clear, because #allInstances is not used by the solution I suggested. It does two things:
When a class is removed, then
1) traverse the class hierarchy and search for methods which refer to the binding
2) recompile the methods found in step 1

The alternative is to - nil out the value of the binding and move it to Undeclared when the class is removed
- when Undeclared is accessed from Class >> #rename: (and who knows which other methods) remove the unreferenced keys from it by traversing the class hierarchy searching for methods that refer to the binding. If no such method was found then the binding is removed from Undeclared.

The reason why I don't like the alternative is that we don't know if there are other methods which use Undeclared in a similar way and therefore must be rewritten.

I don't understand. How else can one use Undeclared? Any binding in a method which is undefined should be in Undefined. Hence when a class is removed or when a class is renamed the binding should be moved to Undeclared and left there, but when a class is renamed we can offer the user the opportunity to redefine methods that refer to the class with the old name (edit the methods to change the name to the new name).

One might think that on rename the binding can be reused and its key changed. But this would leave the methods referring to the new class but with their source naming the old name, hence on recompile the methods would now refer to the binding ... in Undeclared. So the only correct thing for me is to move the binding to Undeclared, and, if interactive, offer up either a list browser of all references to the class, or an option to edit those methods automatically.

The only reason I know about Class >> #rename: is because a test was failing when I moved the binding to Undeclared. It might not be that hard to find the methods, but it's still extra work to be done. I'll check it out though, because class renaming is definitely more rare than class removal.

Levente

--
best,

Eliot

Levente Uzonyi-2

Re: References to a removed and reloaded class

I guess we're going off the track with class renaming. The problem I'd
like to solve is:
If we remove a class and then add a class to the system with the same
name, then the methods which were referring to the class before the
removal are still referring to the same old class. This means that two
bindings will coexist in the system with the same class name, but pointing
to different classes.

How is class renaming related to this?
One of the possible solutions is to move the removed class's binding to
Undeclared. When you rename a class, then Class >> #rename: will check if
there's a binding with the new class name in Undeclared. The problem here
is that Undeclared doesn't release bindings automatically (when they are
not referenced by any methods anymore), so bindings gone long ago are
still lingering in Undeclared, therefore Class >> #rename: (and also Trait
>> #rename:) will #inform: the user about it, which breaks one of the
tests.

But there's another problem with this solution:
There's a test which tests if a removed class's method is still
referencing to the class. So replacing the value of the binding with nil
is not possible.
If we don't replace the value with nil, then the binding isn't really
undeclared. So there will be bindings in Undeclared with non-nil value,
which might have side effects in the system.

How does the solution I suggested work:
- when a class is removed, then its binding will be thrown away
- all methods referring to the class are recompiled. This means that if
any such method exists, then a new binding will be created in Undeclared
and all methods will use that new binding.
The drawback is that mass class removal will be slower. Removing a single
class will have no noticable difference for the user (20-40ms+the
recompilation time).

For completeness, here's a list of problems I don't want to solve now:
- Undeclared holding onto bindings forever
- Undeclared introducing Associations for classes instead of
ReadOnlyVariableBindings
- how class renaming works

On Fri, 21 Sep 2012, Eliot Miranda wrote:

> I don't understand. How else can one use Undeclared? Any binding in a method which is undefined should be in Undefined. Hence when a class is removed or when a class is renamed the binding should be moved to Undeclared and left

Some code uses #at:put: only, others try #includesKey: and #at:ifAbsent:
which might not work as expected (see above).

> there, but when a class is renamed we can offer the user the opportunity to redefine methods that refer to the class with the old name (edit the methods to change the name to the new name).

Should a removed class's binding - which is still pointing to the class
- be in Undeclared? Will it cause any issues that the binding is still
pointing to the class?

Levente

>
> One might think that on rename the binding can be reused and its key changed. But this would leave the methods referring to the new class but with their source naming the old name, hence on recompile the methods would now refer to
> the binding ... in Undeclared. So the only correct thing for me is to move the binding to Undeclared, and, if interactive, offer up either a list browser of all references to the class, or an option to edit those methods
> automatically.
>
>
> The only reason I know about Class >> #rename: is because a test was failing when I moved the binding to Undeclared. It might not be that hard to find the methods, but it's still extra work to be done. I'll check it out
> though, because class renaming is definitely more rare than class removal.
>
>
> Levente
>
>
>
>
>
> --
> best,Eliot
>
>
>

Eliot Miranda-2

Re: References to a removed and reloaded class

On Fri, Sep 21, 2012 at 1:13 PM, Levente Uzonyi <[hidden email]> wrote:

I guess we're going off the track with class renaming. The problem I'd like to solve is:
If we remove a class and then add a class to the system with the same name, then the methods which were referring to the class before the removal are still referring to the same old class. This means that two bindings will coexist in the system with the same class name, but pointing to different classes.

Let me get this clear. Surely the methods should still refer to the class named X after removing and adding back a class named X right? That can (and should) be accomplished by moving the binding to Undeclared on removing it and usiong the binding when declaring the filed-in class. This is the way the system used to work and should continue to work.

How is class renaming related to this?
One of the possible solutions is to move the removed class's binding to Undeclared. When you rename a class, then Class >> #rename: will check if there's a binding with the new class name in Undeclared. The problem here is that Undeclared doesn't release bindings automatically (when they are not referenced by any methods anymore), so bindings gone long ago are still lingering in Undeclared, therefore Class >> #rename: (and also Trait

#rename:) will #inform: the user about it, which breaks one of the

tests.

But that's not a problem. The set-up for these tests can (and should) prune unreferenced bindings from Undeclared, either before running or before complaining of there being bindings in Undeclared.

But there's another problem with this solution:
There's a test which tests if a removed class's method is still referencing to the class. So replacing the value of the binding with nil is not possible.

But that's a bogus test. Of course the binding's value should be set to nil. Either the class has been removed or renamed. The only thing that makes sense with remove is to set it to nil. Rename is more difficult. It makes sense to leave it pointing to the new class if the system is running interactively while the user edits methods to refer to the class under the new name. Tricky.

If we don't replace the value with nil, then the binding isn't really undeclared. So there will be bindings in Undeclared with non-nil value, which might have side effects in the system.

How does the solution I suggested work:
- when a class is removed, then its binding will be thrown away
- all methods referring to the class are recompiled. This means that if any such method exists, then a new binding will be created in Undeclared and all methods will use that new binding.
The drawback is that mass class removal will be slower. Removing a single class will have no noticable difference for the user (20-40ms+the recompilation time).

I don't see that this is usefully different from the existing behaviour. There seem to me to be two cases, one is renaming a class that is key to the system functioning, say some class in the compiler. If the binding is set to nil then the system will break on trying to compile edited versions of the methods that refer to the class. That's tough, but one has to accept it. The work-around is to clone the class under the new name (e.g. file-out, edit, file back in) and move all subclasses under the new class, and then remove the old class. The other case is where the class isn't key and the system keeps running. In either case, keeping the exsting binding and setting its value to nil is the correct behaviour; it maintains system integrity w.r.t. bindings.

For completeness, here's a list of problems I don't want to solve now:
- Undeclared holding onto bindings forever

non-problem. tests that test Undeclared must prune unreferenced bindings from Undeclared. The cost is small. F**ing up Undeclared to make the tests run fast is a case of the tail wagging the dog.

- Undeclared introducing Associations for classes instead of ReadOnlyVariableBindings

Again if the binding is preserved this won't happen.

- how class renaming works

has it changed?

On Fri, 21 Sep 2012, Eliot Miranda wrote:

I don't understand. How else can one use Undeclared? Any binding in a method which is undefined should be in Undefined. Hence when a class is removed or when a class is renamed the binding should be moved to Undeclared and left

Some code uses #at:put: only, others try #includesKey: and #at:ifAbsent: which might not work as expected (see above).

there, but when a class is renamed we can offer the user the opportunity to redefine methods that refer to the class with the old name (edit the methods to change the name to the new name).

Should a removed class's binding - which is still pointing to the class - be in Undeclared? Will it cause any issues that the binding is still pointing to the class?

Levente

One might think that on rename the binding can be reused and its key changed. But this would leave the methods referring to the new class but with their source naming the old name, hence on recompile the methods would now refer to
the binding ... in Undeclared. So the only correct thing for me is to move the binding to Undeclared, and, if interactive, offer up either a list browser of all references to the class, or an option to edit those methods
automatically.

The only reason I know about Class >> #rename: is because a test was failing when I moved the binding to Undeclared. It might not be that hard to find the methods, but it's still extra work to be done. I'll check it out
though, because class renaming is definitely more rare than class removal.

Levente

--
best,Eliot

--
best,

Eliot

Eliot Miranda-2

Re: References to a removed and reloaded class

In reply to this post by Levente Uzonyi-2

oops, missed your questions right at the end.

On Fri, Sep 21, 2012 at 1:13 PM, Levente Uzonyi <[hidden email]> wrote:

I guess we're going off the track with class renaming. The problem I'd like to solve is:
If we remove a class and then add a class to the system with the same name, then the methods which were referring to the class before the removal are still referring to the same old class. This means that two bindings will coexist in the system with the same class name, but pointing to different classes.

How is class renaming related to this?
One of the possible solutions is to move the removed class's binding to Undeclared. When you rename a class, then Class >> #rename: will check if there's a binding with the new class name in Undeclared. The problem here is that Undeclared doesn't release bindings automatically (when they are not referenced by any methods anymore), so bindings gone long ago are still lingering in Undeclared, therefore Class >> #rename: (and also Trait

#rename:) will #inform: the user about it, which breaks one of the

tests.

But there's another problem with this solution:
There's a test which tests if a removed class's method is still referencing to the class. So replacing the value of the binding with nil is not possible.
If we don't replace the value with nil, then the binding isn't really undeclared. So there will be bindings in Undeclared with non-nil value, which might have side effects in the system.

How does the solution I suggested work:
- when a class is removed, then its binding will be thrown away
- all methods referring to the class are recompiled. This means that if any such method exists, then a new binding will be created in Undeclared and all methods will use that new binding.
The drawback is that mass class removal will be slower. Removing a single class will have no noticable difference for the user (20-40ms+the recompilation time).

For completeness, here's a list of problems I don't want to solve now:
- Undeclared holding onto bindings forever
- Undeclared introducing Associations for classes instead of ReadOnlyVariableBindings
- how class renaming works

On Fri, 21 Sep 2012, Eliot Miranda wrote:

I don't understand. How else can one use Undeclared? Any binding in a method which is undefined should be in Undefined. Hence when a class is removed or when a class is renamed the binding should be moved to Undeclared and left

Some code uses #at:put: only, others try #includesKey: and #at:ifAbsent: which might not work as expected (see above).

OK, then fix the code. at:put: seems right. at:ifAbsentPut: will create the wring kinds of bindings right?

there, but when a class is renamed we can offer the user the opportunity to redefine methods that refer to the class with the old name (edit the methods to change the name to the new name).

Should a removed class's binding - which is still pointing to the class - be in Undeclared? Will it cause any issues that the binding is still pointing to the class?

The removed class's binding must be in Undeclared, but its value must be nil. If not, the class will not be reclaimed. If there are instances of the class that's a different issue and the class will be marked as obsolete and only referred to by its instances (as was always the case).

Levente

One might think that on rename the binding can be reused and its key changed. But this would leave the methods referring to the new class but with their source naming the old name, hence on recompile the methods would now refer to
the binding ... in Undeclared. So the only correct thing for me is to move the binding to Undeclared, and, if interactive, offer up either a list browser of all references to the class, or an option to edit those methods
automatically.

The only reason I know about Class >> #rename: is because a test was failing when I moved the binding to Undeclared. It might not be that hard to find the methods, but it's still extra work to be done. I'll check it out
though, because class renaming is definitely more rare than class removal.

Levente

--
best,Eliot

--
best,

Eliot

Colin Putney-3

Re: References to a removed and reloaded class

Levente wrote:

>> Some code uses #at:put: only, others try #includesKey: and #at:ifAbsent:
>> which might not work as expected (see above).

Eliot responded:

> OK, then fix the code. at:put: seems right. at:ifAbsentPut: will create
> the wring kinds of bindings right?

I ran into this too. I'd like to fix the code by creating a new
protocol for dealing with bindings explicitly. Environment would
implement this protocol, and client code will use the new protocol
rather than manipulating dictionaries directly. So #at:put: would
become #bind:to: and so forth, with only a limited subset of the
functionality that dictionaries provide. Then we can add messages for
higher-level operations like removing or renaming classes.

Colin

Levente Uzonyi-2

Re: References to a removed and reloaded class

In reply to this post by Eliot Miranda-2

On Fri, 21 Sep 2012, Eliot Miranda wrote:

>
>
> On Fri, Sep 21, 2012 at 1:13 PM, Levente Uzonyi <[hidden email]> wrote:
> I guess we're going off the track with class renaming. The problem I'd like to solve is:
> If we remove a class and then add a class to the system with the same name, then the methods which were referring to the class before the
> removal are still referring to the same old class. This means that two bindings will coexist in the system with the same class name, but
> pointing to different classes.
>
>
> Let me get this clear. Surely the methods should still refer to the class named X after removing and adding back a class named X right? That can (and
> should) be accomplished by moving the binding to Undeclared on removing it and usiong the binding when declaring the filed-in class. This is the way the
> system used to work and should continue to work.

Right, this is how it should work, but the problem is that it's not
working that way now (and I think it didn't in the past few years).

>
>
> How is class renaming related to this?
> One of the possible solutions is to move the removed class's binding to Undeclared. When you rename a class, then Class >> #rename: will check
> if there's a binding with the new class name in Undeclared. The problem here is that Undeclared doesn't release bindings automatically (when
> they are not referenced by any methods anymore), so bindings gone long ago are still lingering in Undeclared, therefore Class >> #rename: (and
> also Trait
> #rename:) will #inform: the user about it, which breaks one of the
>
> tests.
>
>
> But that's not a problem. The set-up for these tests can (and should) prune unreferenced bindings from Undeclared, either before running or before
> complaining of there being bindings in Undeclared.

It's not the tests which are complaining, but Class >> #rename:. If you
patch SystemDictionary >> #forgetClass:logged: to move the binding to
Undeclared and you run SystemChangeFileTest >> #testClassRenamed then
when the code gets to Class >> #rename: the second time, then at the line

(Undeclared includesKey: newName)

there will be a binding in Undeclared which points to an unreferenced,
already removed class.

Class >> #rename: aString
"The new name of the receiver is the argument, aString."

| oldName newName |
(newName := aString asSymbol) = (oldName := self name)
ifTrue: [^ self].
(self environment includesKey: newName)
ifTrue: [^ self error: newName , ' already exists'].
(Undeclared includesKey: newName)
ifTrue: [self inform: 'There are references to, ' , aString printString , 'from Undeclared. Check them after this change.'].
name := newName.
self environment renameClass: self from: oldName

The only solution I see to the issue is to perform #removeUnreferencedKeys
before sending #includesKey: (or we can be tricky, and do it only if we
find that the binding is there and recheck after to see if it's really
there or not). And yes, I think it's a problem that Undeclared is holding
onto that reference, because every place which uses #includesKey: or
#at:ifAbsent: might get false information.

>
>
> But there's another problem with this solution:
> There's a test which tests if a removed class's method is still referencing to the class. So replacing the value of the binding with nil is
> not possible.
>
>
> But that's a bogus test. Of course the binding's value should be set to nil. Either the class has been removed or renamed. The only thing that makes

So a CompiledMethod should return nil for #methodClass if its class was
removed instead of the obsolete Class object? (see CompiledMethodTest >>
#testMethodClass)

> sense with remove is to set it to nil. Rename is more difficult. It makes sense to leave it pointing to the new class if the system is running
> interactively while the user edits methods to refer to the class under the new name. Tricky.
>
> If we don't replace the value with nil, then the binding isn't really undeclared. So there will be bindings in Undeclared with non-nil value,
> which might have side effects in the system.
>
> How does the solution I suggested work:
> - when a class is removed, then its binding will be thrown away
> - all methods referring to the class are recompiled. This means that if any such method exists, then a new binding will be created in
> Undeclared and all methods will use that new binding.
> The drawback is that mass class removal will be slower. Removing a single class will have no noticable difference for the user (20-40ms+the
> recompilation time).
>
>
> I don't see that this is usefully different from the existing behaviour. There seem to me to be two cases, one is renaming a class that is key to the
> system functioning, say some class in the compiler. If the binding is set to nil then the system will break on trying to compile edited versions of the
> methods that refer to the class. That's tough, but one has to accept it. The work-around is to clone the class under the new name (e.g. file-out, edit,
> file back in) and move all subclasses under the new class, and then remove the old class. The other case is where the class isn't key and the system
> keeps running. In either case, keeping the exsting binding and setting its value to nil is the correct behaviour; it maintains system integrity w.r.t.
> bindings.

My suggested solution only applies to class removal, I don't want to touch
class renaming.

>
>
> For completeness, here's a list of problems I don't want to solve now:
> - Undeclared holding onto bindings forever
>
>
> non-problem. tests that test Undeclared must prune unreferenced bindings from Undeclared. The cost is small. F**ing up Undeclared to make the tests run
> fast is a case of the tail wagging the dog.

It's not about the tests at all, but the system and the way Undeclared is
being used (see above). I had no intention to make any test faster, I just
want them to pass. :)
And I think the way Undeclared is implemented is just wrong.

>
> - Undeclared introducing Associations for classes instead of ReadOnlyVariableBindings
>
>
> Again if the binding is preserved this won't happen.

Using #at:put: will create an Association, because Undeclared is just and
IdentityDictionary.

>
> - how class renaming works
>
>
> has it changed?

I don't think so and I don't really want to deal with it right now, Colin
will do it in 4.5 anyway.

Levente

Levente Uzonyi-2

Re: References to a removed and reloaded class

In reply to this post by Eliot Miranda-2

On Fri, 21 Sep 2012, Eliot Miranda wrote:

> oops, missed your questions right at the end.
>
> On Fri, Sep 21, 2012 at 1:13 PM, Levente Uzonyi <[hidden email]> wrote:
> I guess we're going off the track with class renaming. The problem I'd like to solve is:
> If we remove a class and then add a class to the system with the same name, then the methods which were referring to the class before the
> removal are still referring to the same old class. This means that two bindings will coexist in the system with the same class name, but
> pointing to different classes.
>
> How is class renaming related to this?
> One of the possible solutions is to move the removed class's binding to Undeclared. When you rename a class, then Class >> #rename: will check
> if there's a binding with the new class name in Undeclared. The problem here is that Undeclared doesn't release bindings automatically (when
> they are not referenced by any methods anymore), so bindings gone long ago are still lingering in Undeclared, therefore Class >> #rename: (and
> also Trait
> #rename:) will #inform: the user about it, which breaks one of the
>
> tests.
>
> But there's another problem with this solution:
> There's a test which tests if a removed class's method is still referencing to the class. So replacing the value of the binding with nil is
> not possible.
> If we don't replace the value with nil, then the binding isn't really undeclared. So there will be bindings in Undeclared with non-nil value,
> which might have side effects in the system.
>
> How does the solution I suggested work:
> - when a class is removed, then its binding will be thrown away
> - all methods referring to the class are recompiled. This means that if any such method exists, then a new binding will be created in
> Undeclared and all methods will use that new binding.
> The drawback is that mass class removal will be slower. Removing a single class will have no noticable difference for the user (20-40ms+the
> recompilation time).
>
> For completeness, here's a list of problems I don't want to solve now:
> - Undeclared holding onto bindings forever
> - Undeclared introducing Associations for classes instead of ReadOnlyVariableBindings
> - how class renaming works
>
> On Fri, 21 Sep 2012, Eliot Miranda wrote:
>
> I don't understand. How else can one use Undeclared? Any binding in a method which is undefined should be in Undefined. Hence
> when a class is removed or when a class is renamed the binding should be moved to Undeclared and left
>
>
> Some code uses #at:put: only, others try #includesKey: and #at:ifAbsent: which might not work as expected (see above).
>
>
> OK, then fix the code. at:put: seems right. at:ifAbsentPut: will create the wring kinds of bindings right?

#at:put: is also wrong if the key is not in Undeclared yet, because it
will create Associations. #includesKey: and #at:ifAbsent: will return
false information if the binding is only referenced from Undeclared.

>
>
>
> there, but when a class is renamed we can offer the user the opportunity to redefine methods that refer to the class with the old
> name (edit the methods to change the name to the new name).
>
>
> Should a removed class's binding - which is still pointing to the class - be in Undeclared? Will it cause any issues that the binding is still
> pointing to the class?
>
>
> The removed class's binding must be in Undeclared, but its value must be nil. If not, the class will not be reclaimed. If there are instances of the
> class that's a different issue and the class will be marked as obsolete and only referred to by its instances (as was always the case).

Okay.

Levente

>
>
>
>
>
> Levente
>
>
> One might think that on rename the binding can be reused and its key changed. But this would leave the methods referring to the
> new class but with their source naming the old name, hence on recompile the methods would now refer to
> the binding ... in Undeclared. So the only correct thing for me is to move the binding to Undeclared, and, if interactive, offer
> up either a list browser of all references to the class, or an option to edit those methods
> automatically.
>
>
> The only reason I know about Class >> #rename: is because a test was failing when I moved the binding to Undeclared. It
> might not be that hard to find the methods, but it's still extra work to be done. I'll check it out
> though, because class renaming is definitely more rare than class removal.
>
>
> Levente
>
>
>
>
>
> --
> best,Eliot
>
>
>
>
>
>
>
>
> --
> best,Eliot
>
>
>

Levente Uzonyi-2

Re: References to a removed and reloaded class

In reply to this post by Colin Putney-3

On Fri, 21 Sep 2012, Colin Putney wrote:

> Levente wrote:
>
>>> Some code uses #at:put: only, others try #includesKey: and #at:ifAbsent:
>>> which might not work as expected (see above).
>
> Eliot responded:
>
>> OK, then fix the code. at:put: seems right. at:ifAbsentPut: will create
>> the wring kinds of bindings right?
>
> I ran into this too. I'd like to fix the code by creating a new
> protocol for dealing with bindings explicitly. Environment would
> implement this protocol, and client code will use the new protocol
> rather than manipulating dictionaries directly. So #at:put: would
> become #bind:to: and so forth, with only a limited subset of the
> functionality that dictionaries provide. Then we can add messages for
> higher-level operations like removing or renaming classes.

Right, that's how it should be done.

Levente

>
> Colin
>
>

Levente Uzonyi-2

Re: References to a removed and reloaded class

Hi,

I uploaded a new version of my fix to the Inbox (System-ul.498). The way
it works is similar to my previous attempt - the difference is mainly in
performance. There's no need to recompile methods in this version and the
search for referrers is also faster.
Another additional performance improvement is in Traits-ul.290. It changes
the way Traits are iterated over. Since it makes SystemNavigation >>
#allBehaviorsDo: faster, therefore it improves the performance of many
SystemNavigation methods, including #allCallsOn:.
If there will be no objections, then I'll push these packages to the Trunk
in a day or two.

Cheers,
Levente