Should removed classes become Undeclared?


Bert Freudenberg-3
If a class is still referenced in code, then removed, and later filed  
back in, the former references do not point to the newly filed-in  
class, but to the obsolete former class.

This breaks (at least) the unloading and re-loading of Monticello  
packages. You have to manually recompile all methods that reference  
the class.
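
A minimal sketch of how the stale reference can be reproduced, assuming a Squeak image of this era. The names Victim, Holder and answerVictim are made up for illustration, and later images may behave differently:

```smalltalk
"Hypothetical scratch classes; nothing here is part of the base image."
Object subclass: #Victim
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Scratch-ObsoleteDemo'.
Object subclass: #Holder
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Scratch-ObsoleteDemo'.

"Holder>>answerVictim keeps a literal reference to the binding of Victim."
(Smalltalk at: #Holder) compile: 'answerVictim
	^ Victim'.

"Remove the class, then define it again (standing in for a file-in)."
(Smalltalk at: #Victim) removeFromSystem.
Object subclass: #Victim
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Scratch-ObsoleteDemo'.

"Compare what the old method answers with the class now registered under that name."
((Smalltalk at: #Holder) new perform: #answerVictim) == (Smalltalk at: #Victim)
```

In an affected image the last expression should answer false until Holder>>answerVictim is recompiled.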

Would moving a class to Undeclared be a good fix for that? Or do we  
need a weak registry for removed classes to be able to efficiently  
fix up the references later?

Btw, there are two methods for detecting this situation:

        SystemNavigation default browseObsoleteMethodReferences

        SystemNavigation default browseObsoleteReferences

One is much faster than the other, yet both give identical results for
me. Do we need both?

- Bert -



Re: Should removed classes become Undeclared?

Bert Freudenberg-3
On 12.02.2006, at 14:55, Bert Freudenberg wrote:

> If a class is still referenced in code, then removed, and later  
> filed back in, the former references do not point to the newly  
> filed-in class, but to the obsolete former class.

On a related note: The PointerFinder (aka 'chase pointers' menu item)
does not find these references. I just submitted a fix:

        http://bugs.impara.de/view.php?id=2715

Hopefully now the PointerFinder will never ever return with empty  
hands again ;-)

- Bert -



Re: Should removed classes become Undeclared?

Nicolas Cellier-3
In reply to this post by Bert Freudenberg-3
On Sunday 12 February 2006 14:55, Bert Freudenberg wrote:

> If a class is still referenced in code, then removed, and later filed
> back in, the former references do not point to the newly filed-in
> class, but to the obsolete former class.
>
> This breaks (at least) the unloading and re-loading of Monticello
> packages. You have to manually recompile all methods that reference
> the class.
>
> Would moving a class to Undeclared be a good fix for that? Or do we
> need a weak registry for removed classes to be able to efficiently
> fix up the references later?
>
> - Bert -

That's what is done in VW. Removed classes are moved to Undeclared
(in fact, this applies to any entry removed from any namespace, not only
Smalltalk).
In VW, the association value is nilled out when moved to Undeclared.

If you want the association preserved, the inverse move must be done
when loading a class: Undeclared keys must be checked and the association
transferred to the SmalltalkEnvironment.

I think this would solve most of your Monticello problem.
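
The move and the inverse move can be sketched with two plain Dictionaries standing in for Smalltalk and Undeclared. This is only an illustration, not the VW or Squeak implementation; it assumes a Squeak-style Dictionary that stores the Association objects themselves, and the variable names are invented:

```smalltalk
| globals undeclared binding heldByMethod |
globals := Dictionary new.      "stands in for the Smalltalk dictionary"
undeclared := Dictionary new.   "stands in for Undeclared"

"Define C. A method that mentions C holds this very association as a literal."
binding := #C -> 'C, first version'.
globals add: binding.
heldByMethod := binding.

"Remove C: park the association in undeclared and nil its value."
globals removeKey: #C.
binding value: nil.
undeclared add: binding.

"Load C again: move the same association back and point it at the new class."
binding := undeclared associationAt: #C.
undeclared removeKey: #C.
binding value: 'C, second version'.
globals add: binding.

heldByMethod value.                          "'C, second version'"
heldByMethod == (globals associationAt: #C). "true"
```

Because the association object itself travels back and forth, the old method sees the new class without being recompiled.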

But this could also have tricky results in a future (?) multi-namespace
environment.
Suppose NameSpace A has a class C.
NameSpace B also has a class C.

If you remove both classes A.C and B.C, then load class C again, what happens?
Should every namespace have its own Undeclared?
Or should Undeclared hold the same key several times (no longer a Dictionary,
but a Set of Associations...)?
In VW, I do not know how they handle this case...

Even without namespaces, there might be other tricky cases, because Undeclared can
also contain references to a removed inst var or class var...

Making the whole Undeclared dictionary weak sounds like a good idea to
me. Unreferenced entries would be garbage-collected automatically. What do you
think of that?



Re: Should removed classes become Undeclared?

Bert Freudenberg-3

On 12.02.2006, at 19:00, nicolas cellier wrote:

> On Sunday 12 February 2006 14:55, Bert Freudenberg wrote:
>> If a class is still referenced in code, then removed, and later filed
>> back in, the former references do not point to the newly filed-in
>> class, but to the obsolete former class.
>>
>> This breaks (at least) the unloading and re-loading of Monticello
>> packages. You have to manually recompile all methods that reference
>> the class.
>>
>> Would moving a class to Undeclared be a good fix for that? Or do we
>> need a weak registry for removed classes to be able to efficiently
>> fix up the references later?
>>
>> - Bert -
>
> the association value is nilled out when moved to Undeclared.

Ah, that's a good solution. No need to hold onto the obsolete  
classes, or is there?

Maybe if we want to migrate instances later?

> If you want the association preserved, inverse move operation must  
> be done
> when loading a class, Undeclared keys must be checked and association
> transferred to the SmalltalkEnvironment.

This already happens in the file-in code.

> But this could also have tricky results in a future (?) multi-namespace
> environment.
> Suppose NameSpace A has a class C.
> NameSpace B also has a class C.
>
> If you remove both classes A.C and B.C, then load class C again,
> what happens?

Nothing - the latest proposal for Squeak namespaces was to just use
#A::C and #B::C as a class name. Nice and clean.

> Making the whole Undeclared dictionary weak sounds like a good idea
> to me. Unreferenced entries would be garbage-collected automatically.
> What do you think of that?

Sounds good to me ... unless there is code which relies on  
temporarily moving stuff to Undeclared. I wouldn't rule that one out ;-)

- Bert -



Re: Should removed classes become Undeclared?

Göran Krampe
Bert Freudenberg <[hidden email]> wrote:
[SNIP]

> > But this could also have tricky results in a future (?) multi-namespace
> > environment.
> > Suppose NameSpace A has a class C.
> > NameSpace B also has a class C.
> >
> > If you remove both classes A.C and B.C, then load class C again,
> > what happens?
>
> Nothing - the latest proposal for Squeak namespaces was to just use  
> #A::C and #B::C as a class name. Nice and clean.

Perhaps I should dust off that code once again. I still feel it had lots
of nice properties (simple, backwards compatible, no need for new
fileout formats, tools still work etc).

regards, Göran


Re: Should removed classes become Undeclared?

Bert Freudenberg-3

On 13.02.2006, at 09:06, [hidden email] wrote:

> Bert Freudenberg <[hidden email]> wrote:
> [SNIP]
>>> But this could also have tricky results in a future (?) multi-namespace
>>> environment.
>>> Suppose NameSpace A has a class C.
>>> NameSpace B also has a class C.
>>>
>>> If you remove both classes A.C and B.C, then load class C again,
>>> what happens?
>>
>> Nothing - the latest proposal for Squeak namespaces was to just use
>> #A::C and #B::C as a class name. Nice and clean.
>
> Perhaps I should dust off that code once again. I still feel it had  
> lots
> of nice properties (simple, backwards compatible, no need for new
> fileout formats, tools still work etc).

Please do :)

- Bert -



Bug: Use of == for arithmetic equality

Dan Ingalls-2
In reply to this post by Bert Freudenberg-3
I've been playing around with a new VM (heh, heh) which, for a while,
happened not to intern (ie force unique instances of) SmallIntegers.
In this case the use of == to mean arithmetic equality will not work
properly.  In my opinion, all such occurrences in the system should
be eliminated ASAP;  == is not an arithmetic compare in any Smalltalk
I know of.  While it may work with small constants, it is simply
wrong, and an especially bad example for newbies to see.  Besides
failing in certain interpreters, it will fail in Squeak itself if the
integers are not small.

I regret that I don't have time to fix these right now.  However, if
there is a well-intentioned soul out there, he or she will perhaps
find the method below to be quite useful.  It found 165 methods in my
system with this pattern.

Hope this helps.

        - Dan
-----------------------------------------------

<CompiledMethod>scanForEqSmallConstant
        "Answer whether the receiver contains the pattern <expression> == <constant>,
        where constant is -1, 0, 1, or 2..."

        | scanner |
        scanner := InstructionStream on: self.
        "Bytecodes 116-119 push the constants -1, 0, 1 and 2; bytecode 198 is the send of #==."
        ^ scanner scanFor: [:instr |
                (instr between: 116 and: 119)
                        and: [scanner followingByte = 198]]

"
SystemNavigation new browseAllSelect: [:m | m scanForEqSmallConstant]
"


Re: Bug: Use of == for arithmetic equality

Marcus Denker

On 13.02.2006, at 18:37, Dan Ingalls wrote:

> I've been playing around with a new VM (heh, heh) which, for a  
> while, happened not to intern (ie force unique instances of)  
> SmallIntegers. In this case the use of == to mean arithmetic  
> equality will not work properly.  In my opinion, all such  
> occurrences in the system should be eliminated ASAP;  == is not an  
> arithmetic compare in any Smalltalk I know of.  While it may work  
> with small constants, it is simply wrong, and an especially bad  
> example for newbies to see.  Besides failing in certain  
> interpreters, it will fail in Squeak itself if the integers are not  
> small.
>
> I regret that I don't have time to fix these right now.  However,  
> if there is a well-intentioned soul out there, he or she will  
> perhaps find the method below to be quite useful.  It found 165  
> methods in my system with this pattern.
>

The interesting thing is that a quite large percentage of those come  
from the beloved

  someCollection size == 0 ifTrue: []

pattern that lots of people like so much... "calling isEmpty is too  
slow" they will tell you,
(and ifEmpty: is *really* evil). As the main objective is speed, they  
of course don't use #=.

For the newbies: Do not optimize for speed before you have proven
that it makes sense,
and then *document* the hack. Using a hack by default because it "may
be too slow" is
not a good idea...
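
For comparison, a small sketch of the same emptiness test written without #== (the collection here is just a stand-in):

```smalltalk
| someCollection |
someCollection := OrderedCollection new.

"Intention-revealing form."
someCollection isEmpty ifTrue: [Transcript show: 'empty'; cr].

"If the size really must be compared, use arithmetic equality."
someCollection size = 0 ifTrue: [Transcript show: 'empty'; cr].
```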

       Marcus





Re: Use of == for arithmetic equality

Frank Shearar
In reply to this post by Dan Ingalls-2
"Dan Ingalls" <[hidden email]> wrote:
<snip>
> I regret that I don't have time to fix these right now.  However, if
> there is a well-intentioned soul out there, he or she will perhaps
> find the method below to be quite useful.  It found 165 methods in my
> system with this pattern.

I'll do it, in my (virgin) 3.9a-6721.

frank


Re: Bug: Use of == for arithmetic equality

Dan Ingalls
In reply to this post by Marcus Denker
>On 13.02.2006, at 18:37, Dan Ingalls wrote:
>
>>I've been playing around with a new VM (heh, heh) which, for a while, happened not to intern (ie force unique instances of) SmallIntegers. In this case the use of == to mean arithmetic equality will not work properly.  In my opinion, all such occurrences in the system should be eliminated ASAP;  == is not an arithmetic compare in any Smalltalk I know of.  While it may work with small constants, it is simply wrong, and an especially bad example for newbies to see.  Besides failing in certain interpreters, it will fail in Squeak itself if the integers are not small.
>>
>>I regret that I don't have time to fix these right now.  However, if there is a well-intentioned soul out there, he or she will perhaps find the method below to be quite useful.  It found 165 methods in my system with this pattern.
>>
>
>The interesting thing is that a quite large percentage of those come from the beloved
>
> someCollection size == 0 ifTrue: []
>
>pattern that lots of people like so much... "calling isEmpty is too slow" they will tell you,
>(and ifEmpty: is *really* evil). As the main objective is speed, they of course don't use #=.

I take issue with the "of course" here.  I defy anyone to demonstrate a significant (even detectable) speedup of == over = between SmallIntegers on any meaningful benchmark.

>For the newbies: Do not optimize for speed before you have proven that it makes sense,
>and then *document* the hack. Using a hack by default because it "may be too slow" is
>not a good idea...
>      Marcus

And for the "pros": Do not optimize for speed before you have proven that it makes sense,
and then *document* the hack. Using a hack by default because it "may be too slow" is
not a good idea...
;-)   Dan


Re: Bug: Use of == for arithmetic equality

timrowledge

On 13-Feb-06, at 1:40 PM, Dan Ingalls wrote:

>
>> For the newbies: Do not optimize for speed before you have proven
>> that it makes sense,
>> and then *document* the hack. Using a hack by default because it
>> "may be too slow" is
>> not a good idea...
>>      Marcus
>
> And for the "pros": Do not optimize for speed before you have
> proven that it makes sense,
> and then *document* the hack. Using a hack by default because it
> "may be too slow" is
> not a good idea...

Or more succinctly
On Optimisation:-
a) Don't
b) (For experts only) Don't, Yet

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: OI: Vey




Re: Bug: Use of == for arithmetic equality

Colin Putney
In reply to this post by Marcus Denker
Marcus Denker wrote:

> The interesting thing is that a quite large percentage of those come  
> from the beloved
>
>  someCollection size == 0 ifTrue: []
>
> pattern that lots of people like so much... "calling isEmpty is too  
> slow" they will tell you,
> (and ifEmpty: is *really* evil). As the main objective is speed, they  
> of course don't use #=.

It could be premature optimization, but then again it could be lack of
familiarity with the idiom. The above snippet is a straight translation of

if (someCollection.size() == 0) {}

which is pretty common practise in curly-brace languages. Of course, if
curly-braces are what you're used to, sending #== rather than #= is an
easy mistake to make.

I've been writing a fair amount of javascript lately, and been bitten by
the reverse mistake, doing assignment when I want to test for equality.




Re: Bug: Use of == for arithmetic equality

Nicolas Cellier-3
In reply to this post by Dan Ingalls
On Monday 13 February 2006 22:40, Dan Ingalls wrote:
> I take issue with the "of course" here.  I defy anyone to demonstrate a
> significant (even detectable) speedup of == over = between SmallIntegers on
> any meaningful benchmark.

Just to confirm what Dan says, I did this in VW 7.3 and Squeak 3.8
(times in milliseconds for isEmpty, size = 0 and size == 0):

| t1 t2 t3 |
t1 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) isEmpty]].
t2 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) size = 0]].
t3 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) size == 0]].
^Array with: t1 with: t2 with: t3

VW:
 #(627 311 291)
 #(693 302 292)

Squeak:
 #(6619 3959 4146)
 #(6558 3976 4126)

If such an optimization matters, it should be done at the VM level...
That needs a guru for a JIT, a method cache or something we do not have yet...
but don't bother too much at the upper level.



Re: Bug: Use of == for arithmetic equality

Wolfgang Helbig-2
In reply to this post by Dan Ingalls-2
Dan Ingalls <[hidden email]> wrote:
> Date: Mon, 13 Feb 2006 09:37:42 -0800
...
> happened not to intern (ie force unique instances of) SmallIntegers.
> In this case the use of == to mean arithmetic equality will not work
> properly.  In my opinion, all such occurrences in the system should
> be eliminated ASAP;  == is not an arithmetic compare in any Smalltalk
> I know of.

But #== and #= are equivalent in ST-80 as described in [Adele Goldberg,
David Robson: "Smalltalk-80 The Language", 1989, p 115]:

        Objects, that can not change their internal state are called immutable
        objects. This means, that, once created, they are not destroyed and
        then recreated when they are needed again. Rather, the 256 instances
        of Character are created at the time the system is initialized and
        remain in the system.
        ...
        Besides Characters, the Smalltalk-80 system includes SmallIntegers and
        Symbols as immutable objects.

In the same book, there are expressions like

[p 139]
... we want to know how many of the Characters are a or A.
  count _ 0.
  letters do: [:each | each asLowercase  == $a
                              ifTrue: [count _ count + 1]]

[p 168]
Thus
  'a string' asSymbol == 'a string' asSymbol
answers true.

etc. It might be a bad style to use #== instead
of #=, but this "bad habit" is certainly not rooted in the usage of
curly-brace languages alone.

Now, my question. Are SmallIntegers, Characters and Symbols in Squeak
immutable objects in the sense of the above definition, i.e. not
destroyable? If not, why and when was it changed in Squeak? If they are
still immutable, why is this planned to be changed?

Greetings,

whg


Re: Bug: Use of == for arithmetic equality

timrowledge

On 13-Feb-06, at 4:09 PM, Wolfgang Helbig wrote:
>
> Now, my question. Are SmallIntegers, Characters and Symbols in Squeak
> immutable objects in the sense of the above definition, i, e. not
> destroyable? If not, why and when was it changed in Squeak. If they  
> are
> still immutable, why is this planned to be changed?
SmallIntegers, Characters and Symbols are indeed immutable in Squeak.  
However, numbers in general are not quite the same; it is entirely  
possible to have several LargeIntegers with the same numeric value  
and that is one quite obvious case where using #== instead of #=  
could provide a surprise.
All one has to do is remember that #== means 'is the same object' and
#= means 'is an equal object' to see that confusing the two is not smart.
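
A small workspace sketch of that distinction, using two copies of a String (the variable names are arbitrary):

```smalltalk
| a b |
a := 'squeak' copy.
b := 'squeak' copy.
a = b.    "true: the same characters"
a == b.   "false: two distinct String objects"
a == a.   "true: one and the same object"
```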

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
If you think nobody cares about you, try missing a couple of payments.




Re: Bug: Use of == for arithmetic equality

Wolfgang Helbig-2
In reply to this post by Nicolas Cellier-3
nicolas cellier <[hidden email]> wrote:
...

> | t1 t2 t3 |
> t1 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) isEmpty]].
> t2 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) size = 0]].
> t3 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) size == 0]].
> ^Array with: t1 with: t2 with: t3
>
> VW:
>  #(627 311 291)
>  #(693 302 292)
>
> Squeak:
>  #(6619 3959 4146)
> #(6558 3976 4126)
>
> If such optimization matter, should be at the VM level...
> need a guru for JIT, method cache or something we have not yet...
> but don't bother too much at upper level.

In Squeak 3.8, #== seems to be even slower than #=. The same here:
#(4941 3081 3101). But in Squeak 1.18, I've got #(8977 5313 4923), using
the 3.7.7 Unix VM and the 1.18 Unix VM respectively.

Greetings and thank you, tim, for your convincing answer.

Wolfgang Helbig


Re: Bug: Use of == for arithmetic equality

Dan Ingalls
In reply to this post by Wolfgang Helbig-2
Wolfgang Helbig <[hidden email]>  wrote...

>But #== and #= are equivalent in ST-80 as described in [Adele Goldberg,
>David Robson: "Smalltalk-80 The Language", 1989, p 115]:
>
>        Objects, that can not change their internal state are called immutable
>        objects. This means, that, once created, they are not destroyed and
>        then recreated when they are needed again. Rather, the 256 instances
>        of Character are created at the time the system is initialized and
>        remain in the system.
>        ...
>        Besides Characters, the Smalltalk-80 system includes SmallIntegers and
>        Symbols as immutable objects.

You have to be careful.  There are two issues: immutability and "interning" (guaranteeing a unique object for each value).  Immutability does not at all guarantee that == will work for arithmetic equality.  Squeak's LargeIntegers and Squeak's Floats are designed to be immutable in their protocol, yet == among these is not equivalent to =.  It is true that a==b implies a=b, but it is *not* true that "(a==b) not" implies "(a=b) not".  To get this second effect requires "interning" wherein the object you get back for a given value is always the same object -- as with characters (there is a table of them), and Symbols (there is a table of them as well).  It *happens* that most modern Smalltalks (ie since ST-76 which did *not* have this property) implement SmallIntegers as a pointer with a tag and the value, so any two SmallIntegers of the same value are encoded perforce as the same object, so they are inherently interned.
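
A short sketch of the difference between interned and merely immutable objects. Fractions are used as the non-interned case purely for illustration; they are not mentioned in the discussion above:

```smalltalk
"Interned: the same value always answers the same object."
'a string' asSymbol == 'a string' asSymbol.   "true"
$a == 'abc' first.                            "true"

"Immutable in protocol but not interned: equal values, distinct objects."
(3/4) = (3/4).                                "true"
(3/4) == (3/4).                               "false"
```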

>In the same book, there are expressions like
>
>[p 139]
>... we want to know how many of the Characters are a or A.
>  count _ 0.
>  letters do: [:each | each asLowercase  == $a
>                              ifTrue: [count _ count + 1]]
>
>[p 168]
>Thus
>  'a string' asSymbol == 'a string' asSymbol
>answers true.

These are OK exactly for the above reason.

>etc. It might be a bad style to use #== instead
>of #=, but this "bad habit" is certainly not rooted in the usage of
>curly-brace languages alone.

I think we cannot excuse this usage either as merely a bad habit or as justifiable because it occurs in other languages.  It is simply not Squeak arithmetic.  I'm sure your curly-brace languages are equally unhappy with the use of = in place of == .

>Now, my question. Are SmallIntegers, Characters and Symbols in Squeak
>immutable objects in the sense of the above definition, i, e. not
>destroyable? If not, why and when was it changed in Squeak. If they are
>still immutable, why is this planned to be changed?

Yes, they are immutable, but not all Integers are interned.  Just evaluate...
        (1e10 + 1 - 1) == 1e10
and be convinced.
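
Whether 1e10 is large enough depends on the SmallInteger range of the image, which as far as I know is wider on 64-bit images than on the 32-bit ones of this era. A sketch that takes the boundary from the running image instead of hard-coding it (the variable name big is only for illustration):

```smalltalk
| big |
big := SmallInteger maxVal + 1.   "first integer that no longer fits a SmallInteger"
big class.                        "a large-integer class, e.g. LargePositiveInteger"
(big + 1 - 1) = big.              "true: same value"
(big + 1 - 1) == big.             "false: the arithmetic answers a fresh object"
```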

I know of no plans to change immutability of any of these objects.  But there is definitely a plan to stop using == to test arithmetic equality in Squeak ;-).

Thanks for your interest.

        - Dan


Re: Use of == for arithmetic equality

Frank Shearar
In reply to this post by Frank Shearar
"Frank Shearar" <[hidden email]> volunteered:

> "Dan Ingalls" <[hidden email]> wrote:
> <snip>
> > I regret that I don't have time to fix these right now.  However, if
> > there is a well-intentioned soul out there, he or she will perhaps
> > find the method below to be quite useful.  It found 165 methods in my
> > system with this pattern.
>
> I'll do it, in my (virgin) 3.9a-6721.

OK, I made a 166kB changeset, and the mail bounced (the attachment was too
large). That's probably for the best, because it forces me to try to make MCDs.

Now that I know how, I'll post the MCDs to Mantis soon. I guess that the
maintainers of the various packages can take a look at the changes there?

frank



Re: Bug: Use of == for arithmetic equality

Diego Gomez Deck
In reply to this post by timrowledge
Hi,

Actually the confusion is in the Object class.

All objects can be tested for identity (message #==), but it doesn't make
sense to offer equality (message #=) all over the
hierarchy. Not every object has a meaningful notion of equality and, imho,
giving it a default equality (based on identity) makes the confusion bigger.

I think putting equality in Object is one of the biggest misconceptions
we suffer from daily. Only a small set of objects can answer for equality in
a meaningful way.

If you combine this fact with the awful object capabilities of the
mainstream languages (where the objects never survive a run), you
get widespread abuse of equality.  Most of the time, in those
languages, equality is just used to work around the problems with the
(nonexistent) identity.


Cheers,

-- Diego


On Mon, 13-02-2006 at 16:25 -0800, tim Rowledge wrote:

> On 13-Feb-06, at 4:09 PM, Wolfgang Helbig wrote:
> >
> > Now, my question. Are SmallIntegers, Characters and Symbols in Squeak
> > immutable objects in the sense of the above definition, i, e. not
> > destroyable? If not, why and when was it changed in Squeak. If they
> > are
> > still immutable, why is this planned to be changed?
> SmallIntegers, Characters and Symbols are indeed immutable in Squeak.
> However, numbers in general are not quite the same; it is entirely
> possible to have several LargeIntegers with the same numeric value
> and that is one quite obvious case where using #== instead of #=
> could provide a surprise.
> All one has to do is remember that #== means 'is the same object' and
> #= means 'is an equal object' to see that confusion is not smart.
>
> tim




Re: Use of == for arithmetic equality

Frank Shearar
In reply to this post by Frank Shearar
"Frank Shearar" <[hidden email]> wrote:

> "Frank Shearar" <[hidden email]> volunteered:
>
> > "Dan Ingalls" <[hidden email]> wrote:
> > <snip>
> > > I regret that I don't have time to fix these right now.  However, if
> > > there is a well-intentioned soul out there, he or she will perhaps
> > > find the method below to be quite useful.  It found 165 methods in my
> > > system with this pattern.
> >
> > I'll do it, in my (virgin) 3.9a-6721.
>
> OK, I made a 166kB changeset, and the mail bounced (the attachment was too
> large). That's probably for the best, because it forces me to try to make
> MCDs.
>
> Now that I know how, I'll post the MCDs to Mantis soon. I guess that the
> maintainers of the various packages can take a look at the changes there?

Er, when I make these MCDs (mark the repository as storing diffs, hit the
Save button) my image (3.9a-6721) pops up a never-ending sequence of
"Diffing..." messages, but never seems to stop. I mean, I'm trying to save
the Collections mcd, which altered about 10 or so methods, and the saving
process has already taken 7 minutes! Am I doing something crazily wrong?

My previous attempt was on the Traits package, and that took at least an
hour before I gave up.

Is there a way to split up a ChangeSet into a set of per-package change
sets? Then I can post those to Mantis instead of MCDs.

frank

