What is equivalence? (was: The Trunk: Collections-eem.603.mcz)

Chris Muller-3
On Wed, Feb 4, 2015 at 1:12 PM, Chris Muller <[hidden email]> wrote:

>> On Wed, Feb 4, 2015 at 9:07 AM, Chris Muller <[hidden email]> wrote:
>>>
>>> isSequenceable is a term that refers to a particular *kind* of
>>> Collection, a sequenceable one.
>>>
>>> Therefore, IMO, I am unable to think of any more clear and explicit
>>> way of expressing that than "isKindOf: SequenceableCollection"...
>>
>>
>> self isCollection and: [self isSequenceable]  is better.  isKindOf: is a)
>> not object-oriented as it forces an argument to be in a particular hierarchy
>> rather than having a particular interface, and b) is horribly inefficient,
>> causing a potentially long search of an object's class hierarchy.  isKindOf:
>> doesn't just smell, it stinks.
>
> Yes, and class-testing via #class as well, for the same reasons.  But
> sometimes we really do want to know whether we have a _particular
> implementation_ of a Dictionary, not just dictionary behaviors and
> API's.

Eliot's subStrings: change is obviously a fine improvement and I think
the hasEqualElements: change is too, with one of Levente's suggestions.  But
may we scrutinize this change to Dictionary>>#= just a bit more?

Everyone agrees using #isDictionary is faster and more OO and less
smelly than isKindOf: Dictionary.

If #isDictionary refers to particular API and behaviors, one could
argue that a BTree should answer true to #isDictionary, because it has
similar API and behaviors.  In the context of _equivalence testing_
though, a BTree is not a Dictionary.

So as long as we interpret the various #isSomeType methods as truly of
*that type* (same semantics as isKindOf:), and not "similar to", then
I can see no side-effects (unless someone added #isDictionary to
BTree, of course).  However, we may want to visit the other #=
implementations elsewhere in the system too, unless this would appear
to be an inconsistency producing its own odor of sorts.

Re: What is equivalence? (was: The Trunk: Collections-eem.603.mcz)

Eliot Miranda-2


On Wed, Feb 4, 2015 at 12:40 PM, Chris Muller <[hidden email]> wrote:
> On Wed, Feb 4, 2015 at 1:12 PM, Chris Muller <[hidden email]> wrote:
>>> On Wed, Feb 4, 2015 at 9:07 AM, Chris Muller <[hidden email]> wrote:
>>>>
>>>> isSequenceable is a term that refers to a particular *kind* of
>>>> Collection, a sequenceable one.
>>>>
>>>> Therefore, IMO, I am unable to think of any more clear and explicit
>>>> way of expressing that than "isKindOf: SequenceableCollection"...
>>>
>>>
>>> self isCollection and: [self isSequenceable]  is better.  isKindOf: is a)
>>> not object-oriented as it forces an argument to be in a particular hierarchy
>>> rather than having a particular interface, and b) is horribly inefficient,
>>> causing a potentially long search of an object's class hierarchy.  isKindOf:
>>> doesn't just smell, it stinks.
>>
>> Yes, and class-testing via #class as well, for the same reasons.  But
>> sometimes we really do want to know whether we have a _particular
>> implementation_ of a Dictionary, not just dictionary behaviors and
>> API's.
>
> Eliot's subStrings: change is obviously a fine improvement and I think
> the hasEqualElements: change is too, with one of Levente's suggestions.  But
> may we scrutinize this change to Dictionary>>#= just a bit more?
>
> Everyone agrees using #isDictionary is faster and more OO and less
> smelly than isKindOf: Dictionary.
>
> If #isDictionary refers to particular API and behaviors, one could
> argue that a BTree should answer true to #isDictionary, because it has
> similar API and behaviors.  In the context of _equivalence testing_
> though, a BTree is not a Dictionary.
>
> So as long as we interpret the various #isSomeType methods as truly of
> *that type* (same semantics as isKindOf:), and not "similar to", then
> I can see no side-effects (unless someone added #isDictionary to
> BTree, of course).  However, we may want to visit the other #=
> implementations elsewhere in the system too, unless this would appear
> to be an inconsistency producing its own odor of sorts.

"similar to" is vague.   I *don't* interpret isFoo methods as isKindOf: (and think that most experienced Smalltalk programmers don't either).  In Smalltalk type = protocol.  So these methods imply that an object implements a given set of messages, not that they are of any given class.
--
best,
Eliot


Re: What is equivalence? (was: The Trunk: Collections-eem.603.mcz)

Chris Muller-4
>> If #isDictionary refers to particular API and behaviors, one could
>> argue that a BTree should answer true to #isDictionary, because it has
>> similar API and behaviors.  In the context of _equivalence testing_
>> though, a BTree is not a Dictionary.
>>
>> So as long as we interpret the various #isSomeType methods as truly of
>> *that type* (same semantics as isKindOf:), and not "similar to", then
>> I can see no side-effects (unless someone added #isDictionary to
>> BTree, of course).  However, we may want to visit the other #=
>> implementations elsewhere in the system too, unless this would appear
>> to be an inconsistency producing its own odor of sorts.
>
> "similar to" is vague.  I *don't* interpret isFoo methods as isKindOf: (and
> think that most experienced Smalltalk programmers don't either).
> In
> Smalltalk type = protocol.  So these methods imply that an object implements
> a given set of messages, not that they are of any given class.

Then I am very interested to know your thoughts about my BTree
question, above, which shares the same protocol as Dictionary.  Should

    (BTree new at: 3 put: 'three'; yourself) = (Dictionary new at: 3 put: 'three'; yourself)

return true?  Why or why not?

Re: What is equivalence? (was: The Trunk: Collections-eem.603.mcz)

Tobias Pape
In reply to this post by Eliot Miranda-2
Hi,

On 04.02.2015, at 22:24, Eliot Miranda <[hidden email]> wrote:

> "similar to" is vague.   I *don't* interpret isFoo methods as isKindOf: (and think that most experienced Smalltalk programmers don't either).  In Smalltalk type = protocol.  So these methods imply that an object implements a given set of messages, not that they are of any given class.

I second this interpretation.
Taking our example at hand[1], we want otherCollection to have two
properties:
a) it should be a collection of objects; or, phrased differently, have a
   multiplicity >1

b) it needs an internal order (or maybe a partial order) and a way to
   access things via an index.

a) is clear, but now we question b). Aren't SequenceableCollections the only
things that fit here? No. I can think of a domain object, say
Banquet, that is sequenceable (courses) but that I would not call a collection.

So here is what I would expect:

Dictionary new isCollection. "true"
Dictionary new isSequenceable. "false"

OrderedCollection new isCollection. "true"
OrderedCollection new isSequenceable. "true"

Banquet new isCollection. "false"
Banquet new isSequenceable. "true !"


So whatever we as a community expect from a sequenceable should, at least in
part, be exhibited by the Banquet, while exhibiting a collection
protocol is unnecessary.[2]


Best
        -Tobias


[1]: Original Code
SequenceableCollection>>hasEqualElements: otherCollection
        "Answer whether the receiver's size is the same as otherCollection's
        size, and each of the receiver's elements equal the corresponding
        element of otherCollection.
        This should probably replace the current definition of #= ."

        | size |
        (otherCollection isKindOf: SequenceableCollection) ifFalse: [^ false].
        (size := self size) = otherCollection size ifFalse: [^ false].
        1 to: size do:
                [:index |
                (self at: index) = (otherCollection at: index) ifFalse: [^ false]].
        ^ true
[2]: One _can_ say a banquet is a collection of courses, and indeed, that would be how
     I would implement it (object composition, delegating to an ordered collection) but
     I wouldn't consider it part of the protocol / exhibited interface.
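
For comparison with the original code in [1], the protocol-based guard suggested earlier in the thread ("isCollection and: [... isSequenceable]") could look roughly like the following. This is a sketch only, not the code actually committed in Collections-eem.603; it assumes that every object understands isCollection, while isSequenceable may only be understood by collections, which is why the two tests are chained with and:.

```smalltalk
SequenceableCollection>>hasEqualElements: otherCollection
        "Sketch: same logic as [1], with the class test replaced by protocol tests."

        | size |
        "Ask isCollection first, in case the argument does not understand isSequenceable."
        (otherCollection isCollection and: [otherCollection isSequenceable]) ifFalse: [^ false].
        (size := self size) = otherCollection size ifFalse: [^ false].
        1 to: size do:
                [:index |
                (self at: index) = (otherCollection at: index) ifFalse: [^ false]].
        ^ true
```

Note that under this guard a Banquet that answers false to isCollection would still be rejected, even if it answered true to isSequenceable.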




Re: What is equivalence? (was: The Trunk: Collections-eem.603.mcz)

Eliot Miranda-2
In reply to this post by Chris Muller-4


On Wed, Feb 4, 2015 at 1:40 PM, Chris Muller <[hidden email]> wrote:
>> If #isDictionary refers to particular API and behaviors, one could
>> argue that a BTree should answer true to #isDictionary, because it has
>> similar API and behaviors.  In the context of _equivalence testing_
>> though, a BTree is not a Dictionary.
>>
>> So as long as we interpret the various #isSomeType methods as truly of
>> *that type* (same semantics as isKindOf:), and not "similar to", then
>> I can see no side-effects (unless someone added #isDictionary to
>> BTree, of course).  However, we may want to visit the other #=
>> implementations elsewhere in the system too, unless this would appear
>> to be an inconsistency producing its own odor of sorts.
>
> "similar to" is vague.  I *don't* interpret isFoo methods as isKindOf: (and
> think that most experienced Smalltalk programmers don't either).
> In
> Smalltalk type = protocol.  So these methods imply that an object implements
> a given set of messages, not that they are of any given class.

> Then I am very interested to know your thoughts about my BTree
> question, above, which shares the same protocol as Dictionary.  Should
>
>     (BTree new at: 3 put: 'three'; yourself) = (Dictionary new at: 3 put: 'three'; yourself)
>
> return true?  Why or why not?

Duck typing applies here: if it quacks like a duck, it's a duck.  So if your BTree behaves like a Dictionary to the extent that e.g. MethodDictionary does (MethodDictionary supports associationAt: but isn't answering an association within it, cuz it doesn't contain associations) then sure, it's a Dictionary.
--
best,
Eliot


Re: What is equivalence? (was: The Trunk: Collections-eem.603.mcz)

Chris Muller-4
>> >> If #isDictionary refers to particular API and behaviors, one could
>> >> argue that a BTree should answer true to #isDictionary, because it has
>> >> similar API and behaviors.  In the context of _equivalence testing_
>> >> though, a BTree is not a Dictionary.
>> >>
>> >> So as long as we interpret the various #isSomeType methods as truly of
>> >> *that type* (same semantics as isKindOf:), and not "similar to", then
>> >> I can see no side-effects (unless someone added #isDictionary to
>> >> BTree, of course).  However, we may want to visit the other #=
>> >> implementations elsewhere in the system too, unless this would appear
>> >> to be an inconsistency producing its own odor of sorts.
>> >
>> > "similar to" is vague.  I *don't* interpret isFoo methods as isKindOf:
>> > (and
>> > think that most experienced Smalltalk programmers don't either).
>> > In
>> > Smalltalk type = protocol.  So these methods imply that an object
>> > implements
>> > a given set of messages, not that they are of any given class.
>>
>> Then I am very interested to know your thoughts about my BTree
>> question, above, which shares the same protocol as Dictionary.  Should
>>
>>     (BTree new at: 3 put: 'three'; yourself) = (Dictionary new at: 3
>> put: 'three'; yourself)
>>
>> return true?  Why or why not?
>
> Duck typing applies here, if it quacks like a duck.  So if your BTree
> behaves like a Dictionary to the extent that e.g. MethodDictionary does
> (MethodDictionary supports associationAt: but isn't answering an association
> within it cuz it does't contain associations) then sure, it's a Dictionary.

I do not wish to be argumentative, but does a WeakArray quack like an
Array?  Not only does WeakArray share exactly the same API, but even
inherits from Array and "is-a" Array.  So why shouldn't this be true?

    (Array with: 1 with: 2 with: 3) = (WeakArray with: 1 with: 2 with: 3)   "false"

I think the answer is that #= (not any other method, just #=) needs
to care about the _implementation_ of the argument that is passed
when considering true equivalence to another object.

I agree we human developers can consider unequal but same-quacking
objects interchangeable in our fuzzy minds when we design our
applications, and that is powerful.  However, *within the system*, I
just think it needs #= to make implementation-specific distinctions,
especially for such base classes as Array and Dictionary.

This seems to be reflected by most of the #= implementations in the
system, which check either #class, #species, or #isKindOf:.
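
The guard styles mentioned above, plus the #isFoo style under discussion, can be compared side by side in a workspace. This is only an illustrative sketch using the Array and WeakArray from the example; what each expression answers depends on the image (for instance on how WeakArray>>species is defined), so no results are shown.

```smalltalk
| a w |
a := Array with: 1 with: 2 with: 3.
w := WeakArray with: 1 with: 2 with: 3.

"Exact class test: the strictest guard, rejects subclasses."
a class == w class.

"Species test: lets a class declare which class it should be treated as."
a species == w species.

"Hierarchy test: accepts any subclass of Array."
w isKindOf: Array.

"Protocol test: asks the object itself what it claims to be."
w isArray.
```

Evaluating each line with "print it" shows which guards would let the two objects through to an element-by-element comparison.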

Re: What is equivalence? (was: The Trunk: Collections-eem.603.mcz)

Eliot Miranda-2


On Wed, Feb 4, 2015 at 4:49 PM, Chris Muller <[hidden email]> wrote:
>> >> If #isDictionary refers to particular API and behaviors, one could
>> >> argue that a BTree should answer true to #isDictionary, because it has
>> >> similar API and behaviors.  In the context of _equivalence testing_
>> >> though, a BTree is not a Dictionary.
>> >>
>> >> So as long as we interpret the various #isSomeType methods as truly of
>> >> *that type* (same semantics as isKindOf:), and not "similar to", then
>> >> I can see no side-effects (unless someone added #isDictionary to
>> >> BTree, of course).  However, we may want to visit the other #=
>> >> implementations elsewhere in the system too, unless this would appear
>> >> to be an inconsistency producing its own odor of sorts.
>> >
>> > "similar to" is vague.  I *don't* interpret isFoo methods as isKindOf:
>> > (and
>> > think that most experienced Smalltalk programmers don't either).
>> > In
>> > Smalltalk type = protocol.  So these methods imply that an object
>> > implements
>> > a given set of messages, not that they are of any given class.
>>
>> Then I am very interested to know your thoughts about my BTree
>> question, above, which shares the same protocol as Dictionary.  Should
>>
>>     (BTree new at: 3 put: 'three'; yourself) = (Dictionary new at: 3
>> put: 'three'; yourself)
>>
>> return true?  Why or why not?
>
> Duck typing applies here, if it quacks like a duck.  So if your BTree
> behaves like a Dictionary to the extent that e.g. MethodDictionary does
> (MethodDictionary supports associationAt: but isn't answering an association
> within it cuz it does't contain associations) then sure, it's a Dictionary.

> I do not wish to be argumentative, but does a WeakArray quack like an
> Array?  Not only does WeakArray share exactly the same API, but even
> inherits from Array and "is-a" Array.  So why shouldn't this be true?
>
>     (Array with: 1 with: 2 with: 3) = (WeakArray with: 1 with: 2 with: 3)   "false"

IMO that's a bug.   And the bug is that WeakArray species = WeakArray, whereas it should be Array.  If one selects/rejects/copies a WeakArray the result is more useful if it is strong.  It would be interesting to make the change and see what effect it has on the standard test suite.

> I think the answer is because #= (not any other method, just #=) needs
> to care about which _implementation_ of the argument which is passed
> when considering true equivalence to another object.

IMO the problem is that coming up with general-purpose #= implementations is hard.  You must have seen the controversy my definition of CompiledMethod>>#= has caused over recent years.  #= has evolved a lot since Smalltalk-80 as it's been more affordable to be cleverer, but that evolution also shows that it's not absolutely obvious what #= should do; it depends on context, and current definitions of #= are general agreements of useful behaviour, some of these agreements having been made a long time ago.


> I agree we human developers can consider unequal but same-quacking
> objects interchangeable in our fuzzy minds when we design our
> applications, and that is powerful, however, *within the system*, I
> just think it needs #= to make implementation-specific distinctions,
> especially for such base classes as Array and Dictionary.

One thing #= should do is accept any object as an argument without error.  This wasn't always the case, but Smalltalk was not and never will be perfect.  Apart from that, what #= means is something we have to negotiate.  I don't think my recent changes broke anything.


> This seems to be reflected by most of the #= implementations in the
> system, which check either #class, #species, or #isKindOf:.

Few of them use isKindOf:.  Some use class or species.  Some use #isFoo.  I don't see that #isFoo is worse than #class or #species.  Do you think it is?



--
best,
Eliot


Re: What is equivalence? (was: The Trunk: Collections-eem.603.mcz)

Chris Muller-4
> IMO that's a bug.   And the bug is that WeakArray species = WeakArray,
> whereas it should be Array.  If one selects/rejects/copies a WeakArray the
> result is more useful if it is strong.  It would be interesting to make the
> change and see what effect it has on the standard test suite.

Indeed..

>> I think the answer is because #= (not any other method, just #=) needs
>> to care about which _implementation_ of the argument which is passed
>> when considering true equivalence to another object.
>
> IMO the problem is because coming up with general purpose #= implementations
> is hard.  You must have seen the controversy my definition of
> CompiledMethod>>#= has caused over recent years.  #= has evolved a lot since
> Smalltalk-80 as its been more affordable to be cleverer, but that evolution
> also shows that its not absolutely obvious what #= should do; it depends on
> context, and current definitions of #= are general agreements of useful
> behaviour, some of these agreements having been made a long time ago.

Being cleverer in #= feels right, but general agreements also feel
right in terms of having dependable contracts.  For example, it's
really nice that everyone has "agreed" to include a valid #hash method
when they write an #= method.   We just do it regardless of whether we
actually think we will put them in a HashedCollection.

>> I agree we human developers can consider unequal but same-quacking
>> objects interchangeable in our fuzzy minds when we design our
>> applications, and that is powerful, however, *within the system*, I
>> just it needs #= to make implementation-specific distinctions,
>> especially for as base-classes as Array and Dictionary.
>
> One thing #= should do is accept any object as an argument without error.

Another very good general agreement for equivalence testing.  I agree!  :)

> This wasn't always the case, but Smalltalk was not and never will be
> perfect.  Apart from that, what #= means is something we have to negotiate.
> I don't think my recent changes broke anything.

I don't think anything will be broken.  None of my outboard
SpecialDictionary's implement isDictionary.  It gave me pause to
wonder what it should answer..

>> This seems to be reflected by most of the #= implementations in the
>> system, which check either #class, #species, or #isKindOf:.
>
> Few of them use isKindOf:.  Some use class or species.  Some use #isFoo.  I
> don't see that #isFoo is worse than #class or #species.  Do you think it is?

Hm.  In any other method besides #=, #isFoo is always better.  In
Dictionary>>#=, it will perform better, and it probably won't break
anything, even external code.  So, definitely not worse.  :)  Just making
sure when changing such a method as Dictionary>>#=...

 - Chris

Re: What is equivalence? (was: The Trunk: Collections-eem.603.mcz)

Nicolas Cellier
As a side note, sometimes I'd like to relax =.
For example, (1 to: 3) = #(1 2 3) is a problem:
- it either violates the rule that equal objects must have equal hashes, or prevents an efficient hash computation for Intervals
- it violates transitivity of =: indeed, (3 to: 2) and (5 to: 4) are not equal because they are (historically) used for marking different cursor positions in a Text, but they both equal #()

They behave similarly enough to be easily exchanged, so we might express this with hasSameElements or isSameSequence, but not necessarily with =; = is too strong IMO.
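
The two problems can be tried out in a workspace. The sketch below only restates the expressions from this message; the answers given in the comments are the ones reported above for the Squeak of the time, not independently verified results, and they may differ in other versions or dialects.

```smalltalk
| a b empty |
a := 3 to: 2.        "an empty Interval"
b := 5 to: 4.        "another empty Interval"
empty := #().        "an empty Array"

a = empty.           "reported above as true"
b = empty.           "reported above as true"
a = b.               "reported above as false, so = is not transitive here"

"If an Interval and an Array can be =, their hashes must agree as well."
(1 to: 3) hash = #(1 2 3) hash.
```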
