String >> #asSet

Prof. Andrew P. Black
If you have a String (or a Symbol), and you send it the message #asSet, what do you expect to get as an answer?

A set of characters, one would think.  But do you care what class is used to implement that set of characters?

-- One argument says that it should be a Set, because that’s what asSet answers when it is sent to other collections.  It’s conceivable that you might initially populate the set with Characters, but then add other kinds of object.

-- Another argument says that it should be a CharacterSet (or a WideCharacterSet), because you know that the receiver contains characters.   The expression 'abcd' asSet is a convenient way to create a CharacterSet with: $a with: $b with: $c with: $d

-- A third argument says that you shouldn’t care.  If you really want a specific class, then you should use (aString as: Set) rather than aString asSet.

Right now, in Pharo 7, there is a test in TConvertAsSetForMultiplinessTest that insists that the class of the result of asSet is actually Set.  This test is already overridden once, in IdentityBagTest, because the result of sending asSet to an IdentityBag is not a Set — it’s an IdentitySet.  So, indeed, there is already a precedent for asSet returning a set object that is not an instance of Set.

I think that the test is bad, and should be changed.  But this raises the larger question: if you want to know whether or not a collection has set-like behaviour, what is the right way of finding out?  (aCollection isKindOf: Set) reads OK, but is actually wrong, because it’s testing the implementation, and not the behaviour.   A predicate on all collections (isSet would be the obvious name) would seem to be indicated.  And that’s what the test should be using.
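A minimal sketch of what such a predicate could look like. The selector isSet is hypothetical here (it is the name proposed above, not something this thread confirms exists in Pharo), and the placement of the overrides is only one possible design:

```smalltalk
"Hypothetical: isSet is the predicate proposed above, not a confirmed Pharo selector."
Collection >> isSet
    "By default, a collection makes no promise of set-like behaviour."
    ^ false

Set >> isSet
    "Subclasses such as IdentitySet inherit this answer."
    ^ true

CharacterSet >> isSet
    "Not a subclass of Set, but it behaves like one."
    ^ true

"A test could then ask about behaviour rather than implementation:"
self assert: 'abcd' asSet isSet.
```

With this in place the test would pass whether asSet answers a Set, an IdentitySet, or a CharacterSet, as long as each class opts in.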

(You might ask why CharacterSet and WideCharacterSet are not subclasses of Set.  The answer is that subclassing Set implies inheriting a hash table, and the raison d’être for CharacterSet is that it does not use a hash table.)

Re: String >> #asSet

Richard Sargent (again)
In general, when you want a specific class, use (SpecificCollectionClass withAll: someOtherCollectionInstance).


When sending a message, you don't typically expect a given class. Rather, you should expect a set of behaviour. (In this case, an object that behaves like a set.)
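As a rough illustration of the difference, assuming a Playground and using only withAll:, asSet, add:, size and includes: (the string 'abca' is just an example value):

```smalltalk
| chars |
"Ask for a specific class explicitly when the class matters."
chars := Set withAll: 'abca'.
chars class.            "Set"
chars size.             "3, the duplicate $a collapses"

"Otherwise rely only on set behaviour, whatever class asSet answers."
chars := 'abca' asSet.
chars add: $a.          "already present, so nothing changes"
chars size.             "still 3"
chars includes: $b.     "true"
```

The second half never mentions the class of the result, so it would keep working if asSet answered a different set implementation.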

Re: String >> #asSet

Ben Coman

On Oct 23, 2017, at 05:50, "Prof. Andrew P. Black" <[hidden email]> wrote:
Right now, in Pharo 7, there is a test in TConvertAsSetForMultiplinessTest that insists that the class of the result of asSet is actually Set.
Hard-coding a particular return class does seem to break the generality that makes traits a good duck-typing approach to testing. Instead, you want to check that the result conforms to a series of tests like this:
| result startSize |
result := #(1 2 3 4) asSet.
startSize := result size.
"Adding an element that is already present must not grow a set."
result add: result anyOne.
self assert: result size equals: startSize.

but you don't want to repeat yourself by reproducing similar tests by hand everywhere.
One approach to reuse might be to refactor SetTest so it can be called like...
TConvertAsSetForMultiplinessTest>>testInQuestion
    result := self getResult.
    SetTest runDuckTestsFor: result

where #runDuckTestsFor: would exclude Set instance creation tests and suchlike.
Duck tests might all take a single argument, so SetTest could call them with a series of concrete cases.  

This leads me to a philosophical question: should methods providing concrete test samples be stored on the class side? That is, it's simple to use...
   MyClassTest >> someTest
       sample := self sample.
       etc...

but would it be more correct to do...
   MyClassTest >> someTest
       sample := self class sample.
       etc...

since #sample is unrelated to the state of any particular MyClass.
And maybe it's better to drive it all from the class side...
SetTest>>runCase
    self samples do: [ :duck | self runDuckTestsFor: duck].
    self runCreationTests.

Now I'm not sure if the following is the RightThing(tm), but for discussion...
SetTest>>runDuckTestsFor: duck
    self testsTakingOneArgument do: [ :method | self perform: method with: duck ].


cheers -ben