The defaullt implementation of isEmpty might do too much work

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
37 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Tobias Pape

On 25.10.2016, at 23:54, Chris Muller <[hidden email]> wrote:

> Hi Tobias and Eliot,
>
>>>> Folks there is NO BASIS at the level of Collection for assuming that
>>>> do: [ : each | ^
>>>> false ] is faster than ^self size=0.  In fact, the proper assumption
>>>> is the opposite -- that #size is the optimized method, not #do:.
>>>
>>> I don't understand on what _either_ performance-assumption is grounded.
>
> Exactly.  The original basis for making this change is groundless.

No Chris, I said I don't understand _both_ assumptions.

>
>> self size = 0 doesn't work for infinite collections.
>
> What's an infinite collection,

For example, a generator…
or a right-open interval?

[...]
>> ts or more the "clever" implementation is faster.
>
> Then add Bag>>#isEmpty, don't change Collection>>#isEmpty, because it
> changes a 30-year old assumption about how #isEmpty works for every
> other kind of Collection.

Kent Beck gives this example for 'Simple Delegation':

Object subclass: #Vector
  instanceVariableNames: ‘elements’

Vector>>do: aBlock
  elements do: aBlock

It seems the #do:-based implementation is 'more correct' here.

I found no other recommendation in either Beck's or Thomas' style guides.
To _me_ this suggests that what you perceive as 30-year-old-assumption
is not as universal as '#do: should suffice'.


Best regards
        -Tobias


Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

David T. Lewis
In reply to this post by monty-3
On Sat, Oct 22, 2016 at 12:59:48AM +0200, monty wrote:
> All non-trivial collections implement #size or inherit a custom implementation, usually just as a return of an inst var. Your #do:-based one is 3x as slow in my tests, so you've now made #isEmpty slower for every collection that implements #size just to benefit ones whose careless authors didn't and so the implementors of lazy collections have one fewer message to override.
>
> Do not do this. People already avoid #ifEmpty: and related messages where performance matters and shouldn't have to avoid #isEmpty too.

I like Eliot's new implementation, in part because it teaches the reader something
about how iteration works. It is still simple enough for an inexperienced person
to read, but it also challenges the reader to think about the control flow.

That said, I suspect that Monty and Chris are among the people with the most
real-world experience in dealing with performance on very large collections.

So, I would like to ask from a strictly practical point of view, does the change
that we are discussing here cause real-world problems for Gemstone and Magma
applications?

Does the change make the maintenance of large collection classes more difficult
for cross-dialect portability?

Are there any collection classes (especially in Gemstone and Magma) that become
slower in a real-world measurable sense as a result of this change?

Dave


Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

John Pfersich-2
+1.

Sent from my iPhone

> On Oct 25, 2016, at 16:42, David T. Lewis <[hidden email]> wrote:
>
>> On Sat, Oct 22, 2016 at 12:59:48AM +0200, monty wrote:
>> All non-trivial collections implement #size or inherit a custom implementation, usually just as a return of an inst var. Your #do:-based one is 3x as slow in my tests, so you've now made #isEmpty slower for every collection that implements #size just to benefit ones whose careless authors didn't and so the implementors of lazy collections have one fewer message to override.
>>
>> Do not do this. People already avoid #ifEmpty: and related messages where performance matters and shouldn't have to avoid #isEmpty too.
>
> I like Eliot's new implementation, in part because it teaches the reader something
> about how iteration works. It is still simple enough for an inexperienced person
> to read, but it also challenges the reader to think about the control flow.
>
> That said, I suspect that Monty and Chris are among the people with the most
> real-world experience in dealing with performance on very large collections.
>
> So, I would like to ask from a strictly practical point of view, does the change
> that we are discussing here cause real-world problems for Gemstone and Magma
> applications?
>
> Does the change make the maintenance of large collection classes more difficult
> for cross-dialect portability?
>
> Are there any collection classes (especially in Gemstone and Magma) that become
> slower in a real-world measurable sense as a result of this change?
>
> Dave
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Chris Muller-3
In reply to this post by Nicolas Cellier
Hi Nicolas, I understand the angle you are seeing this from, but these
we distill these points down, we end up with no real-world value; but
some unintended negatives, in fact..

> did you see the implementation of Collection>>size?
>
> size
>     "Answer how many elements the receiver contains."
>
>     | tally |
>     tally := 0.
>     self do: [:each | tally := tally + 1].
>     ^ tally
>
> Let's forget a minute about specific subclasses.
> Does iterating over the whole collection is really the best idea for
> implementing isEmpty?

Asking this question is already assuming too much.
Collection>>#isEmpty should not be optimized in terms of another
method two sends away.  Collection is the ivory tower which only
codifies the elegant definitions of the general behaviors.  We should
let the subclasses be concerned with the nitty-gritty implementation
details needed to go crazy with their (less elegant) optimizations.
**We shouldn't try to make performance optimizations in Collection.

> When we implement a message in Collection, we should think in term of
> Collection, not Interval, Set nor Array.

The old implementation already was implemented in terms of Collection.

> We're talking of several things here:
> - universality of the default implementation:

The old implementation was universal.  #size always has and always
will be part of Collection.

>   we don't need to know about the size to tell if empty

Not any more than we need to avoid it.  Not any more than we need to
initiate a #do: enumeration (which is very expensive for some).

>    not all collection have a known size

So what will happen if they're sent #size then?

If #shouldNotImplement then they will override #isEmpty.

> - optimization:
>   old implementation is fast for collections having an optimized size,
> catastrophic for others.

Those are more rare, they should override #isEmpty for their specific needs.

>   new implementation is reasonably fast, and full optimization can be
> restored with a few 1-liner.

"Reasonably fast" for which subclass?  We can't ignore the subclasses.

> - cleverness:
>   IMO this is not more clever than, and quite logically connected to
> implementation of size
>
> I agree that having several implementations for the sake of optimization is
> not ideal.

It told me something was amiss with this change.  Too many subclasses disagree.

> That's what would have curbed my enthusiasm.
> Though I doubt about any noticeable slowdown in anything else than
> micro-benchmarks.

The SQLCollection example would be a large degradation.
-----
It all distills down to slightly-negative.  We should revert it.

Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Chris Muller-3
In reply to this post by David T. Lewis
I agreed that it looked good in the ivory tower.  The thing is, there
are these real-world detractors which we should not igore.  To save my
old fingers, let me just summarize them from the earlier posts.  In
summary:

  - it provides no known material benefit
  - it could affect unsuspecting legacy application performance
materially, and in an insidious way
  - it's an anomalous usage of the API which works against performance
expectations about #size and #do:.
  - the justification for this change (performance optimization) in
Collection is not valid in the first place.

No real-world positives, just negatives or zeros.  Please revert.

On Tue, Oct 25, 2016 at 6:42 PM, David T. Lewis <[hidden email]> wrote:

> On Sat, Oct 22, 2016 at 12:59:48AM +0200, monty wrote:
>> All non-trivial collections implement #size or inherit a custom implementation, usually just as a return of an inst var. Your #do:-based one is 3x as slow in my tests, so you've now made #isEmpty slower for every collection that implements #size just to benefit ones whose careless authors didn't and so the implementors of lazy collections have one fewer message to override.
>>
>> Do not do this. People already avoid #ifEmpty: and related messages where performance matters and shouldn't have to avoid #isEmpty too.
>
> I like Eliot's new implementation, in part because it teaches the reader something
> about how iteration works. It is still simple enough for an inexperienced person
> to read, but it also challenges the reader to think about the control flow.
>
> That said, I suspect that Monty and Chris are among the people with the most
> real-world experience in dealing with performance on very large collections.
>
> So, I would like to ask from a strictly practical point of view, does the change
> that we are discussing here cause real-world problems for Gemstone and Magma
> applications?
>
> Does the change make the maintenance of large collection classes more difficult
> for cross-dialect portability?
>
> Are there any collection classes (especially in Gemstone and Magma) that become
> slower in a real-world measurable sense as a result of this change?
>
> Dave
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Bert Freudenberg
On Wed, Oct 26, 2016 at 6:42 AM, Chris Muller <[hidden email]> wrote:
I agreed that it looked good in the ivory tower.  The thing is, there
are these real-world detractors which we should not igore.  To save my
old fingers, let me just summarize them from the earlier posts.  In
summary:

  - it provides no known material benefit

Having an O(1) runtime vs O(n) is a very large material benefit. Collection>>size is O(n) so it's not a good base for #isEmpty.
 
  - it could affect unsuspecting legacy application performance
materially, and in an insidious way

If you have performance problems, profile.
 
  - it's an anomalous usage of the API which works against performance
expectations about #size and #do:.

The performance expectation of both #size and #do: in Collection is O(n). And there is nothing "anomalous" in using #do:, quite to the contrary, #do: is the *essence* of a collection of objects.
 
  - the justification for this change (performance optimization) in
Collection is not valid in the first place.

It is very valid optimization given the implementation of Collection>>size.
 
No real-world positives, just negatives or zeros.  Please revert.
 
Quite the opposite. It's a major improvement in basic API.

Yes, you will have to provide an optimized #isEmpty now, because you relied on an implementation detail of a superclass.  But there never was a contract that #isEmpty needs to be implemented in terms of #size. The new implementation makes much more sense for an arbitrary abstract Collection, so I'd like to see it stay.

- Bert -



Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Frank Shearar-3
On 26 October 2016 at 09:35, Bert Freudenberg <[hidden email]> wrote:
On Wed, Oct 26, 2016 at 6:42 AM, Chris Muller <[hidden email]> wrote:
I agreed that it looked good in the ivory tower.  The thing is, there
are these real-world detractors which we should not igore.  To save my
old fingers, let me just summarize them from the earlier posts.  In
summary:

  - it provides no known material benefit

Having an O(1) runtime vs O(n) is a very large material benefit. Collection>>size is O(n) so it's not a good base for #isEmpty.
 
  - it could affect unsuspecting legacy application performance
materially, and in an insidious way

If you have performance problems, profile.
 
  - it's an anomalous usage of the API which works against performance
expectations about #size and #do:.

The performance expectation of both #size and #do: in Collection is O(n). And there is nothing "anomalous" in using #do:, quite to the contrary, #do: is the *essence* of a collection of objects.
 
  - the justification for this change (performance optimization) in
Collection is not valid in the first place.

It is very valid optimization given the implementation of Collection>>size.
 
No real-world positives, just negatives or zeros.  Please revert.
 
Quite the opposite. It's a major improvement in basic API.

Yes, you will have to provide an optimized #isEmpty now, because you relied on an implementation detail of a superclass.  But there never was a contract that #isEmpty needs to be implemented in terms of #size. The new implementation makes much more sense for an arbitrary abstract Collection, so I'd like to see it stay.

+1. Especially since we've seen examples of Collections where #size just doesn't make sense: generators, Actor-style mailboxes, Xtreams (and, by extension, all reactive APIs).

Which is exactly why in C# an IEnumerable (a Collection) doesn't even have a Count method. (It's added as an extension, and People Know(tm) to be careful using it on an abstract collection.)

Really, to be properly Ivory Tower, and Elegant, Collection's #size should be removed. And as Eliot correctly points out, the only elegant way to ask a Collection about which you know nothing, whether it has any elements, is to _evaluate its first value_.

I can't give real world Smalltalk experience reports, but I can say, with many burn marks to prove it, that asking an abstract collection of things for its size, is a terrible idea.

frank
 
- Bert -







Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Chris Muller-3
In reply to this post by Bert Freudenberg
I don't wish to belabor, but you completely ignored all of my (and
Monty's) arguments.  Yes, there *is* a sort-of "contract" that
#isEmpty should be implemented in terms of #size, and not #do:, that
is now being violated.  I explained why earlier and you even said it
was a point worth discussing, but never did.  The comment that was
added to Collection>>#isEmpty advertises the flawed thinking.

The flaws with all of your other statements were addressed in prior
posts, too.  I don't know why any of the advocates for this change
couldn't address or even acknowledge those arguments.  Even the
comment that was added to Collection>>#isEmpty advertises the flawed
thinking.  I hope someone will go back and really _read_ the arguments
being made in this thread.

Just saw Franks note.  Same thing..    :(



On Wed, Oct 26, 2016 at 11:35 AM, Bert Freudenberg <[hidden email]> wrote:

> On Wed, Oct 26, 2016 at 6:42 AM, Chris Muller <[hidden email]> wrote:
>>
>> I agreed that it looked good in the ivory tower.  The thing is, there
>> are these real-world detractors which we should not igore.  To save my
>> old fingers, let me just summarize them from the earlier posts.  In
>> summary:
>>
>>   - it provides no known material benefit
>
>
> Having an O(1) runtime vs O(n) is a very large material benefit.
> Collection>>size is O(n) so it's not a good base for #isEmpty.
>
>>
>>   - it could affect unsuspecting legacy application performance
>> materially, and in an insidious way
>
>
> If you have performance problems, profile.
>
>>
>>   - it's an anomalous usage of the API which works against performance
>> expectations about #size and #do:.
>
>
> The performance expectation of both #size and #do: in Collection is O(n).
> And there is nothing "anomalous" in using #do:, quite to the contrary, #do:
> is the *essence* of a collection of objects.
>
>>
>>   - the justification for this change (performance optimization) in
>> Collection is not valid in the first place.
>
>
> It is very valid optimization given the implementation of Collection>>size.
>
>>
>> No real-world positives, just negatives or zeros.  Please revert.
>
>
> Quite the opposite. It's a major improvement in basic API.
>
> Yes, you will have to provide an optimized #isEmpty now, because you relied
> on an implementation detail of a superclass.  But there never was a contract
> that #isEmpty needs to be implemented in terms of #size. The new
> implementation makes much more sense for an arbitrary abstract Collection,
> so I'd like to see it stay.
>
> - Bert -
>
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Eliot Miranda-2


On Wed, Oct 26, 2016 at 11:41 AM, Chris Muller <[hidden email]> wrote:
I don't wish to belabor, but you completely ignored all of my (and
Monty's) arguments.  Yes, there *is* a sort-of "contract" that
#isEmpty should be implemented in terms of #size, and not #do:, that
is now being violated.  I explained why earlier and you even said it
was a point worth discussing, but never did.  The comment that was
added to Collection>>#isEmpty advertises the flawed thinking.

I don't see any such contract.  Collection has a contract; to be fully functional as a collection class the subclass must implement #do:.  This is stated; #do: is a subclass responsibility of Collection.  I see no stated contract that #isEmpty is implemented in terms of #size.  Given that in Collection #size is implemented in terms of #do:, we are free to implement #isEmpty et al in terms either of #do: or of #size.  The new implementation is better for large collections, works for infinite collections, and is hence to be preferred.
 

The flaws with all of your other statements were addressed in prior
posts, too.  I don't know why any of the advocates for this change
couldn't address or even acknowledge those arguments.  Even the
comment that was added to Collection>>#isEmpty advertises the flawed
thinking.  I hope someone will go back and really _read_ the arguments
being made in this thread.

I don't see what's special here.  We've given arguments for the change; both Bert and Frank (effectively, in my view) refuted your arguments against the change.  This is no different than other performance-related changes that have been made in the past.  You have yet to present an example of code that is broken ny this change.

Just saw Franks note.  Same thing..    :(

Looked pretty cogent and on point to me.
 



On Wed, Oct 26, 2016 at 11:35 AM, Bert Freudenberg <[hidden email]> wrote:
> On Wed, Oct 26, 2016 at 6:42 AM, Chris Muller <[hidden email]> wrote:
>>
>> I agreed that it looked good in the ivory tower.  The thing is, there
>> are these real-world detractors which we should not igore.  To save my
>> old fingers, let me just summarize them from the earlier posts.  In
>> summary:
>>
>>   - it provides no known material benefit
>
>
> Having an O(1) runtime vs O(n) is a very large material benefit.
> Collection>>size is O(n) so it's not a good base for #isEmpty.
>
>>
>>   - it could affect unsuspecting legacy application performance
>> materially, and in an insidious way
>
>
> If you have performance problems, profile.
>
>>
>>   - it's an anomalous usage of the API which works against performance
>> expectations about #size and #do:.
>
>
> The performance expectation of both #size and #do: in Collection is O(n).
> And there is nothing "anomalous" in using #do:, quite to the contrary, #do:
> is the *essence* of a collection of objects.
>
>>
>>   - the justification for this change (performance optimization) in
>> Collection is not valid in the first place.
>
>
> It is very valid optimization given the implementation of Collection>>size.
>
>>
>> No real-world positives, just negatives or zeros.  Please revert.
>
>
> Quite the opposite. It's a major improvement in basic API.
>
> Yes, you will have to provide an optimized #isEmpty now, because you relied
> on an implementation detail of a superclass.  But there never was a contract
> that #isEmpty needs to be implemented in terms of #size. The new
> implementation makes much more sense for an arbitrary abstract Collection,
> so I'd like to see it stay.
>
> - Bert -
>
>
>
>




--
_,,,^..^,,,_
best, Eliot


Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Chris Muller-3
Hi Eliot, I put a lot of effort to convey these arguments for the
betterment of Squeak, I hope you'll oblige this last effort with an
open and critical mind..

> I don't see any such contract.  Collection has a contract; to be fully
> functional as a collection class the subclass must implement #do:.  This is
> stated; #do: is a subclass responsibility of Collection.  I see no stated
> contract that #isEmpty is implemented in terms of #size.

I said a "sort-of 'contract'", because the problem with this change is
more about going against the universal reality of the Smalltalk
environment and ecosystem, not against an explicit API contract.  I
made an explanation of this issue in my earlier post which began with
"Since Smalltalk-80, sends to #do: ..."

> Given that in
> Collection #size is implemented in terms of #do:, we are free to implement
> #isEmpty et al in terms either of #do: or of #size.  The new implementation
> is better for large collections, works for infinite collections, and is
> hence to be preferred.

If you understood that earlier post, then you understand why I believe
we are NOT free to change Collection>>#isEmpty to operate in terms
#do:, because we will have surreptitiously changed legacy applications
which incur a much heavier cost with the #do: implementation.  I
provided a very plausible example of this (the SQLCollection) which
should not be ignored.

Now, I guess this change *already* did even slow down #isEmpty for
some existing classes in the image(!!), to which Nicolas reponded to
by copying-and-pasting the old implementation code into multiple
classes in order to recover their original performance.  Wow.

In exchange for these blemishes, how many classes actually got benefit
from inheriting the new implementation?

       Collection allSubclasses count: [ : each | (each
lookupSelector: #size) methodClass = Collection]   "0"

So we're obviously not talking about benefit to any *in-image* kind of
Collection, maybe just potential benefit to some... external
application's Collection?  But while hurting others..?  Have we
reached a point of enough diminishing returns yet?

The reality is that #size is not only part of Smalltalk's original
concept of a Collection since 30 years ago, but is universally
expected to be fast, and Smalltalkers write there code under that
assumption.  That's why the responsibility should be on those unknown
*exceptional* subclasses, the InfiniteCollection's or whatever, to
deny their #size and provide their alternate #isEmpty's.

Tallying it all up, we've got (-1) potential legacy impacts, (-1)
all-new copy-and-pasted code in our core library, just so (+0) some
exceptional-case external class MIGHT inherit this faster-for-them
implementaiton rather than override it.

I totally understand and agree with the ivory-tower arguments made in
favor of this change, its just that in the practical world, it turns
out to be a net negative.

Best,
  Chris

Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

David T. Lewis
On Wed, Oct 26, 2016 at 05:51:40PM -0500, Chris Muller wrote:
>
> In exchange for these blemishes, how many classes actually got benefit
> from inheriting the new implementation?
>
>        Collection allSubclasses count: [ : each | (each
> lookupSelector: #size) methodClass = Collection]   "0"
>

Fair point. Every subclass of Collection reimplements #size in some way.
Collection>>size is never actually used by any class in the image, and
possibly not in any classes outside the image. So Collection>>size serves
as documentation of the intent, and as a default implementation if someone
were to implement a new subclass of Collection directly.

Some subclasses (e.g. Bag) would have similar characteristics and would presumably
benefit from the new #isEmpty implementation, but most seem to have #size
implementations that do not degrade in performance as size increases.

So Chris is raising a fair question. The new #isEmpty looks like it should be
faster, but is it actually so in practice?

Dave

p.s. I personally still prefer the new implementation on aesthetic grounds.


Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Bert Freudenberg
In reply to this post by Chris Muller-3
On Thu, Oct 27, 2016 at 12:51 AM, Chris Muller <[hidden email]> wrote:

In exchange for these blemishes, how many classes actually got benefit
from inheriting the new implementation?

| b |
b := (1 to: 10000) asBag.
[b isEmpty] bench

before: :  '5,540 per second. 181 microseconds per run.'
now:  '167,000 per second. 6 microseconds per run.'
 
In my book that's a speedup worth having.

- Bert -


Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Stéphane Rollandin
In reply to this post by Bert Freudenberg
> there never was
> a contract that #isEmpty needs to be implemented in terms of #size. The
> new implementation makes much more sense for an arbitrary abstract
> Collection, so I'd like to see it stay.
>
> - Bert -

+1

Stef


Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Chris Muller-4
In reply to this post by Bert Freudenberg
> | b |
> b := (1 to: 10000) asBag.
> [b isEmpty] bench
>
> before: :  '5,540 per second. 181 microseconds per run.'
> now:  '167,000 per second. 6 microseconds per run.'
>
> In my book that's a speedup worth having.

You're depending on the superclass implementation in Collection for
that speedup.  What did you say about that yesterday?

Honestly, Bert, it almost feels like you're being intellectually
dishonest.  You chose the only one that got sped up and ignored all
the ones that got slowed down.  AND, you also ignored all the other
points (again) which render the above point irrelevant.  Sad.

I give up.  You "win".

Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Hannes Hirzel
Yes, in this case a series of tests across collection classes actually
would be a proof. A single point measurement is not a statistically
relevant result.

--Hannes

On 10/27/16, Chris Muller <[hidden email]> wrote:

>> | b |
>> b := (1 to: 10000) asBag.
>> [b isEmpty] bench
>>
>> before: :  '5,540 per second. 181 microseconds per run.'
>> now:  '167,000 per second. 6 microseconds per run.'
>>
>> In my book that's a speedup worth having.
>
> You're depending on the superclass implementation in Collection for
> that speedup.  What did you say about that yesterday?
>
> Honestly, Bert, it almost feels like you're being intellectually
> dishonest.  You chose the only one that got sped up and ignored all
> the ones that got slowed down.  AND, you also ignored all the other
> points (again) which render the above point irrelevant.  Sad.
>
> I give up.  You "win".
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

Hannes Hirzel
Or to put it diffently: Bert supplied the first contribution in this
thread which is about performance actually measuring performance......

More is welcome.

On 10/27/16, H. Hirzel <[hidden email]> wrote:

> Yes, in this case a series of tests across collection classes actually
> would be a proof. A single point measurement is not a statistically
> relevant result.
>
> --Hannes
>
> On 10/27/16, Chris Muller <[hidden email]> wrote:
>>> | b |
>>> b := (1 to: 10000) asBag.
>>> [b isEmpty] bench
>>>
>>> before: :  '5,540 per second. 181 microseconds per run.'
>>> now:  '167,000 per second. 6 microseconds per run.'
>>>
>>> In my book that's a speedup worth having.
>>
>> You're depending on the superclass implementation in Collection for
>> that speedup.  What did you say about that yesterday?
>>
>> Honestly, Bert, it almost feels like you're being intellectually
>> dishonest.  You chose the only one that got sped up and ignored all
>> the ones that got slowed down.  AND, you also ignored all the other
>> points (again) which render the above point irrelevant.  Sad.
>>
>> I give up.  You "win".
>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: The defaullt implementation of isEmpty might do too much work

monty-3
In reply to this post by Bert Freudenberg
Or just make Collection>>#size a subclassResponsibiliity!

This will force implementors to give proper #size implementations that are usually O(1), which almost all do already. Bag could get the #do:-based #isEmpty as an optimization, or we could add a dedicated 'tally' inst var like other collections have and its #size could return that, making the default #size-based #isEmpty OK for it.

This is a compromise that addresses the concerns about the default Collection>>#isEmpty degrading linearly without a constant #size and the concerns about harming the performance of existing libraries and applications with custom collections that assume Collection>>#isEmpty is #size-based.

Please consider it.

> From: "Bert Freudenberg" <[hidden email]>
> To: "Chris Muller" <[hidden email]>, "The general-purpose Squeak developers list" <[hidden email]>
> Subject: Re: [squeak-dev] Re: The defaullt implementation of isEmpty might do too much work
>
> On Thu, Oct 27, 2016 at 12:51 AM, Chris Muller <[hidden email][mailto:[hidden email]]> wrote:
> In exchange for these blemishes, how many classes actually got benefit
> from inheriting the new implementation?
>  
> | b |
> b := (1 to: 10000) asBag.
> [b isEmpty] bench
>  
> before: :  '5,540 per second. 181 microseconds per run.'
> now:  '167,000 per second. 6 microseconds per run.'
>  In my book that's a speedup worth having.
>  

12