The Trunk: Collections-eem.792.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
33 messages Options
12
Reply | Threaded
Open this post in threaded view
|

The Trunk: Collections-eem.792.mcz

commits-2
Eliot Miranda uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-eem.792.mcz

==================== Summary ====================

Name: Collections-eem.792
Author: eem
Time: 3 May 2018, 12:55:52.175146 am
UUID: 7d8995ed-835e-44b0-bf4c-0b0780f5c96f
Ancestors: Collections-pre.791

Four times faster implementation of isAsciiString.

=============== Diff against Collections-pre.791 ===============

Item was changed:
  ----- Method: String>>isAsciiString (in category 'testing') -----
  isAsciiString
+ "Answer if the receiver contains only ascii characters.
+ Inline ^self allSatisfy: [ :each | each asciiValue <= 127 ] for speed."
+ 1 to: self basicSize do: [:i| (self basicAt: i) > 127 ifTrue: [^false]].
+ ^true!
-
- ^self allSatisfy: [ :each | each asciiValue <= 127 ]!


Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Tobias Pape
Hi Eliot

> On 03.05.2018, at 09:56, [hidden email] wrote:
>
> Eliot Miranda uploaded a new version of Collections to project The Trunk:
> http://source.squeak.org/trunk/Collections-eem.792.mcz
>
> ==================== Summary ====================
>
> Name: Collections-eem.792
> Author: eem
> Time: 3 May 2018, 12:55:52.175146 am
> UUID: 7d8995ed-835e-44b0-bf4c-0b0780f5c96f
> Ancestors: Collections-pre.791
>
> Four times faster implementation of isAsciiString.
>
> =============== Diff against Collections-pre.791 ===============
>
> Item was changed:
>  ----- Method: String>>isAsciiString (in category 'testing') -----
>  isAsciiString
> + "Answer if the receiver contains only ascii characters.
> + Inline ^self allSatisfy: [ :each | each asciiValue <= 127 ] for speed."
> + 1 to: self basicSize do: [:i| (self basicAt: i) > 127 ifTrue: [^false]].
> + ^true!
> -
> - ^self allSatisfy: [ :each | each asciiValue <= 127 ]!
>
>

Although I am in awe of the performance improvement, I am curious wether it really pays of to inline #allSatify: here, in terms of performance vs. readability.

I presume that, given the current use of #isAsciiString, we can stay with the more compact, readable version that does not need the 'caveat lector'-comment at the beginning of the method.


Best regards
        -Tobias


Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Chris Muller-3
+1.  Not only that, why the call to #basicSize and #basicAt:.  Really?

It's like...  not even Smalltalk anymore...   :(

On Thu, May 3, 2018 at 3:41 AM, Tobias Pape <[hidden email]> wrote:

> Hi Eliot
>
>> On 03.05.2018, at 09:56, [hidden email] wrote:
>>
>> Eliot Miranda uploaded a new version of Collections to project The Trunk:
>> http://source.squeak.org/trunk/Collections-eem.792.mcz
>>
>> ==================== Summary ====================
>>
>> Name: Collections-eem.792
>> Author: eem
>> Time: 3 May 2018, 12:55:52.175146 am
>> UUID: 7d8995ed-835e-44b0-bf4c-0b0780f5c96f
>> Ancestors: Collections-pre.791
>>
>> Four times faster implementation of isAsciiString.
>>
>> =============== Diff against Collections-pre.791 ===============
>>
>> Item was changed:
>>  ----- Method: String>>isAsciiString (in category 'testing') -----
>>  isAsciiString
>> +     "Answer if the receiver contains only ascii characters.
>> +      Inline ^self allSatisfy: [ :each | each asciiValue <= 127 ] for speed."
>> +     1 to: self basicSize do: [:i| (self basicAt: i) > 127 ifTrue: [^false]].
>> +     ^true!
>> -
>> -     ^self allSatisfy: [ :each | each asciiValue <= 127 ]!
>>
>>
>
> Although I am in awe of the performance improvement, I am curious wether it really pays of to inline #allSatify: here, in terms of performance vs. readability.
>
> I presume that, given the current use of #isAsciiString, we can stay with the more compact, readable version that does not need the 'caveat lector'-comment at the beginning of the method.
>
>
> Best regards
>         -Tobias
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Eliot Miranda-2


On Thu, May 3, 2018 at 10:12 AM, Chris Muller <[hidden email]> wrote:
+1.  Not only that, why the call to #basicSize and #basicAt:.  Really?

It's like...  not even Smalltalk anymore...   :(

What?  This is a workhorse method, and you want to create a block, and apply the block to each element, and convert each element from a character to an integer, when basicAt: yields (and has always yielded) integers in strings?  You think that one line with a block and a high-level iterator is so much better than a two line with a simpler inner loop?
 
| strings nothing allSatisfy basicAt |
strings := String allSubInstances.
strings := strings, strings, strings, strings, strings.
nothing := [:s|].
allSatisfy := [:s| s isAsciiStringWithAllSatisfy].
basicAt := [:s| s isAsciiStringWithBasicAt].
Smalltalk garbageCollect.
1 to: 2 do:
[:ign| times := { nothing. allSatisfy. basicAt } collect: [:block| [strings do: block] timeToRun]].
(times second - times first / (times third - times first) asFloat) roundTo: 0.1

Answers range from 5.1 to 5.7 on 64-bits Mac OS X.

You really want that elegance for something that is 5 times slower but one line shorter?  I did this because I saw Patrick inline isAsciiString in mail message processing, and I thought a) his method would read better if it used isAsciiString, and b) isAsciiString can be implemented much better without affecting clients, so I went ahead.  Part of the beauty of Smalltalk is that clients are insulated from implementation so that implementations can be improved.  Since when has that not been a part of Smalltalk?


On Thu, May 3, 2018 at 3:41 AM, Tobias Pape <[hidden email]> wrote:
> Hi Eliot
>
>> On 03.05.2018, at 09:56, [hidden email] wrote:
>>
>> Eliot Miranda uploaded a new version of Collections to project The Trunk:
>> http://source.squeak.org/trunk/Collections-eem.792.mcz
>>
>> ==================== Summary ====================
>>
>> Name: Collections-eem.792
>> Author: eem
>> Time: 3 May 2018, 12:55:52.175146 am
>> UUID: 7d8995ed-835e-44b0-bf4c-0b0780f5c96f
>> Ancestors: Collections-pre.791
>>
>> Four times faster implementation of isAsciiString.
>>
>> =============== Diff against Collections-pre.791 ===============
>>
>> Item was changed:
>>  ----- Method: String>>isAsciiString (in category 'testing') -----
>>  isAsciiString
>> +     "Answer if the receiver contains only ascii characters.
>> +      Inline ^self allSatisfy: [ :each | each asciiValue <= 127 ] for speed."
>> +     1 to: self basicSize do: [:i| (self basicAt: i) > 127 ifTrue: [^false]].
>> +     ^true!
>> -
>> -     ^self allSatisfy: [ :each | each asciiValue <= 127 ]!
>>
>>
>
> Although I am in awe of the performance improvement, I am curious wether it really pays of to inline #allSatify: here, in terms of performance vs. readability.
>
> I presume that, given the current use of #isAsciiString, we can stay with the more compact, readable version that does not need the 'caveat lector'-comment at the beginning of the method.
>
>
> Best regards
>         -Tobias
>
>




--
_,,,^..^,,,_
best, Eliot


cbc
Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

cbc


On Thu, May 3, 2018 at 10:34 AM, Eliot Miranda <[hidden email]> wrote:


On Thu, May 3, 2018 at 10:12 AM, Chris Muller <[hidden email]> wrote:
+1.  Not only that, why the call to #basicSize and #basicAt:.  Really?

It's like...  not even Smalltalk anymore...   :(

What?  This is a workhorse method, and you want to create a block, and apply the block to each element, and convert each element from a character to an integer, when basicAt: yields (and has always yielded) integers in strings?  You think that one line with a block and a high-level iterator is so much better than a two line with a simpler inner loop?
 
| strings nothing allSatisfy basicAt |
strings := String allSubInstances.
strings := strings, strings, strings, strings, strings.
nothing := [:s|].
allSatisfy := [:s| s isAsciiStringWithAllSatisfy].
basicAt := [:s| s isAsciiStringWithBasicAt].
Smalltalk garbageCollect.
1 to: 2 do:
[:ign| times := { nothing. allSatisfy. basicAt } collect: [:block| [strings do: block] timeToRun]].
(times second - times first / (times third - times first) asFloat) roundTo: 0.1

Answers range from 5.1 to 5.7 on 64-bits Mac OS X.

You really want that elegance for something that is 5 times slower but one line shorter?  I did this because I saw Patrick inline isAsciiString in mail message processing, and I thought a) his method would read better if it used isAsciiString, and b) isAsciiString can be implemented much better without affecting clients, so I went ahead.  Part of the beauty of Smalltalk is that clients are insulated from implementation so that implementations can be improved.  Since when has that not been a part of Smalltalk?

While I understand (and like) the idea of having nice, clean, elegant methods everywhere in Squeak, I also would like it to be super fast.  I think Eliot's change is a nice example straddling that line - note the elegant way, and then show (and implement) a fast way to do the same thing still just using Smalltalk.

Not to mention that he actually commented the method with what it should do.

-cbc


Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Tobias Pape
In reply to this post by Eliot Miranda-2
Hi Eliot
> On 03.05.2018, at 19:34, Eliot Miranda <[hidden email]> wrote:
>
> This is a workhorse method

Just a question for clarification, what is a workhorse method?

Best regards
        -Tobias

cbc
Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

cbc
Workhorse - meaning it takes on a lot of work, is heavily used.

So, I looked around the image, and it is used by the clipboard AND it is also used in basically any type of file-out of code.

On Thu, May 3, 2018 at 10:54 AM, Tobias Pape <[hidden email]> wrote:
Hi Eliot
> On 03.05.2018, at 19:34, Eliot Miranda <[hidden email]> wrote:
>
> This is a workhorse method

Just a question for clarification, what is a workhorse method?

Best regards
        -Tobias




Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Eliot Miranda-2
In reply to this post by Tobias Pape


On Thu, May 3, 2018 at 10:54 AM, Tobias Pape <[hidden email]> wrote:
Hi Eliot
> On 03.05.2018, at 19:34, Eliot Miranda <[hidden email]> wrote:
>
> This is a workhorse method

Just a question for clarification, what is a workhorse method?

Workhorse, noun
- a horse used for work on a farm.
- a person or machine that dependably performs hard work over a long period of time: "he was a workhorse of an actor, often appearing in as many as forty plays in a year".
 


_,,,^..^,,,_
best, Eliot


Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Levente Uzonyi
In reply to this post by Chris Muller-3
Uh, I'm glad it wasn't me who rewrote this method. I would have used a
primitive instead to make it even more not-even-Smalltalk-anymore.
If you don't like this kind of code, you'd better not look under the hood
too much. Especially avoid sneaking a peek at Kernel and Collections if
you can. :)

On a serious note, libary methods, like that method, ought to be fast,
because you can't know in what situation they will be used. Saying that
it's not even used is therefore not a valid reason.

Levente

On Thu, 3 May 2018, Chris Muller wrote:

> +1.  Not only that, why the call to #basicSize and #basicAt:.  Really?
>
> It's like...  not even Smalltalk anymore...   :(
>
> On Thu, May 3, 2018 at 3:41 AM, Tobias Pape <[hidden email]> wrote:
>> Hi Eliot
>>
>>> On 03.05.2018, at 09:56, [hidden email] wrote:
>>>
>>> Eliot Miranda uploaded a new version of Collections to project The Trunk:
>>> http://source.squeak.org/trunk/Collections-eem.792.mcz
>>>
>>> ==================== Summary ====================
>>>
>>> Name: Collections-eem.792
>>> Author: eem
>>> Time: 3 May 2018, 12:55:52.175146 am
>>> UUID: 7d8995ed-835e-44b0-bf4c-0b0780f5c96f
>>> Ancestors: Collections-pre.791
>>>
>>> Four times faster implementation of isAsciiString.
>>>
>>> =============== Diff against Collections-pre.791 ===============
>>>
>>> Item was changed:
>>>  ----- Method: String>>isAsciiString (in category 'testing') -----
>>>  isAsciiString
>>> +     "Answer if the receiver contains only ascii characters.
>>> +      Inline ^self allSatisfy: [ :each | each asciiValue <= 127 ] for speed."
>>> +     1 to: self basicSize do: [:i| (self basicAt: i) > 127 ifTrue: [^false]].
>>> +     ^true!
>>> -
>>> -     ^self allSatisfy: [ :each | each asciiValue <= 127 ]!
>>>
>>>
>>
>> Although I am in awe of the performance improvement, I am curious wether it really pays of to inline #allSatify: here, in terms of performance vs. readability.
>>
>> I presume that, given the current use of #isAsciiString, we can stay with the more compact, readable version that does not need the 'caveat lector'-comment at the beginning of the method.
>>
>>
>> Best regards
>>         -Tobias
>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Tobias Pape
In reply to this post by Eliot Miranda-2

> On 03.05.2018, at 20:00, Eliot Miranda <[hidden email]> wrote:
>
>
>
> On Thu, May 3, 2018 at 10:54 AM, Tobias Pape <[hidden email]> wrote:
> Hi Eliot
> > On 03.05.2018, at 19:34, Eliot Miranda <[hidden email]> wrote:
> >
> > This is a workhorse method
>
> Just a question for clarification, what is a workhorse method?
>
> Workhorse, noun
> - a horse used for work on a farm.
> - a person or machine that dependably performs hard work over a long period of time: "he was a workhorse of an actor, often appearing in as many as forty plays in a year".
>  

Ok, so I got that right.

I saw it as exactly the opposite: only used rarely, and in non-performance-critical parts, such as Clipboard

d o . O b

I am puzzled.
        -t

>
>
> _,,,^..^,,,_
> best, Eliot
>


Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Squeak - Dev mailing list
In reply to this post by Eliot Miranda-2
Are we really arguing about the use of #basicAt: and #basicSize right now ????  Really ?  Really ?!?!?!?

These methods are there for one reason : speed.  If we find them "confusing", we really got a problem!

There are 127 senders of #basicAt: and 125 senders of #basicSize in the current Squeak 6.0 image...




-----------------
Benoît St-Jean
Yahoo! Messenger: bstjean
Twitter: @BenLeChialeux
Pinterest: benoitstjean
Instagram: Chef_Benito
IRC: lamneth
Blogue: endormitoire.wordpress.com
"A standpoint is an intellectual horizon of radius zero".  (A. Einstein)


On Thursday, May 3, 2018, 1:34:17 p.m. EDT, Eliot Miranda <[hidden email]> wrote:




On Thu, May 3, 2018 at 10:12 AM, Chris Muller <[hidden email]> wrote:
+1.  Not only that, why the call to #basicSize and #basicAt:.  Really?

It's like...  not even Smalltalk anymore...   :(

What?  This is a workhorse method, and you want to create a block, and apply the block to each element, and convert each element from a character to an integer, when basicAt: yields (and has always yielded) integers in strings?  You think that one line with a block and a high-level iterator is so much better than a two line with a simpler inner loop?
 
| strings nothing allSatisfy basicAt |
strings := String allSubInstances.
strings := strings, strings, strings, strings, strings.
nothing := [:s|].
allSatisfy := [:s| s isAsciiStringWithAllSatisfy].
basicAt := [:s| s isAsciiStringWithBasicAt].
Smalltalk garbageCollect.
1 to: 2 do:
[:ign| times := { nothing. allSatisfy. basicAt } collect: [:block| [strings do: block] timeToRun]].
(times second - times first / (times third - times first) asFloat) roundTo: 0.1

Answers range from 5.1 to 5.7 on 64-bits Mac OS X.

You really want that elegance for something that is 5 times slower but one line shorter?  I did this because I saw Patrick inline isAsciiString in mail message processing, and I thought a) his method would read better if it used isAsciiString, and b) isAsciiString can be implemented much better without affecting clients, so I went ahead.  Part of the beauty of Smalltalk is that clients are insulated from implementation so that implementations can be improved.  Since when has that not been a part of Smalltalk?


On Thu, May 3, 2018 at 3:41 AM, Tobias Pape <[hidden email]> wrote:
> Hi Eliot
>
>> On 03.05.2018, at 09:56, [hidden email] wrote:
>>
>> Eliot Miranda uploaded a new version of Collections to project The Trunk:
>> http://source.squeak.org/ trunk/Collections-eem.792.mcz
>>
>> ==================== Summary ====================
>>
>> Name: Collections-eem.792
>> Author: eem
>> Time: 3 May 2018, 12:55:52.175146 am
>> UUID: 7d8995ed-835e-44b0-bf4c- 0b0780f5c96f
>> Ancestors: Collections-pre.791
>>
>> Four times faster implementation of isAsciiString.
>>
>> =============== Diff against Collections-pre.791 ===============
>>
>> Item was changed:
>>  ----- Method: String>>isAsciiString (in category 'testing') -----
>>  isAsciiString
>> +     "Answer if the receiver contains only ascii characters.
>> +      Inline ^self allSatisfy: [ :each | each asciiValue <= 127 ] for speed."
>> +     1 to: self basicSize do: [:i| (self basicAt: i) > 127 ifTrue: [^false]].
>> +     ^true!
>> -
>> -     ^self allSatisfy: [ :each | each asciiValue <= 127 ]!
>>
>>
>
> Although I am in awe of the performance improvement, I am curious wether it really pays of to inline #allSatify: here, in terms of performance vs. readability.
>
> I presume that, given the current use of #isAsciiString, we can stay with the more compact, readable version that does not need the 'caveat lector'-comment at the beginning of the method.
>
>
> Best regards

>         -Tobias
>
>




--
_,,,^..^,,,_
best, Eliot



Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Chris Muller-3
In reply to this post by Eliot Miranda-2
> You really want that elegance for something that is 5 times slower but one
> line shorter?

What I want is for Squeak to be a system that is inviting to both
developers and users, but for that to work, the former have to make
some minor sacrifices to accomodate the latter...

> I did this because I saw Patrick inline isAsciiString in mail
> message processing,

... and so sometimes it's best to leave the in-lining out in the
application(s) that need it.  As Tobias pointed out, the in-image uses
WE have for this method are not performance critical, so if no one can
notice a performance difference, then this change is effectively just
a conversion of code in the class library from something that was
developer AND user consumable, to one that is only developer
consumable.

> and I thought a) his method would read better if it used
> isAsciiString,

This is an argument for readability, but the audience of readers that
will likely encounter this application-specific method is far fewer
than the ones that the core class-library method.

> and b) isAsciiString can be implemented much better without
> affecting clients, so I went ahead.  Part of the beauty of Smalltalk is that
> clients are insulated from implementation so that implementations can be
> improved.  Since when has that not been a part of Smalltalk?

"Insulated from implementation" is part of the beauty of modularity
and object-oriented programming present in many languages.  Smalltalk
is the one that does that but with a simplicity and user-readable
syntax.
_____
We all like performance AND readability (and that's what I thought
your dynamic-inlining would bring us -- the best of both worlds, did
it not happen?).  I'm all for in-lining the REAL workhorse methods (low-level)
that truly get called thousands of times every minute, but for
the others, I ask if you would please consider the Dynabook dream,
where high-school kids are reading methods, assembling behaviors for
their own needs, etc.  They need speed AND readability.  We need both.

 - Chris

Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Karl Ramberg
In reply to this post by cbc
I'm all for faster code in the standard library.
One could leave the old version in the comment as an example for other ways of writing it.

Best,
Karl


On Thu, May 3, 2018 at 7:51 PM, Chris Cunningham <[hidden email]> wrote:


On Thu, May 3, 2018 at 10:34 AM, Eliot Miranda <[hidden email]> wrote:


On Thu, May 3, 2018 at 10:12 AM, Chris Muller <[hidden email]> wrote:
+1.  Not only that, why the call to #basicSize and #basicAt:.  Really?

It's like...  not even Smalltalk anymore...   :(

What?  This is a workhorse method, and you want to create a block, and apply the block to each element, and convert each element from a character to an integer, when basicAt: yields (and has always yielded) integers in strings?  You think that one line with a block and a high-level iterator is so much better than a two line with a simpler inner loop?
 
| strings nothing allSatisfy basicAt |
strings := String allSubInstances.
strings := strings, strings, strings, strings, strings.
nothing := [:s|].
allSatisfy := [:s| s isAsciiStringWithAllSatisfy].
basicAt := [:s| s isAsciiStringWithBasicAt].
Smalltalk garbageCollect.
1 to: 2 do:
[:ign| times := { nothing. allSatisfy. basicAt } collect: [:block| [strings do: block] timeToRun]].
(times second - times first / (times third - times first) asFloat) roundTo: 0.1

Answers range from 5.1 to 5.7 on 64-bits Mac OS X.

You really want that elegance for something that is 5 times slower but one line shorter?  I did this because I saw Patrick inline isAsciiString in mail message processing, and I thought a) his method would read better if it used isAsciiString, and b) isAsciiString can be implemented much better without affecting clients, so I went ahead.  Part of the beauty of Smalltalk is that clients are insulated from implementation so that implementations can be improved.  Since when has that not been a part of Smalltalk?

While I understand (and like) the idea of having nice, clean, elegant methods everywhere in Squeak, I also would like it to be super fast.  I think Eliot's change is a nice example straddling that line - note the elegant way, and then show (and implement) a fast way to do the same thing still just using Smalltalk.

Not to mention that he actually commented the method with what it should do.

-cbc






Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Karl Ramberg


On Thu, May 3, 2018 at 10:09 PM, karl ramberg <[hidden email]> wrote:
I'm all for faster code in the standard library.
One could leave the old version in the comment as an example for other ways of writing it.

Which he already did.
Why complain about it then :-(

Best,
Karl

 
Best,
Karl


On Thu, May 3, 2018 at 7:51 PM, Chris Cunningham <[hidden email]> wrote:


On Thu, May 3, 2018 at 10:34 AM, Eliot Miranda <[hidden email]> wrote:


On Thu, May 3, 2018 at 10:12 AM, Chris Muller <[hidden email]> wrote:
+1.  Not only that, why the call to #basicSize and #basicAt:.  Really?

It's like...  not even Smalltalk anymore...   :(

What?  This is a workhorse method, and you want to create a block, and apply the block to each element, and convert each element from a character to an integer, when basicAt: yields (and has always yielded) integers in strings?  You think that one line with a block and a high-level iterator is so much better than a two line with a simpler inner loop?
 
| strings nothing allSatisfy basicAt |
strings := String allSubInstances.
strings := strings, strings, strings, strings, strings.
nothing := [:s|].
allSatisfy := [:s| s isAsciiStringWithAllSatisfy].
basicAt := [:s| s isAsciiStringWithBasicAt].
Smalltalk garbageCollect.
1 to: 2 do:
[:ign| times := { nothing. allSatisfy. basicAt } collect: [:block| [strings do: block] timeToRun]].
(times second - times first / (times third - times first) asFloat) roundTo: 0.1

Answers range from 5.1 to 5.7 on 64-bits Mac OS X.

You really want that elegance for something that is 5 times slower but one line shorter?  I did this because I saw Patrick inline isAsciiString in mail message processing, and I thought a) his method would read better if it used isAsciiString, and b) isAsciiString can be implemented much better without affecting clients, so I went ahead.  Part of the beauty of Smalltalk is that clients are insulated from implementation so that implementations can be improved.  Since when has that not been a part of Smalltalk?

While I understand (and like) the idea of having nice, clean, elegant methods everywhere in Squeak, I also would like it to be super fast.  I think Eliot's change is a nice example straddling that line - note the elegant way, and then show (and implement) a fast way to do the same thing still just using Smalltalk.

Not to mention that he actually commented the method with what it should do.

-cbc







Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Ron Teitelbaum
+1 for leave change and close discussion.  

All the best,

Ron Teitelbaum 



Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Nicolas Cellier
In reply to this post by Karl Ramberg
Just for musing with stupid implementations:

CharacterSet>>ascii
    "return a set containing just the ASCII characters"
    ^Ascii ifNil: [ Ascii := self newFrom: ((1 to: 127) collect: [:code | code asCharacter]) ]
CharacterSet>>nonAscii
    ^NonAscii ifNil: [ NonAscii := self ascii complement ].

| bsa |
bsa := ByteString allInstances.
{
[bsa do: [:e | e isAsciiString]] bench.
[bsa do: [:e | (CharacterSet nonAscii findFirstInByteString: e startingAt: 1) = 0]] bench.
}.
 #('2.88 per second. 348 milliseconds per run.' '36.1 per second. 27.7 milliseconds per run.')

So we could do this for ByteString:

ByteString>>isAsciiString
    ^(CharacterSet nonAscii findFirstInByteString: self startingAt: 1) = 0

But WideString requires another hack...



2018-05-03 22:11 GMT+02:00 karl ramberg <[hidden email]>:


On Thu, May 3, 2018 at 10:09 PM, karl ramberg <[hidden email]> wrote:
I'm all for faster code in the standard library.
One could leave the old version in the comment as an example for other ways of writing it.

Which he already did.
Why complain about it then :-(

Best,
Karl

 
Best,
Karl


On Thu, May 3, 2018 at 7:51 PM, Chris Cunningham <[hidden email]> wrote:


On Thu, May 3, 2018 at 10:34 AM, Eliot Miranda <[hidden email]> wrote:


On Thu, May 3, 2018 at 10:12 AM, Chris Muller <[hidden email]> wrote:
+1.  Not only that, why the call to #basicSize and #basicAt:.  Really?

It's like...  not even Smalltalk anymore...   :(

What?  This is a workhorse method, and you want to create a block, and apply the block to each element, and convert each element from a character to an integer, when basicAt: yields (and has always yielded) integers in strings?  You think that one line with a block and a high-level iterator is so much better than a two line with a simpler inner loop?
 
| strings nothing allSatisfy basicAt |
strings := String allSubInstances.
strings := strings, strings, strings, strings, strings.
nothing := [:s|].
allSatisfy := [:s| s isAsciiStringWithAllSatisfy].
basicAt := [:s| s isAsciiStringWithBasicAt].
Smalltalk garbageCollect.
1 to: 2 do:
[:ign| times := { nothing. allSatisfy. basicAt } collect: [:block| [strings do: block] timeToRun]].
(times second - times first / (times third - times first) asFloat) roundTo: 0.1

Answers range from 5.1 to 5.7 on 64-bits Mac OS X.

You really want that elegance for something that is 5 times slower but one line shorter?  I did this because I saw Patrick inline isAsciiString in mail message processing, and I thought a) his method would read better if it used isAsciiString, and b) isAsciiString can be implemented much better without affecting clients, so I went ahead.  Part of the beauty of Smalltalk is that clients are insulated from implementation so that implementations can be improved.  Since when has that not been a part of Smalltalk?

While I understand (and like) the idea of having nice, clean, elegant methods everywhere in Squeak, I also would like it to be super fast.  I think Eliot's change is a nice example straddling that line - note the elegant way, and then show (and implement) a fast way to do the same thing still just using Smalltalk.

Not to mention that he actually commented the method with what it should do.

-cbc











Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Jecel Assumpcao Jr
In reply to this post by Squeak - Dev mailing list
Benoit St-Jean asked:
> Are we really arguing about the use of #basicAt: and
> #basicSize right now ????  Really ?  Really ?!?!?!?
>
> These methods are there for one reason : speed.  If we
> find them "confusing", we really got a problem!

My impression had always been that these methods exist to work around
the limitation of only having "self" and "super" when you need to strip
away all polymorphism and somehow access "the base truth". That they are
also a speedy short cut is a nice side effect.

A more proper reflection mechanism would eliminate the first need for
these methods while a more sophisticated compiler would eliminate the
second need. Both of these were goals of the Self project.

-- Jecel

Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Chris Muller-3
In reply to this post by Levente Uzonyi
> On a serious note, libary methods, like that method, ought to be fast,

Every method in the image is a library method, so what do you mean by
"like" that method?  Below, you said can't base on the situation(s)
it's currently used, so are there any methods in the image which
*shouldn't* be so desperately optimized?  I'm just trying to figure
out your basis for saying "ought" above, about fast but not
readability.  Are those not the personal priorities of an expert
developer?  What about the priorities of the students and users in the
Dynabook dream?

> because you can't know in what situation they will be used. Saying that it's
> not even used is therefore not a valid reason.

That's only true for the very lowest-level methods.  Generally, it's
better to design and code for what's known, and NOT try to code for
the future / unknown.  IF and when that future comes, it'd be easy
enough for you to deal with any performance issues at THAT time.  In
the meantime, we "ought"  :) to keep as much beautiful minimal
elegance as we can.

Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Levente Uzonyi
On Thu, 3 May 2018, Chris Muller wrote:

>> On a serious note, libary methods, like that method, ought to be fast,
>
> Every method in the image is a library method, so what do you mean by

Really? Explain me how Browser >> #defaultBrowserTitle, a method I chose
randomly, is a library method?

> "like" that method?  Below, you said can't base on the situation(s)
> it's currently used, so are there any methods in the image which
> *shouldn't* be so desperately optimized?  I'm just trying to figure
> out your basis for saying "ought" above, about fast but not
> readability.  Are those not the personal priorities of an expert
> developer?  What about the priorities of the students and users in the
> Dynabook dream?

Others have explained why the readability argument is not very convicing.
About your other argument, I also find it rather weak. If Eliot's code is
too complex for "users", then how could they cope with the
complexity of #anySatisfy:?

>
>> because you can't know in what situation they will be used. Saying that it's
>> not even used is therefore not a valid reason.
>
> That's only true for the very lowest-level methods.  Generally, it's
> better to design and code for what's known, and NOT try to code for
> the future / unknown.  IF and when that future comes, it'd be easy
> enough for you to deal with any performance issues at THAT time.  In
> the meantime, we "ought"  :) to keep as much beautiful minimal
> elegance as we can.

So, if I decided to write a mail transfer agent, and I found that
optimizing #isAsciiString would bump the throughput by, let's say, 10%,
then and only then would I be allowed to change the implementation in
Squeak? Or should I keep my optimized version in my image, because it's
not considered generally useful?

Levente

Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-eem.792.mcz

Tobias Pape
In reply to this post by Nicolas Cellier

> On 03.05.2018, at 22:48, Nicolas Cellier <[hidden email]> wrote:
>
>
> But WideString requires another hack...

Like

        ^false

? :D
        -t

12