On Thu, 3 May 2018, Tobias Pape wrote:
>> On 03.05.2018, at 22:48, Nicolas Cellier <[hidden email]> wrote:
>>
>> But WideString requires another hack...
>
> Like
>
> 	^false
>
> ? :D

Not really: ((WideString new: 2) first: 1) isAsciiString

Levente

> -t
> On 04.05.2018, at 00:08, Levente Uzonyi <[hidden email]> wrote:
>
> On Thu, 3 May 2018, Tobias Pape wrote:
>
>>> On 03.05.2018, at 22:48, Nicolas Cellier <[hidden email]> wrote:
>>>
>>> But WideString requires another hack...
>>
>> Like
>>
>> 	^false
>>
>> ? :D
>
> Not really: ((WideString new: 2) first: 1) isAsciiString

Oh. darn :D

> Levente
>
>> -t
In reply to this post by Tobias Pape
OK people
I'm hooking on my own reply to somehow see what happened here.

> On 03.05.2018, at 23:53, Tobias Pape <[hidden email]> wrote:

Here's what happened: I saw a commit, I saw the documented perf improvement, and I wondered whether the improvement justified the increased complexity.

Frankly, I'd say the most readable version would be

	String>>isAsciiString
		^ self allSatisfy: [:character | character isAscii]

because that's what it is all about: a string is an ascii string if and only if all its comprising characters are ascii characters. That's what's just written there.

Then I looked at all senders in a (then) current Trunk:

	CP1250ClipboardInterpreter >> #toSystemClipboard:
	CP1253ClipboardInterpreter >> #toSystemClipboard:
	ISO88592ClipboardInterpreter >> #toSystemClipboard:
	ISO88597ClipboardInterpreter >> #toSystemClipboard:
	MacShiftJISClipboardInterpreter >> #toSystemClipboard:
	MacUTF8ClipboardInterpreter >> #toSystemClipboard:
	UnixJPClipboardInterpreter >> #toSystemClipboard:
	WinGB2312ClipboardInterpreter >> #toSystemClipboard:
	WinKSX1001ClipboardInterpreter >> #toSystemClipboard:
	WinShiftJISClipboardInterpreter >> #toSystemClipboard:
	WideString class >> #allNonAsciiMethods

#allNonAsciiMethods looks like a utility method that has no sender (ok); the others, all #toSystemClipboard:, look like they won't ever be used in perf-critical code, but only upon user interaction on non-extremely-large strings.

THAT'S why I asked, and THAT'S why I was puzzled to see it classified as a workhorse method, and that's why I questioned my own understanding of when to apply Knuth:

	"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

I did not want to stir up any controversies or anything. I just prefer clarity over performance every so often. Do as the majority wishes; I just would have liked a short explanation of why the perf improvement matters here.
Best and good night
	-tobias
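Tobias's readable formulation translates almost word for word into other languages. As an illustration only (a Python sketch, not the Squeak code under discussion):

```python
def is_ascii_string(s: str) -> bool:
    # A string is an ASCII string if and only if all of its
    # characters are ASCII, i.e. have code points below 128.
    return all(ord(c) < 128 for c in s)
```

The performance question in the thread is exactly whether such a direct, per-character predicate is good enough, or whether a hand-optimized scan is warranted.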
In reply to this post by Levente Uzonyi
Hi Tobias, Hi All,
> On May 3, 2018, at 3:08 PM, Levente Uzonyi <[hidden email]> wrote:
>
>> On Thu, 3 May 2018, Tobias Pape wrote:
>>
>>> On 03.05.2018, at 22:48, Nicolas Cellier <[hidden email]> wrote:
>>>
>>> But WideString requires another hack...
>>
>> Like
>>
>> 	^false
>>
>> ? :D
>
> Not really: ((WideString new: 2) first: 1) isAsciiString
>
> Levente

Note that this is a common issue in Smalltalk, where we can have different implementations (classes) with the same interface.

Take LargeInteger and SmallInteger. The arithmetic system and the VM are both implemented to almost never represent something in the SmallInteger range as a LargeInteger (there are rare circumstances, but it's safe to assume that the invariant is always maintained, and the invariant is depended upon). This allows the VM to only ever check for SmallIntegers for things like indices, never having to waste code bloat or cycles checking for denormalised LargeIntegers.

Why can we do this with SmallInteger & LargeInteger, but not with ByteString and WideString? Because ByteString and WideString are mutable (and because of the FFI). Were the system to maintain the invariant that strings containing only characters in the range 0 to 255 were always represented by ByteString, then, Tobias, your

	WideString>>isAsciiString
		^false

would work. But the cost of maintaining that invariant would be scanning the rest of the string every time at:put: deposited a byte character, to see if we had just replaced the last wide character by a byte one and hence needed to do a become:. We'd also potentially spend a lot of time doing becomes, and we'd also have to allow for denormalisation when passing an ascii string through the FFI to code requiring a wide string. And even worse, we'd have to avoid WideString new: n like the plague, since new strings are always ascii, being full of nuls. So only the WideString with:... forms would make sense.

So this kind of multiple implementation approach only works well with certain types and access patterns.
Interesting, no?
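Eliot's point about the cost of the narrow-representation invariant can be modeled in a few lines. The sketch below is a hypothetical Python illustration (the class and method names are invented, and "wide" is taken to mean "contains a code point above 255"); the real Squeak cost would additionally involve become: and the FFI concerns he mentions:

```python
class NarrowedString:
    """Model of a mutable string that maintains the invariant
    'strings with only byte characters use the narrow form'."""

    def __init__(self, codepoints):
        self.codepoints = list(codepoints)
        self.wide = any(cp > 255 for cp in self.codepoints)

    def at_put(self, index, cp):
        old = self.codepoints[index]
        self.codepoints[index] = cp
        if cp > 255:
            self.wide = True
        elif old > 255 and self.wide:
            # The wide character we just overwrote may have been the
            # last one, so an O(n) rescan is needed to restore the
            # invariant. This is the per-write cost Eliot describes.
            self.wide = any(c > 255 for c in self.codepoints)

    def is_ascii(self):
        # With the invariant maintained, the wide check is O(1);
        # the ASCII check still has to look at the byte characters.
        return not self.wide and all(cp < 128 for cp in self.codepoints)
```

Every at:put: of a byte character over a wide one pays a linear rescan, which is why the invariant is affordable for immutable SmallInteger/LargeInteger but not for mutable strings.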
In reply to this post by Levente Uzonyi
> On a serious note, library methods, like that method, ought to be fast,

Well, I was actually asking for clarification from _you_ for the meaning of "library method", since you were the one assigning an "ought" to them :). So I started out with it meaning "every method in the base image", as a means to open up the question of what you meant. My own criterion is "methods which are called thousands of times per minute" _today_.

But as the guy who's done more of these sorts of method optimizations than anyone (most of which I love), I am actually *truly curious* about YOUR criteria on this. What methods in the image would Levente prefer to be more readable than performant, and why?

> ...
Yes. That's when you'll have the concrete context necessary to enable you to make the best implementation choice. Until then, degrading code readability in exchange for no real-world performance benefit just... well, degrades the code...
On Thu, 3 May 2018, Chris Muller wrote:
>> On a serious note, library methods, like that method, ought to be fast,
>
> Every method in the image is a library method, so what do you mean by
>
>> Really? Explain me how Browser >> #defaultBrowserTitle, a method I chose randomly, is a library method?
>
> Well, I was actually asking for clarification from _you_ for the meaning of "library method", since you were the one assigning an "ought" to them :). So I started out with it meaning "every method in the base image", as a means to open up the question of what you meant. My own criterion is "methods which are called thousands of times per minute" _today_. But as the guy who's done more of these sorts of method optimizations than anyone (most of which I love), I am actually *truly curious* about YOUR criteria on this. What methods in the image would Levente prefer to be more readable than performant, and why?

that String >> #isAsciiString is one, while Browser >> #defaultBrowserTitle is not.

A rule of thumb could be that if a core method (e.g. anything in Kernel/Collections) is part of an API and it's not there to support a very specific task (e.g. Dictionary >> #associationDeclareAt:), then its performance should be taken into account.

Btw, please have a look at String >> #isOctetString.

Levente

> ...
>
>> because you can't know in what situation they will be used. Saying that it's not even used is therefore not a valid reason.
>
> That's only true for the very lowest-level methods. Generally, it's better to design and code for what's known, and NOT try to code for the future / unknown. IF and when that future comes, it'd be easy enough for you to deal with any performance issues at THAT time. In the meantime, we "ought" :) to keep as much beautiful minimal elegance as we can.
>> So, if I decided to write a mail transfer agent, and I found that optimizing #isAsciiString would bump the throughput by, let's say, 10%, then and only then would I be allowed to change the implementation in Squeak? Or should I keep my optimized version in my image, because it's not considered generally useful?
>
> Yes. That's when you'll have the concrete context necessary to enable you to make the best implementation choice. Until then, degrading code readability in exchange for no real-world performance benefit just... well, degrades the code...
In reply to this post by Eliot Miranda-2
2018-05-04 0:50 GMT+02:00 Eliot Miranda <[hidden email]>:

> Hi Tobias, Hi All,

In such case we would maintain a count of non-byte characters and avoid scanning...

> So this kind of multiple implementation approach only works well with certain types and access patterns.
>
> Interesting, no?
On Fri, May 4, 2018 at 12:44 PM, Nicolas Cellier <[hidden email]> wrote:
Only possible if the representation makes room for a count, which could easily require more than 24 bits. It is non-trivial to implement, and of course slows down access.
_,,,^..^,,,_
best, Eliot
2018-05-04 22:10 GMT+02:00 Eliot Miranda <[hidden email]>:
For example, just allocate one more character code and use it to store the count... Most could be done at image side.
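Nicolas's counting idea (keep a running count of non-byte characters so the wide/byte question becomes O(1)) can be sketched in a few lines. This is a hypothetical Python model with invented names, not an actual Squeak or VM implementation, and it ignores where the count would physically be stored:

```python
class CountedString:
    """Mutable string model that maintains a running count of
    non-byte characters (code points above 255)."""

    def __init__(self, codepoints):
        self.codepoints = list(codepoints)
        self.non_byte_count = sum(1 for cp in self.codepoints if cp > 255)

    def at_put(self, index, cp):
        # O(1) bookkeeping on every write: adjust the count by the
        # difference between the new and old character's wideness.
        old = self.codepoints[index]
        self.non_byte_count += (cp > 255) - (old > 255)
        self.codepoints[index] = cp

    def is_wide(self):
        # No scan needed: the count answers the question directly.
        return self.non_byte_count > 0
```

This trades a constant cost on every at:put: for a constant-time wideness check, which is Eliot's objection: the count needs room in the representation and slows down access.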
In reply to this post by Levente Uzonyi
Hi Levente,

>> ... meant. My own criterion is "methods which are called thousands of times per minute" _today_.
>
> A rule of thumb could be that if a core method (e.g. anything in Kernel/Collections) is part of an API and it's not there to support a very specific task (e.g. Dictionary >> #associationDeclareAt:), then its performance should be taken into account.

That sounds like it only exempts utility methods ("specific tasks"). Technically, this should leave me afraid that you might want to inline Dictionary>>#at:, but I'm relatively sure you don't. #at: is the most basic featured accessing method for understanding and using a Dictionary, so I hope you agree that's one that should retain its exemplar status.

OTOH, #associationDeclareAt: and #isOctetString, by their nature, are only of interest to developers. Users would never use those methods, so optimizing them would be fine, I suppose. Methods which take block arguments also tend to be not as usable by users, since passing a block requires writing some code. So it seems we have _some_ concrete criteria starting to form simply by having looked at a few examples:

	- methods which run hundreds of times per minute
	- methods which take blocks as arguments
	- methods which are concerned with internals instead of externals

#isAsciiString seems like it could go either way, so I'm fine if you and Eliot prefer it to be optimized. For me, this discussion itself, to get a slightly better understanding of each others' positions, with the request to keep Dynabook users in mind when evaluating future such optimizations, is more important.

Regards,
Chris
tangentially...
On Mon, May 7, 2018 at 1:02 PM, Chris Muller <[hidden email]> wrote:
But Chris, Dictionary>>#at: has already been heavily optimized via scanFor:.

Here's the original Smalltalk-80 v2 code:

Dictionary methods for accessing

at: key
	"Answer the value at key. If key is not found, create an error message."

	^self at: key ifAbsent: [self errorKeyNotFound]

at: key ifAbsent: aBlock
	"Answer the value at key. If key is not found, answer the
	result of evaluating aBlock."

	| index |
	index _ self findKey: key ifAbsent: [^aBlock value].
	^(self basicAt: index) value

Dictionary methods for private

findKey: key ifAbsent: aBlock
	| index |
	index _ self findKeyOrNil: key.
	(self basicAt: index) == nil ifTrue: [^aBlock value].
	^index

findKeyOrNil: key
	| location length probe pass |
	length _ self basicSize.
	pass _ 1.
	location _ key hash \\ length + 1.
	[(probe _ self basicAt: location) == nil or: [probe key = key]]
		whileFalse:
			[(location _ location + 1) > length
				ifTrue:
					[location _ 1.
					pass _ pass + 1.
					pass > 2 ifTrue: [^self grow findKeyOrNil: key]]].
	^location

Here's the current Squeak code:

at: key
	"Answer the value associated with the key."

	^ self at: key ifAbsent: [self errorKeyNotFound: key]

at: key ifAbsent: aBlock
	"Answer the value associated with the key or, if key isn't found, answer the result of evaluating aBlock."

	^((array at: (self scanFor: key)) ifNil: [ aBlock ]) value "Blocks and Associations expect #value"

Dictionary methods for private

scanFor: anObject
	"Scan the key array for the first slot containing either a nil (indicating an empty slot) or an element that matches anObject. Answer the index of that slot or raise an error if no slot is found. This method will be overridden in various subclasses that have different interpretations for matching elements."

	| index start size |
	index := start := anObject hash \\ (size := array size) + 1.
	[
		| element |
		((element := array at: index) == nil or: [ anObject = element key ])
			ifTrue: [ ^index ].
		(index := index \\ size + 1) = start ] whileFalse.
	self errorNoFreeSpace

In the v2.0 code, at: always creates a block, at:ifAbsent: always creates a block (which means two block activations on error), and the scan for elements is three activations deep.

In Squeak, at: always creates a block, but at:ifAbsent: never creates a block (which means one block activation on error), and the scan is two activations deep. Note also that findKeyOrNil: wraps around, making more than a single pass if it can't find an element, whereas scanFor: makes at most a single pass.

The Squeak code is much better. Has the optimization made the code any the less reusable or elegant? I find it no less reusable and more elegant. scanFor: is a better thought-out kernel than findKeyOrNil:. So optimization doesn't always make things less readable, reusable or elegant.
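The single-pass, open-addressing behaviour of scanFor: can be illustrated outside Smalltalk. Here is a rough Python analogue (0-based indexing and invented names; Squeak's actual version is 1-based and operates on the dictionary's internal array):

```python
def scan_for(slots, key):
    """Linear-probing scan over an open-addressing table.

    'slots' holds (key, value) pairs or None for empty slots.
    Returns the index of the slot holding the key, or of the first
    empty slot; makes at most a single pass over the table, like
    Squeak's scanFor:, rather than wrapping repeatedly like the
    old findKeyOrNil:.
    """
    size = len(slots)
    index = start = hash(key) % size
    while True:
        element = slots[index]
        if element is None or element[0] == key:
            return index                  # empty slot or matching key
        index = (index + 1) % size        # probe the next slot
        if index == start:                # wrapped all the way around
            raise RuntimeError('no free space')
```

The key property Eliot highlights is that one kernel method serves both lookup and insertion (the caller checks whether the returned slot is empty), which is what makes scanFor: a better-thought-out kernel than findKeyOrNil:.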
_,,,^..^,,,_
best, Eliot
In reply to this post by Chris Muller-4
On Mon, May 07, 2018 at 03:02:10PM -0500, Chris Muller wrote:
> Hi Levente,
>
>>> ... meant. My own criterion is "methods which are called thousands of times per minute" _today_.
>>
>> A rule of thumb could be that if a core method (e.g. anything in Kernel/Collections) is part of an API and it's not there to support a very specific task (e.g. Dictionary >> #associationDeclareAt:), then its performance should be taken into account.
>
> That sounds like it only exempts utility methods ("specific tasks"). Technically, this should leave me afraid that you might want to inline Dictionary>>#at:, but I'm relatively sure you don't. #at: is the most basic featured accessing method for understanding and using a Dictionary, so I hope you agree that's one that should retain its exemplar status.
>
> OTOH, #associationDeclareAt: and #isOctetString, by their nature, are only of interest to developers. Users would never use those methods, so optimizing them would be fine, I suppose. Methods which take block arguments also tend to be not as usable by users, since passing a block requires writing some code. So it seems we have _some_ concrete criteria starting to form simply by having looked at a few examples:
>
> 	- methods which run hundreds of times per minute
> 	- methods which take blocks as arguments
> 	- methods which are concerned with internals instead of externals
>
> #isAsciiString seems like it could go either way, so I'm fine if you and Eliot prefer it to be optimized. For me, this discussion itself, to get a slightly better understanding of each others' positions, with the request to keep Dynabook users in mind when evaluating future such optimizations, is more important.

I have no strong opinion one way or the other, but for what it's worth I thought that Eliot's optimization was readable enough, and the method comment made the intent clear.
As a learning tool, it might even be helpful for someone reading the method for the first time to understand the alternative implementations, one favoring speed and another expressing the same idea in terms of collections with #allSatisfy:.

Dave
In reply to this post by Eliot Miranda-2
> But Chris, Dictionary>>#at: has already been heavily optimized via scanFor:.
> Here's the original Smalltalk-80 v2 code:
>
> Dictionary methods for accessing
>
> at: key
> 	"Answer the value at key. If key is not found, create an error message."
>
> 	^self at: key ifAbsent: [self errorKeyNotFound]

No it hasn't; what do you mean? See, ^^^ right there, there's no call to #scanFor:, and nor should there be, ever. Please don't do it.

> at: key ifAbsent: aBlock
> 	"Answer the value at key. If key is not found, answer the
> 	result of evaluating aBlock."
>
> 	| index |
> 	index _ self findKey: key ifAbsent: [^aBlock value].
> 	^(self basicAt: index) value

This method takes a block. I said in my note that methods that take a block are fine to be optimized, because passing a block requires writing code, which is a 'developer' role anyway.

> ... snip ...
>
> Here's the current Squeak code:
>
> at: key
> 	"Answer the value associated with the key."
>
> 	^ self at: key ifAbsent: [self errorKeyNotFound: key]

Yup. No #scanFor:. Good. This is a method that absolutely should _not_ be inlined unless the VM can do it outside the user's eyes.

> In the v2.0 code, at: always creates a block, at:ifAbsent: always creates a block (which means two block activations on error), and the scan for elements is three activations deep.
>
> In Squeak, at: always creates a block, but at:ifAbsent: never creates a block (which means one block activation on error), and the scan is two activations deep. Note also that findKeyOrNil: wraps around, making more than a single pass if it can't find an element whereas scanFor: makes at most a single pass.
>
> The Squeak code is much better. Has the optimization made the code any the less reusable or elegant? I find it no less reusable and more elegant. scanFor: is a better thought-out kernel than findKeyOrNil:. So optimization doesn't always make things less readable, reusable or elegant.

As I said, methods that take a block are fine to be optimized, because passing a block requires writing code, which is a 'developer' role anyway.
- Chris