Chris Muller <[hidden email]> schrieb am Di., 4. Feb. 2020, 04:39:
Your commit at the top of this thread changes the default capacity and that is what we are writing about.
I think the analogy is flawed. There is a space/time tradeoff with the size of the preallocated array. I see no such thing with Dictionary>>at:. Also new allows its user to defer a decision/analysis, at: does not.
No, not at all! They are relevant since we are discussing about new and the default capacity, not new:. Also you saying 99% of my cases are not worthy of the performance they used to have... does not sound appealing to me.
"Allocate a new instance of me with the given initial capacity. Use new: instead of new if you want to avoid either unused capacity or repeated growing (and thus, copying) of the collection's memory." Not quite man page length, but seems adequate to cover my expectations.
What is that supposed to mean? Do you think the Java designers renamed almost all of the classes and methods, rewrote all of the documentation, but forgot to think about changing the initial capacity of ArrayList ex OrderedCollection? In fact they did change something about it in Java 8 as I found out after my last post: initially the array is empty (singleton), but it still allocates 10 slots once the first element is added. This optimizes space for empty collections, but not small ones.
If I use new: to optimize performance I am fairly certain that it won't grow afterwards. My original statement still stands.
Then the only sensible solution would be to deprecate and remove new for preallocating collections altogether. Forcing everyone to always think ahead of their collection sizes. For collections that are supposed to grow automatically. Slowing down developent of even non-performance-critical parts. I don't want that either. Changing the current behavior can make it worse for existing applications (again: we lack the numbers to be sure of the contrary). Keeping the current behavior is bad with regards to your concerns. I say: let's therefore not change it, since user code has already been written and most of it is unlikely to be improved again. |
The goal of using #new: is efficiency in general, space OR time (or, both). So w.r.t. the case of optimizing for time, you forgot to mention the 40% performance hit they would take if the size happened to be 1 or 2...
Make a spreadsheet with the cartesian product of the following, one per row, and then manually fill out an extra column whether you'd write #new or #new:. (knows size | does't know size) * (knows at runtime | knows at compile time) * (will possibly grow | won't possibly grow) * ( ... | ...) When designing a class libary, please think in these general terms and consider all cases, instead of only your own expectations.
You said "fairly" certain, now please consider the case where it's certain never to grow (perhaps row 187 on the spreadsheet, above). I gave you a common, concrete example in an earlier post. That's the case where your incomplete manpage comment (under Collections-ul.871) could easily trick the developer into hurting her apps performance when she thought she was helping it. You argued (with flawed logic) that people would have to "think about the sizes all the time" but that's exactly what this would introduce, except in a much more insidious way! Because it affects negatively depending on the runtime parameter value passed to #new:.
You say you're "fine" with that certain performance hit, but you're not fine with this uncertain one whose worst-case possibility was measured to be an equal performance cost. And, you're even willing to trade away certainty of space-efficiency, too.
Sigh. With statements like the above, I wonder whether you're actually interested in finding a solution or simply arguing... :(
This was your best argument, but it only takes about 10 seconds to understand its logical flaw -- if you're writing performance-critical code, you're going to write #new: anyway. If you're not, an extra grow from writing #new won't affect performance.
The worst-case scenario was benched and provided for you, and shown to be a similar effect as Collections-ul.871 on #new: 1.
As I said multiple times, the "current behavior" is fine to my concerns, Jakob, it was Levente's proposal which introduced them. This was just one of two different solutions proposed, but you handily rejected them both, even while offering zero alternatives or even acknowledgement that my concerns are valid. I'm going to cut my losses from this discussion now. Your arguments seem like you either aren't understanding mine, or aren't willing to look for consensus. Levente has committed Collections-ul.872 with only his other changes, e.g., excluding the one which introduced my concern. He's a gentleman. It's sad that our inability to resolve this petty squabble kept back his other improvements, which would've included the ability to create minimally small Dictionary's with a minimum internal array size of 3 instead of 5. :( - Chris |
This conversation might well get a prize for lowest signal to noise ratio
in the entire history of the Internet. It began with a proposed change that would have required toggling two bits in the source code, which is close to a theoritical minumum for information content: - Current code: '^ self basicNew initialize: 5' - Proposed change" '^ self basicNew initialize: 3' This requires toggling two bits in a single ascii character: ($5 asciiValue bitXor: $3 asciiValue) bitCount ==> 2 Changing two bits in the source code of Squeak comes impressively close to being the smallest thoeretically possible change, and these two bit-twiddles now form the basis of a vast email thread that has evolved to include an impressive mix of faulty assumptions, untested metrics, passionate opinions, and invalid conclusions. It might seem easy to suggest moving Collections-cmm.874 to the treated inbox - after all, everyone who reviewed it gave it a solid thumbs down - but that would spoil the fun. Let's keep this thing going as long as we can, and in the end we can submit the thread for inclusion in the Guinness book of world records. ;-) Dave On Wed, Feb 05, 2020 at 04:39:09PM -0600, Chris Muller wrote: > > > > try writing a manpage-quality comment for #new: explaining all the > >> nuances... when to use it, when not to, etc... My hope that exercise > >> would illuminate the issue I have. > >> > > > > "Allocate a new instance of me with the given initial capacity. Use new: > > instead of new if you want to avoid either unused capacity or repeated > > growing (and thus, copying) of the collection's memory." > > > > The goal of using #new: is efficiency in general, space OR time (or, > both). So w.r.t. the case of optimizing for time, you forgot to mention > the 40% performance hit they would take if the size happened to be 1 or 2... > > > > Not quite man page length, but seems adequate to cover my expectations. > > > > Make a spreadsheet with the cartesian product of the following, one per > row, and then manually fill out an extra column whether you'd write #new or > #new:. > > (knows size | does't know size) > * (knows at runtime | knows at compile time) > * (will possibly grow | won't possibly grow) > * ( ... | ...) > > When designing a class libary, please think in these general terms and > consider *all cases*, instead of only your own expectations. > > > > Also I don't expect that initial allocation with new: will always be > >>> faster. I expect that I can save reallocations later on. > >>> > >> > >> As I said before, not if you know your Dictionary will never grow. This > >> is not app-specific. > >> > > > > If I use new: to optimize performance I am fairly certain that it won't > > grow afterwards. My original statement still stands. > > > > You said "fairly" certain, now please consider the case where it's > *certain* never > to grow (perhaps row 187 on the spreadsheet, above). I gave you a common, > concrete example in an earlier post. That's the case where your incomplete > manpage comment (under Collections-ul.871) could easily trick the developer > into hurting her apps performance when she thought she was helping it. You > argued (with flawed logic) that people would have to "think about the sizes > all the time" but that's exactly what this would introduce, except in a > much more insidious way! Because it affects negatively depending on > the *runtime > parameter* value passed to #new:. > > > > That is where the time is supposed to be saved. If new is faster than new: > >>> 1, I am fine with that. > >>> > >> > You say you're "fine" with that *certain* performance hit, but you're not > fine with this uncertain one whose worst-case possibility was measured to > be an equal performance cost. *And*, you're even willing to trade away > certainty of space-efficiency, too. > > This thread is about reducing the defaultSize. The other thread is about > >> #new:1 needing to be same speed as #new. They're two separate > >> discussions. An API design that forces developers to trade one > >> optimization for another is bad design, plain and simple. > >> > > > > Then the only sensible solution would be to deprecate and remove new for > > preallocating collections altogether. Forcing everyone to always think > > ahead of their collection sizes. > > > > Sigh. With statements like the above, I wonder whether you're actually > interested in finding a solution or simply arguing... :( > > > > For collections that are supposed to grow automatically. Slowing down > > developent of even non-performance-critical parts. I don't want that either. > > > > This was your best argument, but it only takes about 10 seconds to > understand its logical flaw -- if you're writing performance-critical code, > you're going to write #new: anyway. If you're not, an extra grow from > writing #new won't affect performance. > > Changing the current behavior can make it worse for existing applications > > (again: we lack the numbers to be sure of the contrary). > > > > The worst-case scenario was benched and provided for you, and shown to be a > similar effect as Collections-ul.871 on #new: 1. > > > > Keeping the current behavior is bad with regards to your concerns. > > > > As I said multiple times, the "current behavior" is *fine* to my concerns, > Jakob, it was Levente's proposal which introduced them. This was just one > of two different solutions proposed, but you handily rejected them both, > even while offering zero alternatives or even acknowledgement that my > concerns are valid. I'm going to cut my losses from this discussion now. > Your arguments seem like you either aren't understanding mine, or aren't > willing to look for consensus. > > Levente has committed Collections-ul.872 with only his other changes, e.g., > excluding the one which introduced my concern. He's a gentleman. It's sad > that our inability to resolve this petty squabble kept back his other > improvements, which would've included the ability to create minimally small > Dictionary's with a minimum internal array size of 3 instead of 5. :( > > - Chris > |
> This conversation might well get a prize for lowest signal to noise ratio
> in the entire history of the Internet. Indeed, but some sort of gentle bliss is to be found in the contemplation of an epic fight over such a small issue - at least this time the world as we know it is not at stake, and we can cosily relax while listening to the mildly terrifying wind sounds coming from the tempest in a teapot. > It might seem easy to suggest moving Collections-cmm.874 to the treated > inbox - after all, everyone who reviewed it gave it a solid thumbs > down - but that would spoil the fun. Yes, keep it rolling - it may become its own art form. Stef |
Free forum by Nabble | Edit this page |