Re: [pharo-project/pharo] Proposed improvement : performance of #atRandom: in class Bag (#5392)


Pharo Smalltalk Users mailing list

While I'm at it, another improvement would be to buffer #newFrom: by reading large chunks of the argument <aCollection> via a ReadStream, adding the objects read to a temporary <Bag> buffer, and then inserting the buffer's contents into the receiver via #add:withOccurrences:. That way, if you read, say, 1000 objects (many of which are equal), you don't have to "add" 1000 items to the receiver: the buffer bag collapses the duplicates, so the receiver only sees one add per distinct object. It wouldn't make much of a difference in the worst case (few or no duplicates), but since you're using a bag, you normally expect a lot of dups, so we could gain a lot that way.

Anyway, that's what I'm seeing in my experiments (millions of objects), and I'm still evaluating the improvement with more tests (such as adjusting the buffer size). So far, this is really promising!
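To make the idea concrete, here is a minimal playground sketch of the buffering, not the actual patch. The variable names, the sample data, and the chunk size of 4 are all made up for illustration, and it operates on a plain target bag rather than inside #newFrom: itself.

```smalltalk
"Hypothetical sketch: source, target, chunkSize and buffer are illustrative names."
| source target stream chunkSize buffer |
source := #(#a #b #a #c #a #b #a #a).
chunkSize := 4.	"arbitrary value; the right buffer size would need measuring"
target := Bag new.
stream := ReadStream on: source.
[ stream atEnd ] whileFalse: [
	"Fill a small temporary bag with the next chunk, collapsing duplicates."
	buffer := Bag new.
	[ stream atEnd or: [ buffer size >= chunkSize ] ]
		whileFalse: [ buffer add: stream next ].
	"One #add:withOccurrences: per distinct element of the chunk."
	buffer doWithOccurrences: [ :each :count |
		target add: each withOccurrences: count ] ].
target
```

With this data the first chunk (a b a c) costs three adds on the target instead of four, and the second chunk (a b a a) costs two instead of four. Whether that pays off for a real collection depends on how many duplicates each chunk contains and on the cost of filling the buffer itself.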

On 2019-12-18 06:32, Sven Van Caekenberghe wrote:

OK, now I see. I was thrown off by the weird formatting ;-)

The final expression is not needed, since you started with an emptyCheck.

Reformatted then:

Bag>>#atRandom: aGenerator
	"Answer a random element of the receiver. Uses aGenerator which
	should be kept by the user in a variable and used every time. Use
	this instead of #atRandom for better uniformity of random numbers because 
	only you use the generator. Causes an error if self has no elements."

	| rand index |
	self emptyCheck.
	rand := aGenerator nextInt: self size.
	index := 0.
	self doWithOccurrences: [ :key :count | 
		index := index + count.
		rand <= index ifTrue: [ ^ key ] ]

Still, we need a PR as well. And maybe a specific test unless this is already covered by other tests.
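
A specific test could look roughly like the sketch below. The test class name BagAtRandomTest and the selector are assumptions made for illustration, not existing code. It only checks that an element of the receiver is answered and that an empty bag raises an error; it does not check the distribution.

```smalltalk
"Hypothetical SUnit test method; class name and selector are assumptions."
BagAtRandomTest>>#testAtRandomAnswersAnElement
	| bag generator |
	bag := Bag new.
	bag add: #a withOccurrences: 3; add: #b withOccurrences: 1.
	generator := Random seed: 42.
	"Every answer must be an element of the bag."
	100 timesRepeat: [
		self assert: (bag includes: (bag atRandom: generator)) ].
	"The emptyCheck at the start should signal an error on an empty bag."
	self should: [ Bag new atRandom: generator ] raise: Error
```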


-- 
-----------------
Benoît St-Jean
Yahoo! Messenger: bstjean
Twitter: @BenLeChialeux
Pinterest: benoitstjean
Instagram: Chef_Benito
IRC: lamneth
GitHub: bstjean
Blogue: endormitoire.wordpress.com
"A standpoint is an intellectual horizon of radius zero".  (A. Einstein)