In addition, open addressing with linear probing has superior cache line
read behavior (no indirection / random traversal, and if the first probe
misses, the second one was likely cached by the first one).

On 6/8/18 1:37, Levente Uzonyi wrote:
> On Fri, 8 Jun 2018, Clément Bera wrote:
>
>> Hi Max,
>> Theoretically, for a small number of elements, usually somewhere
>> between 3 and 30 depending on the implementation, a linear search is
>> faster than a hash search, especially in the Pharo dictionary hash
>> search implementation.
>>
>> Efficient dictionary implementations are usually bucket-based. The
>> dictionary holds a certain number of buckets, and based on the key
>> hash, the bucket where the key-value pair is present is determined.
>> Small buckets are linear (arrays or linked lists). Large buckets are
>> typically balanced binary trees (usually red-black trees). Under a
>> certain number of elements there is a single bucket, which means a
>> linear search is performed, as in SmallDictionary. When it grows, the
>> dictionary search becomes a combination of a hash search and a
>> linear or tree search.
>
> The bucket-based list you described is called separate chaining
> collision resolution, and it won't have any better performance in Squeak
> than the currently used open addressing.
> However, it is preferable when you can't trust your hash function.
> When you do "crazy" things, like using a tree as a bucket, you just
> exchange the possible O(n) worst-case time for O(log(n)).
>
>> Pharo dictionary search is first hash-based; then all the buckets are
>> represented next to each other in the same array and a linear search
>> is performed there, leading to many collisions and slower search
>> times (especially when the element is not found). Sometimes the code
>> searches over multiple buckets because the dictionary is too full or
>> there are too many near-collisions. The memory consumption is
>> competitive with the advanced implementations, though (precise
>> measurements would need to be made).
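[Editor's note: the bucket-based scheme described above can be sketched in Python. This is an illustrative sketch only, not Squeak or Pharo code; the class and method names are invented for the example.]

```python
# Sketch of separate-chaining collision resolution (illustrative only).
# Each slot holds a small "bucket" list that is searched linearly. A real
# implementation might switch large buckets to balanced trees, trading
# the O(n) worst case for O(log n), as the thread notes.
class ChainedDict:
    def __init__(self, n_buckets=8):
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket_for(self, key):
        # The hash selects a bucket; everything else is a linear scan.
        return self._buckets[hash(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket_for(key)
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value      # update existing entry in place
                return
        # The extra list cell per entry is the memory overhead of
        # separate chaining compared with open addressing.
        bucket.append([key, value])

    def at_if_absent(self, key, default=None):
        # Linear search confined to a single bucket.
        for k, v in self._bucket_for(key):
            if k == key:
                return v
        return default
```

With a decent hash function the buckets stay short, so each lookup scans only a handful of entries.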
> This is called open addressing, and if your hash function is good
> enough, the lists are short, so those "many collisions and slower search
> times" just don't happen. And no, separate chaining needs a lot more
> memory than open addressing, unless you manage to get rid of
> Associations, but that's a lot harder than it sounds.
>
>> Method dictionaries are represented differently to optimize the
>> look-up logic.
>
> MethodDictionary is optimized to let the VM do lookups more easily.
> Memory consumption is worse when the dictionary is sparse, but gets
> better as it fills up, compared to the current Dictionary
> implementation (assuming Pharo still uses the same representation as
> Squeak does).
>
>> If you want to improve things and have one dictionary implementation
>> instead of two, implement or look for a bucket-based dictionary and
>> put it in the base image instead of Dictionary. This is quite some
>> work, since there are many APIs to port. You can look at the Pinocchio
>> implementation; it's quite good, but they've not implemented large
>> buckets.
>
> Well, as an experiment, I'm sure it'll be fun. But don't expect better
> performance or better memory usage.
>
> Levente
>
>> On Fri, Jun 8, 2018 at 8:46 AM, Max Leske <[hidden email]> wrote:
>>
>> Hi,
>>
>> I was messing around with SmallDictionary when I suddenly
>> realised that I can't find a single reason to use it over a normal
>> Dictionary. While its name and class comment imply that it is somehow
>> an optimised Dictionary, I don't see any measurement where that
>> actually holds up. The following was run in a Pharo 7 image on a
>> recent VM (see below):
>>
>> | d |
>> d := SmallDictionary new.
>> d sizeInMemory. "24"
>> [100000 timesRepeat: [
>>     1 to: 100 do: [ :i | d at: i put: i ] ] ] timeToRun.
>> "0:00:00:05.226"
>>
>> [100000 timesRepeat: [
>>     d at: 48 ifAbsent: [] ] ] timeToRun. "0:00:00:00.041"
>>
>>
>> | d |
>> d := Dictionary new.
>> d sizeInMemory. "16"
>> [100000 timesRepeat: [
>>     1 to: 100 do: [ :i | d at: i put: i ] ] ] timeToRun.
>> "0:00:00:00.385"
>>
>> [100000 timesRepeat: [
>>     d at: 48 ifAbsent: [] ] ] timeToRun. "0:00:00:00.006"
>>
>> As you can see, SmallDictionary is 8 bytes larger per instance
>> and significantly slower while reading and writing (I know that this
>> isn't a good benchmark but it suffices to make my point).
>>
>> Is anyone aware of a reason for hanging on to SmallDictionary?
>> I'm also curious to know how SmallDictionary came to be. There must
>> have been some advantage over Dictionary at some point in the
>> past.
>>
>> Cheers,
>> Max
>>
>>
>> Image version: Pharo 7.0
>> Build information:
>> Pharo-7.0+alpha.build.961.sha.a69e72a97136bc3f93831584b6efa2b1703deb84
>> (32 Bit)
>>
>> VM version: CoInterpreter VMMaker.oscog-nice.2281 uuid:
>> 4beeaee7-567e-1a4b-b0fb-bd95ce302516 Nov 27 2017
>> StackToRegisterMappingCogit VMMaker.oscog-nice.2283 uuid:
>> 2d20324d-a2ab-48d6-b0f6-9fc3d66899da Nov 27 2017
>> VM: 201711262336
>> https://github.com/OpenSmalltalk/opensmalltalk-vm.git $ Date: Mon Nov
>> 27 00:36:29 2017 +0100 $ Plugins: 201711262336
>> https://github.com/OpenSmalltalk/opensmalltalk-vm.git $
>>
>> OS: macOS 10.13.5
>> Machine: MacBook Pro (13-inch, 2016, Four Thunderbolt 3 Ports)
>>
>> --
>> Clément Béra
>> https://clementbera.github.io/
>> https://clementbera.wordpress.com/
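[Editor's note: for reference, the design Max is questioning, parallel key/value arrays scanned front to back with no hashing, can be imitated in Python. A rough sketch only; SmallDictionary's real code is Smalltalk and differs in detail.]

```python
# Rough imitation of a SmallDictionary-style store: no hash at all, just
# two parallel lists scanned linearly. For a handful of entries the scan
# can beat hashing (no hash computation, contiguous memory); past a few
# dozen entries every operation degrades to O(n), which matches the
# benchmark numbers in the thread.
class LinearDict:
    def __init__(self):
        self._keys = []
        self._values = []

    def at_put(self, key, value):
        # O(n) scan to find an existing key before appending.
        for i, k in enumerate(self._keys):
            if k == key:
                self._values[i] = value
                return
        self._keys.append(key)
        self._values.append(value)

    def at_if_absent(self, key, default=None):
        # O(n) scan; cheap for tiny n, costly for 100 entries.
        for i, k in enumerate(self._keys):
            if k == key:
                return self._values[i]
        return default
```

This makes the benchmark result unsurprising: filling such a store with 100 keys performs roughly 100 scans of a growing list per round, while the hashed Dictionary does near-constant work per key.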