On 06/08/2018 02:02 AM, Marcus Denker wrote:
>
>
>> On 8 Jun 2018, at 08:46, Max Leske <[hidden email]> wrote:
>> Is anyone aware of a reason for hanging on to SmallDictionary? I'm also curious to know how SmallDictionary came to be. There must have been some advantage over Dictionary at some point in the past.
>>
> It came from RB… in the Refactoring engine there are a lot of very small dictionaries with <10 elements.
> The idea was that for such small dictionaries, the overhead of hashing is higher than that of a simple linear search.
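Right -- for dictionaries that small, a lookup is just a linear scan, so
there is no hashing cost at all. Very roughly (a sketch, not the actual
SmallDictionary source; it assumes parallel keys/values arrays and a
size slot):

    at: aKey ifAbsent: aBlock
        "Scan the keys linearly; for fewer than ~10 entries this tends
         to beat computing a hash and probing buckets."
        1 to: size do: [ :i |
            (keys at: i) = aKey ifTrue: [ ^ values at: i ] ].
        ^ aBlock value
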
I created its ancestor in VW some 20+ years ago (as RBSmallDictionary).
It was used when doing pattern matching. When the matcher performs a
pattern match against an AST, it puts the potential value of a pattern
variable in the dictionary. If that variable is used later in the
pattern, we can look up the previous value and make sure that we have an
equivalent AST. This allows you to write patterns like:
`@a = `@a
to find where someone has the same expression on both sides of the #=
message. Since most patterns have very few pattern variables, these
dictionaries won't hold many entries. Furthermore, we are likely to
abort the match when we have 0 entries.
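For example, in today's Pharo you can try that pattern with the
refactoring engine's searcher; something along these lines (an untested
sketch, selector names may differ slightly between versions):

    | tree matches |
    tree := RBParser parseMethod: 'example ^ self foo = self foo'.
    matches := OrderedCollection new.
    RBParseTreeSearcher new
        matches: '`@a = `@a' do: [ :node :answer | matches add: node ];
        executeTree: tree.
    matches "now holds the (self foo = self foo) message node"

During that match, the binding of `@a to the (self foo) node is exactly
what ends up in the small dictionary.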
The original RBSmallDictionary had an #empty method that "emptied" the
dictionary without actually removing any elements -- it just set the
size to 0. In a general dictionary this would lead to memory leaks since
the previous values are still held by the dictionary. However, these
dictionaries were only used during the matching process and went away
after the process completed.
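From memory, #empty was essentially just this (assuming a size instance
variable, as in the sketch above):

    empty
        "Logically clear the dictionary by resetting its size; the old
         keys and values stay in their slots until overwritten, which is
         fine because the dictionary is thrown away right after matching
         anyway."
        size := 0
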
Anyway, when we converted our pattern matching code from the VW parser
with our pattern matching extensions to the new RB parser with pattern
matching, the time to run SmallLint on the image was cut in half, even
though our parser was quite a bit slower than the VW parser. I don't
remember everything that was done, but I think that most of the speedup
came from having special pattern AST nodes and the small dictionary.
John Brant