String >> #=

Philippe Marschall
Hi

I am currently looking at our dictionary lookup performance [1]. We
use GRSmallDictionary instead of Dictionary in a lot of places with
string keys (request and response headers, URL query fields, …).
Unfortunately, on Squeak/Pharo with 11 keys, Dictionary is twice as
fast. The reason seems to be that String >> #= is quite slow on
Squeak/Pharo, especially with prefix matches. Special-casing string
keys and comparing #size before sending #= seems to fix this
and makes GRSmallDictionary slightly faster than Dictionary again.
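
To make the idea concrete, here is a minimal sketch of the size check in a linear key search. The selector and argument names are hypothetical and this is not the actual GRSmallDictionary code; it assumes all keys are Strings and that strings of different sizes are never equal in the dialect at hand.

```smalltalk
indexOfStringKey: aString in: keyArray
	"Hypothetical helper, not the actual GRSmallDictionary code.
	Answer the index of aString in keyArray, or 0 if it is absent.
	Assumes every element of keyArray is a String and that strings
	of different sizes are never considered equal."
	| size |
	size := aString size.
	1 to: keyArray size do: [ :index |
		| candidate |
		candidate := keyArray at: index.
		"Cheap size check first; #= is only sent when the sizes match."
		(candidate size = size and: [ candidate = aString ])
			ifTrue: [ ^ index ] ].
	^ 0
```

With keys that share a long common prefix but differ in length, the size check rejects most candidates without comparing any characters.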

Before I take any action, I wanted to ask how String >> #= is
implemented in other dialects:
 * Do any dialects first compare #size before iterating over the characters?
 * Do any dialects do Unicode normalization in String >> #=?
 * Do any dialects implement String >> #= in a way such that two
strings with different sizes could be considered equal (e.g. because
they do normalization)?
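
As an illustration of the last question, the two strings below are canonically equivalent in Unicode but have different sizes. This is a workspace sketch that assumes Squeak/Pharo, where WideString and Character class >> #value: are available; other dialects may need different class names.

```smalltalk
| composed decomposed |
"U+00E9, the precomposed 'é': one character."
composed := WideString with: (Character value: 16r00E9).
"U+0065 followed by U+0301 (combining acute accent): two characters."
decomposed := WideString with: $e with: (Character value: 16r0301).
"The sizes differ (1 versus 2), so a size check alone rejects the pair."
composed size = decomposed size.
"Whether this answers true depends on whether #= normalizes."
composed = decomposed.
```

A size-first shortcut is only safe in a dialect where the last expression can never answer true for strings of different sizes.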

 [1] https://code.google.com/p/seaside/issues/detail?id=793

Cheers
Philippe
_______________________________________________
seaside-dev mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/seaside-dev
Re: String >> #=

John O'Keefe
Philippe -

In VA Smalltalk:
 * String>>#= is implemented as a primitive. The primitive compares #size before iterating over the characters.
 * No Unicode normalization is done.
 * No, this cannot happen: two strings with different sizes are never considered equal.

John

John O'Keefe, CTO/Principal Smalltalk Architect, Instantiations Inc.
Skype: john_okeefe2     Mobile:  +1 919 417-3181 (Business hours USA Eastern Time zone (GMT -5))
[hidden email]
http://www.instantiations.com
VA Smalltalk...Onward and Upward!


On Sun, Jun 1, 2014 at 11:55 AM, Philippe Marschall <[hidden email]> wrote:
[...]