[vwnc] URI hash and =

Holger Kleinsorgen-3
while working with a large Dictionary with URI keys, I stumbled on URI's
implementation of #hash and #=

= aURI
   ^self class = aURI class
     and: [self asString = aURI asString]

URI>>hash
  ^ self asString hash

Needless to say, both perform horribly, especially #hash. I've started to
implement #= and #hash as extensions in some subclasses (FileURL, HttpURL),
but IMHO a saner implementation should be in the base image.
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

Re: [vwnc] URI hash and =

giorgiof
Hi,
Witch version of VW are you running on?
A huge amount of work on hashing has been done in VW 7.6, so it may be better to wait for the 7.6 NC release, which should be published shortly. They have probably solved your problem.

ciao

Giorgio

On Wed, Apr 23, 2008 at 1:12 PM, Holger Kleinsorgen <[hidden email]> wrote:
while working with a large Dictionary with URI keys, I stumbled on URI's
implementation of #hash and #=

= aURI
  ^self class = aURI class
    and: [self asString = aURI asString]

URI>>hash
 ^ self asString hash

needless to say that both are horrible with regards to performance,
especially #hash. I've started to implement = and hash as extensions in
some subclasses (FileURL, HttpURL), but IMHO a more sane implementation
should be in the base image.

Re: [vwnc] URI hash and =

giorgiof
oops, obviously that was "which"... not a "witch".

On Wed, Apr 23, 2008 at 3:10 PM, giorgio ferraris <[hidden email]> wrote:
Hi,
Witch version of VW are you running on?
A huge works on Hashing in VW has been done on 7.6, so it's better for you to wait for the publication of the 7.6 NC version that should happen on short time. They probably solved you problem.

ciao

Giorgio



Re: [vwnc] URI hash and =

Alan Knight-2
In reply to this post by giorgiof
No, that didn't change in 7.6.


--
Alan Knight [|], Engineering Manager, Cincom Smalltalk


Re: [vwnc] URI hash and =

Andres Valloud-3
In reply to this post by Holger Kleinsorgen-3
Holger,

Would you mind sending me a dataset of reasonable size so I can look at
this?

Thanks,
Andres.


Re: [vwnc] URI hash and =

Andres Valloud-6
Holger,

I took a quick look and I have a feeling that you are running into
asString being very expensive, particularly because the resulting string
will be hashed and then thrown away.

I did a small experiment with FileURL, and it's not hard to get a 3x
performance boost by not creating the string at all and just hashing the
stuff the string is manufactured from.

Other URLs see more or less the same speedup.  An improved
PartialURL hash method runs almost 3x faster.  And, after fixing a
number of bugs in URLwithAuthority, the speedup factor there was about
4.7x.

Is this what you had in mind?

Andres.
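
A minimal sketch of the approach Andres describes, hashing the parts a URL
is built from instead of the string returned by asString. This is not his
actual change: SketchURL and its instance variables scheme, host, path and
fragment are hypothetical stand-ins, since the thread does not show the
internals of the VisualWorks URL classes.

```smalltalk
"Hypothetical class for illustration only; SketchURL and its instance
 variables (scheme, host, path, fragment) are assumptions, not the
 actual layout of FileURL, PartialURL or URLwithAuthority."

SketchURL>>hash
    "Combine the hashes of the parts directly. No intermediate string
     is allocated, hashed and thrown away."
    ^((scheme hash bitXor: host hash)
        bitXor: path hash)
            bitXor: fragment hash
```

bitXor: is the simplest way to mix the part hashes, but it ignores the order
of the parts, so a production version would probably want a stronger mix.
Whatever #hash uses, #= has to compare the same parts so that equal URLs
still hash alike.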


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On
Behalf Of Andres Valloud
Sent: Wednesday, April 23, 2008 9:29 AM
To: [hidden email]
Subject: Re: [vwnc] URI hash and =

Holger,

Would you mind sending me a dataset of reasonable size so I can look at
this?

Thanks,
Andres.


Re: [vwnc] URI hash and =

Holger Kleinsorgen-3
Valloud, Andres wrote:
> Holger,
>
> I took a quick look and I have a feeling that you are running into
> asString being very expensive, particularly because the resulting string
> will be hashed and then thrown away.
>  
Given the new string hash in 7.6, the results should be less devastating (I
encountered the problem in 7.4.1).

The problem was also aggravated by the fact that many of the URIs differed
only in the fragment part, e.g.

http://www.my-hostname-is-longer-than-yours.com/insane-ontology#topic
http://www.my-hostname-is-longer-than-yours.com/insane-ontology#another_topic

and so on.

the string conversion overhead is also visible. I have to access the
dictionary quite frequently, so implementing hash / = without asString
should still benefit from the performance improvements you've noticed.

> I did a small experiment with FileURL, and it's not hard to get a 3x
> performance boost by not creating the string at all and just hashing the
> stuff the string is manufactured from.
>
> Other URLs see more or less the same speedup boost.  An improved
> PartialURL hash method runs almost 3x faster.  And, after fixing a
> number of bugs in URLwithAuthority, the speedup factor there was about
> 4.7x.
>
>  
coolness
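
A sketch of a #= that avoids asString, in the spirit of what Holger asks
for. For keys such as the two example URIs above, which share everything
except the fragment, it checks the fragment first so that most unequal pairs
are rejected before the long common parts are compared. SketchURL and its
accessors scheme, host, path and fragment are hypothetical; the real
VisualWorks classes may store these parts differently.

```smalltalk
"Hypothetical class for illustration; the accessors fragment, path,
 host and scheme are assumed to exist on SketchURL."

SketchURL>>= anObject
    "Compare part by part, starting with the part most likely to differ.
     and: takes a block, so the remaining comparisons are skipped as
     soon as one part differs."
    ^self class == anObject class
        and: [fragment = anObject fragment
        and: [path = anObject path
        and: [host = anObject host
        and: [scheme = anObject scheme]]]]
```

A matching #hash has to be derived from the same four parts; otherwise two
URLs that are #= could end up in different Dictionary buckets.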

Re: [vwnc] URI hash and =

Andres Valloud-3
Ok, I will look into the = methods too then.

Andres.
