The Trunk: Kernel-dtl.1096.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

The Trunk: Kernel-dtl.1096.mcz

commits-2
David T. Lewis uploaded a new version of Kernel to project The Trunk:
http://source.squeak.org/trunk/Kernel-dtl.1096.mcz

==================== Summary ====================

Name: Kernel-dtl.1096
Author: dtl
Time: 18 April 2017, 11:59:08.257358 am
UUID: 877ab97f-9ebe-4405-af31-fe9caab25eb6
Ancestors: Kernel-eem.1095

SmallInteger>>hashMultiply comment provided by Andres Valloud

=============== Diff against Kernel-eem.1095 ===============

Item was changed:
  ----- Method: SmallInteger>>hashMultiply (in category 'bit manipulation') -----
  hashMultiply
+ "Multiply by 1664525 mod 2^28 without overflowing into large integers."
  | low |
-
  low := self bitAnd: 16383.
  ^(16r260D * low + ((16r260D * (self bitShift: -14) + (16r0065 * low) bitAnd: 16383) * 16384))
  bitAnd: 16r0FFFFFFF!


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: The Trunk: Kernel-dtl.1096.mcz

Chris Muller-3
Okay, but it's a given that hash functions are desired to be fast, so
one can always assume optimized code in them.  I was hoping for a
comment that would refer to the origin of this hash function, so one
could research and possibly understand it...


On Tue, Apr 18, 2017 at 10:59 AM,  <[hidden email]> wrote:

> David T. Lewis uploaded a new version of Kernel to project The Trunk:
> http://source.squeak.org/trunk/Kernel-dtl.1096.mcz
>
> ==================== Summary ====================
>
> Name: Kernel-dtl.1096
> Author: dtl
> Time: 18 April 2017, 11:59:08.257358 am
> UUID: 877ab97f-9ebe-4405-af31-fe9caab25eb6
> Ancestors: Kernel-eem.1095
>
> SmallInteger>>hashMultiply comment provided by Andres Valloud
>
> =============== Diff against Kernel-eem.1095 ===============
>
> Item was changed:
>   ----- Method: SmallInteger>>hashMultiply (in category 'bit manipulation') -----
>   hashMultiply
> +       "Multiply by 1664525 mod 2^28 without overflowing into large integers."
>         | low |
> -
>         low := self bitAnd: 16383.
>         ^(16r260D * low + ((16r260D * (self bitShift: -14) + (16r0065 * low) bitAnd: 16383) * 16384))
>                         bitAnd: 16r0FFFFFFF!
>
>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: The Trunk: Kernel-dtl.1096.mcz

Levente Uzonyi
Hi Chris,

The recent mails of Andres Valloud answer your questions[1][2].

Now that I took a deeper look at the function, I found the following
issues:
1. It'll overflow into LargeIntegers during computation for some
SmallInteger input. That can and should be fixed by adding a bitAnd:
16r3FFF.
2. It will accept negative integers, but I'm not sure if it will work the
same way the primitive does.

Levente

[1] http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024826.html
[2] http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024831.html

On Thu, 20 Apr 2017, Chris Muller wrote:

> Okay, but it's a given that hash functions are desired to be fast, so
> one can always assume optimized code in them.  I was hoping for a
> comment that would refer to the origin of this hash function, so one
> could research and possibly understand it...
>
>
> On Tue, Apr 18, 2017 at 10:59 AM,  <[hidden email]> wrote:
>> David T. Lewis uploaded a new version of Kernel to project The Trunk:
>> http://source.squeak.org/trunk/Kernel-dtl.1096.mcz
>>
>> ==================== Summary ====================
>>
>> Name: Kernel-dtl.1096
>> Author: dtl
>> Time: 18 April 2017, 11:59:08.257358 am
>> UUID: 877ab97f-9ebe-4405-af31-fe9caab25eb6
>> Ancestors: Kernel-eem.1095
>>
>> SmallInteger>>hashMultiply comment provided by Andres Valloud
>>
>> =============== Diff against Kernel-eem.1095 ===============
>>
>> Item was changed:
>>   ----- Method: SmallInteger>>hashMultiply (in category 'bit manipulation') -----
>>   hashMultiply
>> +       "Multiply by 1664525 mod 2^28 without overflowing into large integers."
>>         | low |
>> -
>>         low := self bitAnd: 16383.
>>         ^(16r260D * low + ((16r260D * (self bitShift: -14) + (16r0065 * low) bitAnd: 16383) * 16384))
>>                         bitAnd: 16r0FFFFFFF!
>>
>>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: The Trunk: Kernel-dtl.1096.mcz

Chris Muller-4
That's not the point.  I'm saying if you're going to commit a new
version of Kernel solely to improve a comment, then make it something
which can lead someone reading it, way in the future, to have a
reference link to more information -- like Andres Valloud's name,
and/or book title.

On Thu, Apr 20, 2017 at 4:08 PM, Levente Uzonyi <[hidden email]> wrote:

> Hi Chris,
>
> The recent mails of Andres Valloud answer your questions[1][2].
>
> Now that I took a deeper look at the function, I found the following issues:
> 1. It'll overflow into LargeIntegers during computation for some
> SmallInteger input. That can and should be fixed by adding a bitAnd:
> 16r3FFF.
> 2. It will accept negative integers, but I'm not sure if it will work the
> same way the primitive does.
>
> Levente
>
> [1]
> http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024826.html
> [2]
> http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024831.html
>
> On Thu, 20 Apr 2017, Chris Muller wrote:
>
>> Okay, but it's a given that hash functions are desired to be fast, so
>> one can always assume optimized code in them.  I was hoping for a
>> comment that would refer to the origin of this hash function, so one
>> could research and possibly understand it...
>>
>>
>> On Tue, Apr 18, 2017 at 10:59 AM,  <[hidden email]> wrote:
>>>
>>> David T. Lewis uploaded a new version of Kernel to project The Trunk:
>>> http://source.squeak.org/trunk/Kernel-dtl.1096.mcz
>>>
>>> ==================== Summary ====================
>>>
>>> Name: Kernel-dtl.1096
>>> Author: dtl
>>> Time: 18 April 2017, 11:59:08.257358 am
>>> UUID: 877ab97f-9ebe-4405-af31-fe9caab25eb6
>>> Ancestors: Kernel-eem.1095
>>>
>>> SmallInteger>>hashMultiply comment provided by Andres Valloud
>>>
>>> =============== Diff against Kernel-eem.1095 ===============
>>>
>>> Item was changed:
>>>   ----- Method: SmallInteger>>hashMultiply (in category 'bit
>>> manipulation') -----
>>>   hashMultiply
>>> +       "Multiply by 1664525 mod 2^28 without overflowing into large
>>> integers."
>>>         | low |
>>> -
>>>         low := self bitAnd: 16383.
>>>         ^(16r260D * low + ((16r260D * (self bitShift: -14) + (16r0065 *
>>> low) bitAnd: 16383) * 16384))
>>>                         bitAnd: 16r0FFFFFFF!
>>>
>>>
>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: The Trunk: Kernel-dtl.1096.mcz

David T. Lewis
On Thu, Apr 20, 2017 at 05:14:50PM -0500, Chris Muller wrote:
> That's not the point.  I'm saying if you're going to commit a new
> version of Kernel solely to improve a comment, then make it something
> which can lead someone reading it, way in the future, to have a
> reference link to more information -- like Andres Valloud's name,
> and/or book title.

You're right, the commit notice should have had a reference to the squeak-dev
discussion for the benefit of future readers. Sorry!

It may seem odd to commit an entire Kernel version just for the sake of a
one line comment, but in this case it was a crucial comment that had been
accidentally omitted for the last 17 years, so I though it was worth the
noise. But I should have documented that in the commit notice.

Dave




>
> On Thu, Apr 20, 2017 at 4:08 PM, Levente Uzonyi <[hidden email]> wrote:
> > Hi Chris,
> >
> > The recent mails of Andres Valloud answer your questions[1][2].
> >
> > Now that I took a deeper look at the function, I found the following issues:
> > 1. It'll overflow into LargeIntegers during computation for some
> > SmallInteger input. That can and should be fixed by adding a bitAnd:
> > 16r3FFF.
> > 2. It will accept negative integers, but I'm not sure if it will work the
> > same way the primitive does.
> >
> > Levente
> >
> > [1]
> > http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024826.html
> > [2]
> > http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024831.html
> >
> > On Thu, 20 Apr 2017, Chris Muller wrote:
> >
> >> Okay, but it's a given that hash functions are desired to be fast, so
> >> one can always assume optimized code in them.  I was hoping for a
> >> comment that would refer to the origin of this hash function, so one
> >> could research and possibly understand it...
> >>
> >>
> >> On Tue, Apr 18, 2017 at 10:59 AM,  <[hidden email]> wrote:
> >>>
> >>> David T. Lewis uploaded a new version of Kernel to project The Trunk:
> >>> http://source.squeak.org/trunk/Kernel-dtl.1096.mcz
> >>>
> >>> ==================== Summary ====================
> >>>
> >>> Name: Kernel-dtl.1096
> >>> Author: dtl
> >>> Time: 18 April 2017, 11:59:08.257358 am
> >>> UUID: 877ab97f-9ebe-4405-af31-fe9caab25eb6
> >>> Ancestors: Kernel-eem.1095
> >>>
> >>> SmallInteger>>hashMultiply comment provided by Andres Valloud
> >>>
> >>> =============== Diff against Kernel-eem.1095 ===============
> >>>
> >>> Item was changed:
> >>>   ----- Method: SmallInteger>>hashMultiply (in category 'bit
> >>> manipulation') -----
> >>>   hashMultiply
> >>> +       "Multiply by 1664525 mod 2^28 without overflowing into large
> >>> integers."
> >>>         | low |
> >>> -
> >>>         low := self bitAnd: 16383.
> >>>         ^(16r260D * low + ((16r260D * (self bitShift: -14) + (16r0065 *
> >>> low) bitAnd: 16383) * 16384))
> >>>                         bitAnd: 16r0FFFFFFF!
> >>>
> >>>
> >
>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: The Trunk: Kernel-dtl.1096.mcz

Eliot Miranda-2
In reply to this post by Levente Uzonyi
Hi Levente,

On Thu, Apr 20, 2017 at 2:08 PM, Levente Uzonyi <[hidden email]> wrote:
Hi Chris,

The recent mails of Andres Valloud answer your questions[1][2].

I still think we could and should make the comment more informative.  Andrew' responses are short on references.  I understand he has a lot of experience in hashing; all the more reason he should be able to provide a good reference.
 

Now that I took a deeper look at the function, I found the following issues:
1. It'll overflow into LargeIntegers during computation for some SmallInteger input. That can and should be fixed by adding a bitAnd: 16r3FFF.

I don't see that for SmallInteger maxVal.  Try the following:
Debug:
| v low |
v := SmallInteger maxVal.
low := v bitAnd: 16383.
^ 9741 * low + ((9741
* (v bitShift: -14) + (101 * low) bitAnd: 16383)
* 16384) bitAnd: 268435455 

and then do run until... ThisContext top class ~= SmallInteger

When I try this it runs to the end.

2. It will accept negative integers, but I'm not sure if it will work the same way the primitive does.

It should.  The last bitAnd: should produce the same results.  I tested this and IIRC it worked.  Again easy to test.  Create a non-primitive hashMultiply and then test for all the negative SmallIntegers, something quite possible in 32-bits :-)
 

Levente

[1] http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024826.html
[2] http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024831.html

On Thu, 20 Apr 2017, Chris Muller wrote:

Okay, but it's a given that hash functions are desired to be fast, so
one can always assume optimized code in them.  I was hoping for a
comment that would refer to the origin of this hash function, so one
could research and possibly understand it...


On Tue, Apr 18, 2017 at 10:59 AM,  <[hidden email]> wrote:
David T. Lewis uploaded a new version of Kernel to project The Trunk:
http://source.squeak.org/trunk/Kernel-dtl.1096.mcz

==================== Summary ====================

Name: Kernel-dtl.1096
Author: dtl
Time: 18 April 2017, 11:59:08.257358 am
UUID: 877ab97f-9ebe-4405-af31-fe9caab25eb6
Ancestors: Kernel-eem.1095

SmallInteger>>hashMultiply comment provided by Andres Valloud

=============== Diff against Kernel-eem.1095 ===============

Item was changed:
  ----- Method: SmallInteger>>hashMultiply (in category 'bit manipulation') -----
  hashMultiply
+       "Multiply by 1664525 mod 2^28 without overflowing into large integers."
        | low |
-
        low := self bitAnd: 16383.
        ^(16r260D * low + ((16r260D * (self bitShift: -14) + (16r0065 * low) bitAnd: 16383) * 16384))
                        bitAnd: 16r0FFFFFFF!






--
_,,,^..^,,,_
best, Eliot


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: The Trunk: Kernel-dtl.1096.mcz

Chris Muller-3
In reply to this post by David T. Lewis
Heh, no need to apologize, I was just taking the opportunity to
express the idea of... sculpting The Class Library (capitalized to
garner grand respect) with deliberate consideration for the full
audience of present and future consumers.

Thanks.

On Thu, Apr 20, 2017 at 7:10 PM, David T. Lewis <[hidden email]> wrote:

> On Thu, Apr 20, 2017 at 05:14:50PM -0500, Chris Muller wrote:
>> That's not the point.  I'm saying if you're going to commit a new
>> version of Kernel solely to improve a comment, then make it something
>> which can lead someone reading it, way in the future, to have a
>> reference link to more information -- like Andres Valloud's name,
>> and/or book title.
>
> You're right, the commit notice should have had a reference to the squeak-dev
> discussion for the benefit of future readers. Sorry!
>
> It may seem odd to commit an entire Kernel version just for the sake of a
> one line comment, but in this case it was a crucial comment that had been
> accidentally omitted for the last 17 years, so I though it was worth the
> noise. But I should have documented that in the commit notice.
>
> Dave
>
>
>
>
>>
>> On Thu, Apr 20, 2017 at 4:08 PM, Levente Uzonyi <[hidden email]> wrote:
>> > Hi Chris,
>> >
>> > The recent mails of Andres Valloud answer your questions[1][2].
>> >
>> > Now that I took a deeper look at the function, I found the following issues:
>> > 1. It'll overflow into LargeIntegers during computation for some
>> > SmallInteger input. That can and should be fixed by adding a bitAnd:
>> > 16r3FFF.
>> > 2. It will accept negative integers, but I'm not sure if it will work the
>> > same way the primitive does.
>> >
>> > Levente
>> >
>> > [1]
>> > http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024826.html
>> > [2]
>> > http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024831.html
>> >
>> > On Thu, 20 Apr 2017, Chris Muller wrote:
>> >
>> >> Okay, but it's a given that hash functions are desired to be fast, so
>> >> one can always assume optimized code in them.  I was hoping for a
>> >> comment that would refer to the origin of this hash function, so one
>> >> could research and possibly understand it...
>> >>
>> >>
>> >> On Tue, Apr 18, 2017 at 10:59 AM,  <[hidden email]> wrote:
>> >>>
>> >>> David T. Lewis uploaded a new version of Kernel to project The Trunk:
>> >>> http://source.squeak.org/trunk/Kernel-dtl.1096.mcz
>> >>>
>> >>> ==================== Summary ====================
>> >>>
>> >>> Name: Kernel-dtl.1096
>> >>> Author: dtl
>> >>> Time: 18 April 2017, 11:59:08.257358 am
>> >>> UUID: 877ab97f-9ebe-4405-af31-fe9caab25eb6
>> >>> Ancestors: Kernel-eem.1095
>> >>>
>> >>> SmallInteger>>hashMultiply comment provided by Andres Valloud
>> >>>
>> >>> =============== Diff against Kernel-eem.1095 ===============
>> >>>
>> >>> Item was changed:
>> >>>   ----- Method: SmallInteger>>hashMultiply (in category 'bit
>> >>> manipulation') -----
>> >>>   hashMultiply
>> >>> +       "Multiply by 1664525 mod 2^28 without overflowing into large
>> >>> integers."
>> >>>         | low |
>> >>> -
>> >>>         low := self bitAnd: 16383.
>> >>>         ^(16r260D * low + ((16r260D * (self bitShift: -14) + (16r0065 *
>> >>> low) bitAnd: 16383) * 16384))
>> >>>                         bitAnd: 16r0FFFFFFF!
>> >>>
>> >>>
>> >
>>
>

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: The Trunk: Kernel-dtl.1096.mcz

Levente Uzonyi
In reply to this post by Eliot Miranda-2
Hi Eliot,

On Thu, 20 Apr 2017, Eliot Miranda wrote:

> Hi Levente,
> On Thu, Apr 20, 2017 at 2:08 PM, Levente Uzonyi <[hidden email]> wrote:
>       Hi Chris,
>
>       The recent mails of Andres Valloud answer your questions[1][2].
>
>
> I still think we could and should make the comment more informative.  Andrew' responses are short on references.  I understand he has a lot of experience in hashing; all the more reason he should be able to
> provide a good reference.
>  
>
>       Now that I took a deeper look at the function, I found the following issues:
>       1. It'll overflow into LargeIntegers during computation for some SmallInteger input. That can and should be fixed by adding a bitAnd: 16r3FFF.
>
>
> I don't see that for SmallInteger maxVal.  Try the following:
> Debug:
> | v low |
> v := SmallInteger maxVal.
> low := v bitAnd: 16383.
> ^ 9741 * low + ((9741
> * (v bitShift: -14) + (101 * low) bitAnd: 16383)
> * 16384) bitAnd: 268435455 
>
> and then do run until... ThisContext top class ~= SmallInteger
>
> When I try this it runs to the end.
Right. My calculations were one bit off. The smallest input where the
internal computation leaves the 30-bits range is 16r6B7B3FB7, which has
31-bits.
It also works for 60-bits SmallIntegers for the same reason, the
multiplier constant's low 14-bits part is small enough.

>
>       2. It will accept negative integers, but I'm not sure if it will work the same way the primitive does.
>
>
> It should.  The last bitAnd: should produce the same results.  I tested this and IIRC it worked.  Again easy to test.  Create a non-primitive hashMultiply and then test for all the negative SmallIntegers,
> something quite possible in 32-bits :-)

Right, but I'd rather not try the same in 64-bits. :)

Levente

>  
>
>       Levente
>
>       [1] http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024826.html
>       [2] http://lists.squeakfoundation.org/pipermail/vm-dev/2017-April/024831.html
>
>       On Thu, 20 Apr 2017, Chris Muller wrote:
>
>             Okay, but it's a given that hash functions are desired to be fast, so
>             one can always assume optimized code in them.  I was hoping for a
>             comment that would refer to the origin of this hash function, so one
>             could research and possibly understand it...
>
>
>             On Tue, Apr 18, 2017 at 10:59 AM,  <[hidden email]> wrote:
>                   David T. Lewis uploaded a new version of Kernel to project The Trunk:
>                   http://source.squeak.org/trunk/Kernel-dtl.1096.mcz
>
>                   ==================== Summary ====================
>
>                   Name: Kernel-dtl.1096
>                   Author: dtl
>                   Time: 18 April 2017, 11:59:08.257358 am
>                   UUID: 877ab97f-9ebe-4405-af31-fe9caab25eb6
>                   Ancestors: Kernel-eem.1095
>
>                   SmallInteger>>hashMultiply comment provided by Andres Valloud
>
>                   =============== Diff against Kernel-eem.1095 ===============
>
>                   Item was changed:
>                     ----- Method: SmallInteger>>hashMultiply (in category 'bit manipulation') -----
>                     hashMultiply
>                   +       "Multiply by 1664525 mod 2^28 without overflowing into large integers."
>                           | low |
>                   -
>                           low := self bitAnd: 16383.
>                           ^(16r260D * low + ((16r260D * (self bitShift: -14) + (16r0065 * low) bitAnd: 16383) * 16384))
>                                           bitAnd: 16r0FFFFFFF!
>
>
>
>
>
>
> --
> _,,,^..^,,,_
> best, Eliot
>
>

Loading...