The Trunk: Kernel-nice.1308.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

The Trunk: Kernel-nice.1308.mcz

commits-2
Nicolas Cellier uploaded a new version of Kernel to project The Trunk:
http://source.squeak.org/trunk/Kernel-nice.1308.mcz

==================== Summary ====================

Name: Kernel-nice.1308
Author: nice
Time: 4 March 2020, 12:36:54.10186 am
UUID: 39004bb3-ce72-49c4-af24-6c3f079210ba
Ancestors: Kernel-nice.1307

Nuke obsolete comments at end of LargePositiveInteger>>squared* methods

Connect the highBit primitive provided by new VM.
Since the primitive is jitted, it's more than 3x faster than highBitOfPositiveReceiver.
This restores changes of Kernel-nice.1293

=============== Diff against Kernel-nice.1307 ===============

Item was changed:
  ----- Method: LargePositiveInteger>>squaredByFourth (in category 'private') -----
  squaredByFourth
  "Use a 4-way Toom-Cook divide and conquer algorithm to perform the multiplication.
  See Asymmetric Squaring Formulae Jaewook Chung and M. Anwar Hasan
  https://www.lirmm.fr/arith18/papers/Chung-Squaring.pdf"
 
  | p a0 a1 a2 a3 a02 a13 s0 s1 s2 s3 s4 s5 s6 t2 t3 |
  "divide in 4 parts, rounded to upper multiple of 4"
  p := (self digitLength + 15 bitShift: -4) bitShift: 2.
  a3 := self butLowestNDigits: p * 3.
  a2 := self copyDigitsFrom: p * 2 + 1 to: p * 3.
  a1 := self copyDigitsFrom: p + 1 to: p * 2.
  a0 := self lowestNDigits: p.
 
  "Toom-4 trick: 7 multiplications instead of 16"
  a02 := a0 - a2.
  a13 := a1 - a3.
  s0 := a0 squared.
  s1 := (a0 * a1) bitShift: 1.
  s2 := (a02 + a13) * (a02 - a13).
  s3 := ((a0 + a1) + (a2 + a3)) squared.
  s4 := (a02 * a13) bitShift: 1.
  s5 := (a3 * a2) bitShift: 1.
  s6 := a3 squared.
 
  "Interpolation"
  t2 := s1 + s5.
  t3 := (s2 + s3 + s4 bitShift: -1) - t2.
  s3 := t2 - s4.
  s4 := t3 - s0.
  s2 := t3 - s2 - s6.
 
  "Sum the parts of decomposition"
  ^s0 + (s1 bitShift: 8*p) + (s2 + (s3 bitShift: 8*p) bitShift: 16*p)
+ +(s4 + (s5 bitShift: 8*p) + (s6 bitShift: 16*p) bitShift: 32*p)!
- +(s4 + (s5 bitShift: 8*p) + (s6 bitShift: 16*p) bitShift: 32*p)
-
- "
- | a |
- a := 770 factorial-1.
- a digitLength.
- [a * a - a squaredToom4 = 0] assert.
- [Smalltalk garbageCollect.
- [1000 timesRepeat: [a squaredToom4]] timeToRun] value /
- [Smalltalk garbageCollect.
- [1000 timesRepeat: [a squaredKaratsuba]] timeToRun] value asFloat
- "!

Item was changed:
  ----- Method: LargePositiveInteger>>squaredByHalf (in category 'private') -----
  squaredByHalf
  "Use a divide and conquer algorithm to perform the multiplication.
  Split in two parts like Karatsuba, but economize 2 additions by using asymetrical product."
 
  | half xHigh xLow low high mid |
 
  "Divide digits in two halves rounded tp upper multiple of 4"
  half := (self digitLength + 1 bitShift: -3) bitShift: 2.
  xLow := self lowestNDigits: half.
  xHigh := self butLowestNDigits: half.
 
  "eventually use karatsuba"
  low := xLow squared.
  high := xHigh squared.
  mid := xLow multiplyByInteger: xHigh.
 
  "Sum the parts of decomposition"
  ^(high bitShift: 16*half)
  inplaceAddNonOverlapping: low digitShiftBy: 0;
+ + (mid bitShift: 8*half+1)!
- + (mid bitShift: 8*half+1)
-
- "
- | a |
- a := 440 factorial-1.
- a digitLength.
- self assert: a * a - a squaredKaratsuba = 0.
- [Smalltalk garbageCollect.
- [2000 timesRepeat: [a squaredKaratsuba]] timeToRun] value /
- [Smalltalk garbageCollect.
- [2000 timesRepeat: [a * a]] timeToRun] value asFloat
- "!

Item was changed:
  ----- Method: LargePositiveInteger>>squaredByThird (in category 'private') -----
  squaredByThird
  "Use a 3-way Toom-Cook divide and conquer algorithm to perform the multiplication"
 
  | third x0 x1 x2 x20 z0 z1 z2 z3 z4 |
  "divide in 3 parts, rounded to upper multiple of 4"
  third := self digitLength + 11 // 3 bitShift: 2.
  x2 := self butLowestNDigits: third * 2.
  x1 := self copyDigitsFrom: third + 1 to: third * 2.
  x0 := self lowestNDigits: third.
 
  "Toom-3 trick: 5 multiplications instead of 9"
  z0 := x0 squared.
  z4 := x2 squared.
  x20 := x2 + x0.
  z1 := (x20 + x1) squared.
  x20 := x20 - x1.
  z2 := x20 squared.
  z3 := ((x20 + x2 bitShift: 1) - x0) squared.
 
  "Sum the parts of decomposition"
  z3 := z3 - z1 quo: 3.
  z1 := z1 - z2 bitShift: -1.
  z2 := z2 - z0.
 
  z3 := (z2 - z3 bitShift: -1) + (z4 bitShift: 1).
  z2 := z2 + z1 - z4.
  z1 := z1 - z3.
+ ^z0 + (z1 bitShift: 8*third) + (z2 bitShift: 16*third) + (z3 + (z4 bitShift: 8*third) bitShift: 24*third)!
- ^z0 + (z1 bitShift: 8*third) + (z2 bitShift: 16*third) + (z3 + (z4 bitShift: 8*third) bitShift: 24*third)
-
- "
- | a |
- a := 1400 factorial-1.
- a digitLength.
- self assert: a * a - a squaredToom3 = 0.
- [Smalltalk garbageCollect.
- [1000 timesRepeat: [a squaredToom3]] timeToRun] value /
- [Smalltalk garbageCollect.
- [1000 timesRepeat: [a squaredKaratsuba]] timeToRun] value asFloat
- "!

Item was changed:
  ----- Method: SmallInteger>>highBit (in category 'bit manipulation') -----
  highBit
  "Answer the index of the high order bit of the receiver, or zero if the  
  receiver is zero. Raise an error if the receiver is negative, since  
  negative integers are defined to have an infinite number of leading 1's
  in 2's-complement arithmetic. Use >>highBitOfMagnitude if you want to
  get the highest bit of the magnitude."
+ <primitive: 575>
  self < 0 ifTrue: [^ self error: 'highBit is not defined for negative integers'].
  ^ self highBitOfPositiveReceiver!

Item was changed:
  ----- Method: SmallInteger>>highBitOfMagnitude (in category 'bit manipulation') -----
  highBitOfMagnitude
  "Answer the index of the high order bit of the receiver, or zero if the  
  receiver is zero. This method is used for negative SmallIntegers as well,  
  since Squeak's LargeIntegers are sign/magnitude."
 
+ <primitive: 575>
+ self < 0 ifTrue: [^self negated highBit].
- self < 0 ifTrue: [
- "Beware: do not use highBitOfPositiveReceiver
- because self negated is not necessarily a SmallInteger
- (see SmallInteger minVal)"
- ^self negated highBitOfMagnitude].
-
- "Implementation note: this method could be as well inlined here."
  ^self highBitOfPositiveReceiver!