The Trunk: Kernel-tfel.1046.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

The Trunk: Kernel-tfel.1046.mcz

commits-2
Tim Felgentreff uploaded a new version of Kernel to project The Trunk:
http://source.squeak.org/trunk/Kernel-tfel.1046.mcz

==================== Summary ====================

Name: Kernel-tfel.1046
Author: tfel
Time: 19 October 2016, 5:34:18.991659 pm
UUID: 12ebf345-b919-2845-8147-56a8b2f4cae4
Ancestors: Kernel-eem.1045

Improve performance of large integer fallback code for multiplication and division with small integers, by avoiding recalculating SmallInteger>>digitLength a lot of times. (This yields a 3x speedup in some integer heavy shootout benchmarks like pidigits for RSqueak, where we don't have a LargeIntegers plugin)

=============== Diff against Kernel-eem.1045 ===============

Item was changed:
  ----- Method: Integer>>digitDiv:neg: (in category 'private') -----
  digitDiv: arg neg: ng
  "Answer with an array of (quotient, remainder)."
+ | quo rem ql d div dh dnh dl qhi qlo j l hi lo r3 a t divDigitLength remDigitLength |
- | quo rem ql d div dh dnh dl qhi qlo j l hi lo r3 a t |
  <primitive: 'primDigitDivNegative' module:'LargeIntegers'>
  arg = 0 ifTrue: [^ (ZeroDivide dividend: self) signal].
  "TFEI added this line"
  l := self digitLength - arg digitLength + 1.
  l <= 0 ifTrue: [^ Array with: 0 with: self].
  "shortcut against #highBit"
  d := 8 - arg lastDigit highBitOfByte.
  div := arg digitLshift: d.
+ divDigitLength := div digitLength + 1.
+ div := div growto: divDigitLength.
- div := div growto: div digitLength + 1.
  "shifts so high order word is >=128"
  rem := self digitLshift: d.
  rem digitLength = self digitLength ifTrue: [rem := rem growto: self digitLength + 1].
+ remDigitLength := rem digitLength.
  "makes a copy and shifts"
  quo := Integer new: l neg: ng.
+ dl := divDigitLength - 1.
- dl := div digitLength - 1.
  "Last actual byte of data"
  ql := l.
  dh := div digitAt: dl.
  dnh := dl = 1
  ifTrue: [0]
  ifFalse: [div digitAt: dl - 1].
  1 to: ql do:
  [:k |
  "maintain quo*arg+rem=self"
  "Estimate rem/div by dividing the leading to bytes of rem by dh."
  "The estimate is q = qhi*16+qlo, where qhi and qlo are nibbles."
+ j := remDigitLength + 1 - k.
- j := rem digitLength + 1 - k.
  "r1 := rem digitAt: j."
  (rem digitAt: j)
  = dh
  ifTrue: [qhi := qlo := 15
  "i.e. q=255"]
  ifFalse:
  ["Compute q = (r1,r2)//dh, t = (r1,r2)\\dh.  
  Note that r1,r2 are bytes, not nibbles.  
  Be careful not to generate intermediate results exceeding 13  
  bits."
  "r2 := (rem digitAt: j - 1)."
  t := ((rem digitAt: j)
  bitShift: 4)
  + ((rem digitAt: j - 1)
  bitShift: -4).
  qhi := t // dh.
  t := (t \\ dh bitShift: 4)
  + ((rem digitAt: j - 1)
  bitAnd: 15).
  qlo := t // dh.
  t := t \\ dh.
  "Next compute (hi,lo) := q*dnh"
  hi := qhi * dnh.
  lo := qlo * dnh + ((hi bitAnd: 15)
  bitShift: 4).
  hi := (hi bitShift: -4)
  + (lo bitShift: -8).
  lo := lo bitAnd: 255.
  "Correct overestimate of q.  
  Max of 2 iterations through loop -- see Knuth vol. 2"
  r3 := j < 3
  ifTrue: [0]
  ifFalse: [rem digitAt: j - 2].
  [(t < hi
  or: [t = hi and: [r3 < lo]])
  and:
  ["i.e. (t,r3) < (hi,lo)"
  qlo := qlo - 1.
  lo := lo - dnh.
  lo < 0
  ifTrue:
  [hi := hi - 1.
  lo := lo + 256].
  hi >= dh]]
  whileTrue: [hi := hi - dh].
  qlo < 0
  ifTrue:
  [qhi := qhi - 1.
  qlo := qlo + 16]].
  "Subtract q*div from rem"
  l := j - dl.
  a := 0.
+ 1 to: divDigitLength do:
- 1 to: div digitLength do:
  [:i |
  hi := (div digitAt: i)
  * qhi.
  lo := a + (rem digitAt: l) - ((hi bitAnd: 15)
  bitShift: 4) - ((div digitAt: i)
  * qlo).
  rem digitAt: l put: lo - (lo // 256 * 256).
  "sign-tolerant form of (lo bitAnd: 255)"
  a := lo // 256 - (hi bitShift: -4).
  l := l + 1].
  a < 0
  ifTrue:
  ["Add div back into rem, decrease q by 1"
  qlo := qlo - 1.
  l := j - dl.
  a := 0.
+ 1 to: divDigitLength do:
- 1 to: div digitLength do:
  [:i |
  a := (a bitShift: -8)
  + (rem digitAt: l) + (div digitAt: i).
  rem digitAt: l put: (a bitAnd: 255).
  l := l + 1]].
+ quo digitAt: ql + 1 - k put: (qhi bitShift: 4)
- quo digitAt: quo digitLength + 1 - k put: (qhi bitShift: 4)
  + qlo].
  rem := rem
  digitRshift: d
  bytes: 0
  lookfirst: dl.
  ^ Array with: quo with: rem!

Item was changed:
  ----- Method: Integer>>digitMultiply:neg: (in category 'private') -----
  digitMultiply: arg neg: ng
+ | selfLen argLen prod prodLen carry digit k ab |
- | prod prodLen carry digit k ab |
  <primitive: 'primDigitMultiplyNegative' module:'LargeIntegers'>
+ argLen := arg digitLength.
+ (argLen = 1 and: [(arg digitAt: 1)
- (arg digitLength = 1 and: [(arg digitAt: 1)
  = 0])
  ifTrue: [^ 0].
+ selfLen := self digitLength.
+ (selfLen = 1 and: [(self digitAt: 1)
- (self digitLength = 1 and: [(self digitAt: 1)
  = 0])
  ifTrue: [^ 0].
+ prodLen := selfLen + argLen.
- prodLen := self digitLength + arg digitLength.
  prod := Integer new: prodLen neg: ng.
  "prod starts out all zero"
+ 1 to: selfLen do: [:i | (digit := self digitAt: i) ~= 0
- 1 to: self digitLength do: [:i | (digit := self digitAt: i) ~= 0
  ifTrue:
  [k := i.
  carry := 0.
  "Loop invariant: 0<=carry<=0377, k=i+j-1"
+ 1 to: argLen do:
- 1 to: arg digitLength do:
  [:j |
  ab := (arg digitAt: j)
  * digit + carry + (prod digitAt: k).
  carry := ab bitShift: -8.
  prod digitAt: k put: (ab bitAnd: 255).
  k := k + 1].
  prod digitAt: k put: carry]].
  ^ prod normalize!