The Inbox: Collections-nice.807.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

The Inbox: Collections-nice.807.mcz

commits-2
Nicolas Cellier uploaded a new version of Collections to project The Inbox:
http://source.squeak.org/inbox/Collections-nice.807.mcz

==================== Summary ====================

Name: Collections-nice.807
Author: nice
Time: 13 September 2018, 9:29:57.057799 pm
UUID: f4686e6e-1820-4c3f-841d-c1904914c147
Ancestors: Collections-topa.806

Let isSeparator be consistent with Character separators after introduction of nbsp.

The performance cost should be marginal.

Note: some (CharacterSet cleanUp: false) might be necessary in postscript, unless this is included in the release process...

=============== Diff against Collections-topa.806 ===============

Item was changed:
  ----- Method: Character>>isSeparator (in category 'testing') -----
  isSeparator
  "Answer whether the receiver is one of the separator characters--space,
+ cr, tab, line feed, or form feed and nbsp."
- cr, tab, line feed, or form feed."
 
  | integerValue |
+ (integerValue := self asInteger) > 32 ifTrue: [ ^integerValue = 160 "non breaking space" ].
- (integerValue := self asInteger) > 32 ifTrue: [ ^false ].
  integerValue
  caseOf: {
  [ 32 "space" ] -> [ ^true ].
  [ 9 "tab" ] -> [ ^true ].
  [ 13 "cr"] -> [ ^true ].
  [ 10 "line feed" ] -> [ ^true ].
  [ 12 "form feed"] -> [ ^true ] }
  otherwise: [ ^false  ]!


Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.807.mcz

Levente Uzonyi
On Thu, 13 Sep 2018, [hidden email] wrote:

> Nicolas Cellier uploaded a new version of Collections to project The Inbox:
> http://source.squeak.org/inbox/Collections-nice.807.mcz
>
> ==================== Summary ====================
>
> Name: Collections-nice.807
> Author: nice
> Time: 13 September 2018, 9:29:57.057799 pm
> UUID: f4686e6e-1820-4c3f-841d-c1904914c147
> Ancestors: Collections-topa.806
>
> Let isSeparator be consistent with Character separators after introduction of nbsp.
>
> The performance cost should be marginal.
>
> Note: some (CharacterSet cleanUp: false) might be necessary in postscript, unless this is included in the release process...
>
> =============== Diff against Collections-topa.806 ===============
>
> Item was changed:
>  ----- Method: Character>>isSeparator (in category 'testing') -----
>  isSeparator
>   "Answer whether the receiver is one of the separator characters--space,
> + cr, tab, line feed, or form feed and nbsp."
> - cr, tab, line feed, or form feed."
>
>   | integerValue |
> + (integerValue := self asInteger) > 32 ifTrue: [ ^integerValue = 160 "non breaking space" ].

Unfortunately the current JIT will not optimize that construct the same
way as the ones used in the current implementation, so it'd still be
~1-1.5x slower to use this compared to the "compare with constant and do
a quick return" formula. If we really want to push this change, I'd rather
try something like this:

  (integerValue := self asInteger) > 32 ifTrue: [
  integerValue = 160 ifTrue: [ ^true ].
  ^false ].

Levente

> - (integerValue := self asInteger) > 32 ifTrue: [ ^false ].
>   integerValue
>   caseOf: {
>   [ 32 "space" ] -> [ ^true ].
>   [ 9 "tab" ] -> [ ^true ].
>   [ 13 "cr"] -> [ ^true ].
>   [ 10 "line feed" ] -> [ ^true ].
>   [ 12 "form feed"] -> [ ^true ] }
>   otherwise: [ ^false  ]!

cbc
Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.807.mcz

cbc
In reply to this post by commits-2


On Thu, Sep 13, 2018, 12:30 <[hidden email]> wrote:
Nicolas Cellier uploaded a new version of Collections to project The Inbox:
http://source.squeak.org/inbox/Collections-nice.807.mcz

==================== Summary ====================

Name: Collections-nice.807
Author: nice
Time: 13 September 2018, 9:29:57.057799 pm
UUID: f4686e6e-1820-4c3f-841d-c1904914c147
Ancestors: Collections-topa.806

Let isSeparator be consistent with Character separators after introduction of nbsp.

The performance cost should be marginal.

Note: some (CharacterSet cleanUp: false) might be necessary in postscript, unless this is included in the release process...

Please include the postscript. There are still some of us that keep images alive for years (or decades) and just update through releases instead of reloading.  Thanks,
CBC

=============== Diff against Collections-topa.806 ===============

Item was changed:
  ----- Method: Character>>isSeparator (in category 'testing') -----
  isSeparator
        "Answer whether the receiver is one of the separator characters--space,
+       cr, tab, line feed, or form feed and nbsp."
-       cr, tab, line feed, or form feed."

        | integerValue |
+       (integerValue := self asInteger) > 32 ifTrue: [ ^integerValue = 160 "non breaking space" ].
-       (integerValue := self asInteger) > 32 ifTrue: [ ^false ].
        integerValue
                caseOf: {
                        [ 32 "space" ] -> [ ^true ].
                        [ 9 "tab" ] -> [ ^true ].
                        [ 13 "cr"] -> [ ^true ].
                        [ 10 "line feed" ] -> [ ^true ].
                        [ 12 "form feed"] -> [ ^true ] }
                otherwise: [ ^false  ]!




Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.807.mcz

Eliot Miranda-2
In reply to this post by Levente Uzonyi
Hi Levente,
On Fri, Sep 14, 2018 at 12:35 AM Levente Uzonyi <[hidden email]> wrote:
On Thu, 13 Sep 2018, [hidden email] wrote:

> Nicolas Cellier uploaded a new version of Collections to project The Inbox:
> http://source.squeak.org/inbox/Collections-nice.807.mcz
>
> ==================== Summary ====================
>
> Name: Collections-nice.807
> Author: nice
> Time: 13 September 2018, 9:29:57.057799 pm
> UUID: f4686e6e-1820-4c3f-841d-c1904914c147
> Ancestors: Collections-topa.806
>
> Let isSeparator be consistent with Character separators after introduction of nbsp.
>
> The performance cost should be marginal.
>
> Note: some (CharacterSet cleanUp: false) might be necessary in postscript, unless this is included in the release process...
>
> =============== Diff against Collections-topa.806 ===============
>
> Item was changed:
>  ----- Method: Character>>isSeparator (in category 'testing') -----
>  isSeparator
>       "Answer whether the receiver is one of the separator characters--space,
> +     cr, tab, line feed, or form feed and nbsp."
> -     cr, tab, line feed, or form feed."
>
>       | integerValue |
> +     (integerValue := self asInteger) > 32 ifTrue: [ ^integerValue = 160 "non breaking space" ].

Unfortunately the current JIT will not optimize that construct the same
way as the ones used in the current implementation, so it'd still be
~1-1.5x slower to use this compared to the "compare with constant and do
a quick return" formula. If we really want to push this change, I'd rather
try something like this:

        (integerValue := self asInteger) > 32 ifTrue: [
                integerValue = 160 ifTrue: [ ^true ].
                ^false ].

I'd rather you provide a test case and encourager me to fix the JIT to produce faster code.  It seems absurd to have to write such crappy code, and maybe the JIT will be easy to fix.
 

Levente

> -     (integerValue := self asInteger) > 32 ifTrue: [ ^false ].
>       integerValue
>               caseOf: {
>                       [ 32 "space" ] -> [ ^true ].
>                       [ 9 "tab" ] -> [ ^true ].
>                       [ 13 "cr"] -> [ ^true ].
>                       [ 10 "line feed" ] -> [ ^true ].
>                       [ 12 "form feed"] -> [ ^true ] }
>               otherwise: [ ^false  ]!



--
_,,,^..^,,,_
best, Eliot


Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.807.mcz

Levente Uzonyi
Hi Eliot,

Here is a very simple test case:

{
  [ 1 to: 500000000 do: [ :i | i = 1 ifTrue: [ true ] ifFalse: [ false ] ] ] timeToRun.
  [ 1 to: 500000000 do: [ :i | i = 1 ] ] timeToRun.
}

On my machine it gives #(992 1436).

Levente

On Sun, 16 Sep 2018, Eliot Miranda wrote:

> Hi Levente,
> On Fri, Sep 14, 2018 at 12:35 AM Levente Uzonyi <[hidden email]> wrote:
>       On Thu, 13 Sep 2018, [hidden email] wrote:
>
>       > Nicolas Cellier uploaded a new version of Collections to project The Inbox:
>       > http://source.squeak.org/inbox/Collections-nice.807.mcz
>       >
>       > ==================== Summary ====================
>       >
>       > Name: Collections-nice.807
>       > Author: nice
>       > Time: 13 September 2018, 9:29:57.057799 pm
>       > UUID: f4686e6e-1820-4c3f-841d-c1904914c147
>       > Ancestors: Collections-topa.806
>       >
>       > Let isSeparator be consistent with Character separators after introduction of nbsp.
>       >
>       > The performance cost should be marginal.
>       >
>       > Note: some (CharacterSet cleanUp: false) might be necessary in postscript, unless this is included in the release process...
>       >
>       > =============== Diff against Collections-topa.806 ===============
>       >
>       > Item was changed:
>       >  ----- Method: Character>>isSeparator (in category 'testing') -----
>       >  isSeparator
>       >       "Answer whether the receiver is one of the separator characters--space,
>       > +     cr, tab, line feed, or form feed and nbsp."
>       > -     cr, tab, line feed, or form feed."
>       >
>       >       | integerValue |
>       > +     (integerValue := self asInteger) > 32 ifTrue: [ ^integerValue = 160 "non breaking space" ].
>
>       Unfortunately the current JIT will not optimize that construct the same
>       way as the ones used in the current implementation, so it'd still be
>       ~1-1.5x slower to use this compared to the "compare with constant and do
>       a quick return" formula. If we really want to push this change, I'd rather
>       try something like this:
>
>               (integerValue := self asInteger) > 32 ifTrue: [
>                       integerValue = 160 ifTrue: [ ^true ].
>                       ^false ].
>
>
> I'd rather you provide a test case and encourager me to fix the JIT to produce faster code.  It seems absurd to have to write such crappy code, and maybe the JIT will be easy to fix.
>  
>
>       Levente
>
>       > -     (integerValue := self asInteger) > 32 ifTrue: [ ^false ].
>       >       integerValue
>       >               caseOf: {
>       >                       [ 32 "space" ] -> [ ^true ].
>       >                       [ 9 "tab" ] -> [ ^true ].
>       >                       [ 13 "cr"] -> [ ^true ].
>       >                       [ 10 "line feed" ] -> [ ^true ].
>       >                       [ 12 "form feed"] -> [ ^true ] }
>       >               otherwise: [ ^false  ]!
>
>
>
> --
> _,,,^..^,,,_
> best, Eliot
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.807.mcz

Bert Freudenberg
On Sat, Sep 15, 2018 at 4:58 PM Levente Uzonyi <[hidden email]> wrote:
Hi Eliot,

Here is a very simple test case:

{
        [ 1 to: 500000000 do: [ :i | i = 1 ifTrue: [ true ] ifFalse: [ false ] ] ] timeToRun.
        [ 1 to: 500000000 do: [ :i | i = 1 ] ] timeToRun.
}

On my machine it gives #(992 1436).

Huh, interesting.

Eliot, if you figure this out, please let us know what it was ...

- Bert -


Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.807.mcz

David T. Lewis
On Sat, Sep 15, 2018 at 06:35:52PM -0700, Bert Freudenberg wrote:

> On Sat, Sep 15, 2018 at 4:58 PM Levente Uzonyi <[hidden email]> wrote:
>
> > Hi Eliot,
> >
> > Here is a very simple test case:
> >
> > {
> >         [ 1 to: 500000000 do: [ :i | i = 1 ifTrue: [ true ] ifFalse: [
> > false ] ] ] timeToRun.
> >         [ 1 to: 500000000 do: [ :i | i = 1 ] ] timeToRun.
> > }
> >
> > On my machine it gives #(992 1436).
> >
>
> Huh, interesting.
>
> Eliot, if you figure this out, please let us know what it was ...
>
> - Bert -

On my machine with cog/spur 64 bit I see this:

  {
        [ 1 to: 500000000 do: [ :i | i = 1 ifTrue: [ true ] ifFalse: [false ] ] ] timeToRun.
        [ 1 to: 500000000 do: [ :i | i = 1 ] ] timeToRun.
        [ 1 to: 500000000 do: [ :i | true ] ] timeToRun.
        [ 1 to: 500000000 do: [ :i | ] ] timeToRun.
  } ==> #(859 1576 921 922)


On the same machine with context interpreter 64-bit VM 32-bit image, it is:

  {
        [ 1 to: 500000000 do: [ :i | i = 1 ifTrue: [ true ] ifFalse: [false ] ] ] timeToRun.
        [ 1 to: 500000000 do: [ :i | i = 1 ] ] timeToRun.
        [ 1 to: 500000000 do: [ :i | true ] ] timeToRun.
        [ 1 to: 500000000 do: [ :i | ] ] timeToRun.
  } ==> #(8806 8816 5547 5554)

It looks like the first block with #ifTrue:ifFalse: may be benefiting
from a Cog optimization that the other three cases do not receive.

It's interesting that the optimized block containing #ifTrue:ifFalse:
is actually faster than an empty block that does nothing at all.

Dave


Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.807.mcz

Eliot Miranda-2
In reply to this post by Bert Freudenberg
Hi All,

On Sat, Sep 15, 2018 at 6:36 PM Bert Freudenberg <[hidden email]> wrote:
On Sat, Sep 15, 2018 at 4:58 PM Levente Uzonyi <[hidden email]> wrote:
Hi Eliot,

Here is a very simple test case:

{
        [ 1 to: 500000000 do: [ :i | i = 1 ifTrue: [ true ] ifFalse: [ false ] ] ] timeToRun.
        [ 1 to: 500000000 do: [ :i | i = 1 ] ] timeToRun.
}

On my machine it gives #(992 1436).

Huh, interesting.

Eliot, if you figure this out, please let us know what it was ...

hmmm.  It's my fault.  It is to do with the criteria the StackToRegisterMappingCogit uses to decide whether to inline comparison operations.  The code I wrote in genSpecialSelectorComparison says

"Only interested in inlining if followed by a conditional branch."
inlineCAB := branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse].
"Further, only interested in inlining = and ~= if there's a SmallInteger constant involved.
The relational operators successfully statically predict SmallIntegers; the equality operators do not."
(inlineCAB and: [primDescriptor opcode = JumpZero or: [primDescriptor opcode = JumpNonZero]]) ifTrue:
[inlineCAB := argIsIntConst or: [rcvrIsInt]].
inlineCAB ifFalse:
[^self genSpecialSelectorSend].

So the Cogit inlines the = 1 in the first case because it is followed by the branch for the ifTrue: [true] ifFalse: [false], but does not inline = 1 in the second case.  So the second case is a true send, and the true send code (load a register with i, another with 1, another with the SmallInteger tag, call the checked entry point of SmallInteger>>#=, tag test the receiver, tag test the argument, compare, reify as true or false, return) is more costly, essentially because it has a call/return.  I'll discuss with Clément and Sophie we'll think about measuring this tradeoff.  This is perhaps a good student project.

If anyone has good arguments why the heuristics above are bogus or good measurements please post.  In the mean time sorry for the late reply; I was airborne on the way back from Cagliari, or sleeping in the terminal in Rome and was none too fresh.  ESUG was glorious though.  Thanks to all the organisers!!

_,,,^..^,,,_
best, Eliot


Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.807.mcz

Levente Uzonyi
Hi Eliot,

On Mon, 17 Sep 2018, Eliot Miranda wrote:

> Hi All,
> On Sat, Sep 15, 2018 at 6:36 PM Bert Freudenberg <[hidden email]> wrote:
>       On Sat, Sep 15, 2018 at 4:58 PM Levente Uzonyi <[hidden email]> wrote:
>       Hi Eliot,
>
>       Here is a very simple test case:
>
>       {
>               [ 1 to: 500000000 do: [ :i | i = 1 ifTrue: [ true ] ifFalse: [ false ] ] ] timeToRun.
>               [ 1 to: 500000000 do: [ :i | i = 1 ] ] timeToRun.
>       }
>
>       On my machine it gives #(992 1436).
>
>
> Huh, interesting.
>
> Eliot, if you figure this out, please let us know what it was ...
>
>
> hmmm.  It's my fault.  It is to do with the criteria the StackToRegisterMappingCogit uses to decide whether to inline comparison operations.  The code I wrote in genSpecialSelectorComparison says
>
> "Only interested in inlining if followed by a conditional branch."
> inlineCAB := branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse].
> "Further, only interested in inlining = and ~= if there's a SmallInteger constant involved.
> The relational operators successfully statically predict SmallIntegers; the equality operators do not."
> (inlineCAB and: [primDescriptor opcode = JumpZero or: [primDescriptor opcode = JumpNonZero]]) ifTrue:
> [inlineCAB := argIsIntConst or: [rcvrIsInt]].
> inlineCAB ifFalse:
> [^self genSpecialSelectorSend].
>
> So the Cogit inlines the = 1 in the first case because it is followed by the branch for the ifTrue: [true] ifFalse: [false], but does not inline = 1 in the second case.  So the second case is a true
> send, and the true send code (load a register with i, another with 1, another with the SmallInteger tag, call the checked entry point of SmallInteger>>#=, tag test the receiver, tag test the argument,
> compare, reify as true or false, return) is more costly, essentially because it has a call/return.  I'll discuss with Clément and Sophie we'll think about measuring this tradeoff.  This is perhaps a
> good student project.
>
> If anyone has good arguments why the heuristics above are bogus or good measurements please post.  In the mean time sorry for the late reply; I was airborne on the way back from Cagliari, or sleeping
I see no reason to limit inlining when the result is used. When it's not
used, then the program is probably incorrect.
But if you want specific situations when such expressions should be
jitted, then here's a quick list:

- when the resulting boolean is returned. E.g.:

isEmpty

  ^tally = 0

- when the resulting boolean is the value returned by a block:

  array select: [ :each | each > 0 ]

- when the resulting boolean is part of a larger boolean expression
(don't know how this can be detected):

  (x = 1 or: [ y = 2 ]) ifTrue: [ ... ]

All in all, I think it's easier to always inline it than not. And I can't
really find a reason not to inline it.

Levente

> in the terminal in Rome and was none too fresh.  ESUG was glorious though.  Thanks to all the organisers!!
>
> _,,,^..^,,,_
> best, Eliot
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.807.mcz

Eliot Miranda-2
Hi Levente,

On Tue, Sep 18, 2018 at 8:21 AM Levente Uzonyi <[hidden email]> wrote:
Hi Eliot,

On Mon, 17 Sep 2018, Eliot Miranda wrote:

> Hi All,
> On Sat, Sep 15, 2018 at 6:36 PM Bert Freudenberg <[hidden email]> wrote:
>       On Sat, Sep 15, 2018 at 4:58 PM Levente Uzonyi <[hidden email]> wrote:
>       Hi Eliot,
>
>       Here is a very simple test case:
>
>       {
>               [ 1 to: 500000000 do: [ :i | i = 1 ifTrue: [ true ] ifFalse: [ false ] ] ] timeToRun.
>               [ 1 to: 500000000 do: [ :i | i = 1 ] ] timeToRun.
>       }
>
>       On my machine it gives #(992 1436).
>
>
> Huh, interesting.
>
> Eliot, if you figure this out, please let us know what it was ...
>
>
> hmmm.  It's my fault.  It is to do with the criteria the StackToRegisterMappingCogit uses to decide whether to inline comparison operations.  The code I wrote in genSpecialSelectorComparison says
>
> "Only interested in inlining if followed by a conditional branch."
> inlineCAB := branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse].
> "Further, only interested in inlining = and ~= if there's a SmallInteger constant involved.
> The relational operators successfully statically predict SmallIntegers; the equality operators do not."
> (inlineCAB and: [primDescriptor opcode = JumpZero or: [primDescriptor opcode = JumpNonZero]]) ifTrue:
> [inlineCAB := argIsIntConst or: [rcvrIsInt]].
> inlineCAB ifFalse:
> [^self genSpecialSelectorSend].
>
> So the Cogit inlines the = 1 in the first case because it is followed by the branch for the ifTrue: [true] ifFalse: [false], but does not inline = 1 in the second case.  So the second case is a true
> send, and the true send code (load a register with i, another with 1, another with the SmallInteger tag, call the checked entry point of SmallInteger>>#=, tag test the receiver, tag test the argument,
> compare, reify as true or false, return) is more costly, essentially because it has a call/return.  I'll discuss with Clément and Sophie we'll think about measuring this tradeoff.  This is perhaps a
> good student project.
>
> If anyone has good arguments why the heuristics above are bogus or good measurements please post.  In the mean time sorry for the late reply; I was airborne on the way back from Cagliari, or sleeping

I see no reason to limit inlining when the result is used. When it's not
used, then the program is probably incorrect.
But if you want specific situations when such expressions should be
jitted, then here's a quick list:

- when the resulting boolean is returned. E.g.:

isEmpty

        ^tally = 0

- when the resulting boolean is the value returned by a block:

        array select: [ :each | each > 0 ]

- when the resulting boolean is part of a larger boolean expression
(don't know how this can be detected):

        (x = 1 or: [ y = 2 ]) ifTrue: [ ... ]

All in all, I think it's easier to always inline it than not. And I can't
really find a reason not to inline it.

Cool.  Always great to have an excuse for more JIT hacking :-)
 
Levente

> in the terminal in Rome and was none too fresh.  ESUG was glorious though.  Thanks to all the organisers!!
>
> _,,,^..^,,,_
> best, Eliot

_,,,^..^,,,_
best, Eliot