mantis http://bugs.squeak.org/view.php?id=7349


Re: Immediates

Eliot Miranda-2
 


On Thu, May 7, 2009 at 10:11 AM, Bert Freudenberg <[hidden email]> wrote:
 
On 07.05.2009, at 18:53, Eliot Miranda wrote:

On Thu, May 7, 2009 at 9:36 AM, Igor Stasenko <[hidden email]> wrote:

2009/5/7 Eliot Miranda <[hidden email]>:
>I intend to add immediate characters within the next few months.
>
So, does that mean you will extend the oop tag to 2 bits (or more)?

Yes.  Keep 31-bit SmallIntegers, provide e.g. 24-bit immediate characters.  Andreas wrote a thorough sketch of this scheme in 2006.
 
Or just reserve a non-movable heap space for character objects, like:
isCharacterObject: oop
 ^ oop >= charsStart and: [ oop < charsEnd ]

No. This doesn't scale to Unicode.  The tagged approach provides much faster string access, and identity comparison for all characters, not just those in the byte range.
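
To make the tagged alternative concrete, here is a minimal sketch in C of an immediate character. The 2-bit tag width follows the thread, but the tag value (`CHAR_TAG`), the `oop_t` type and the function names are assumptions chosen for illustration, not Squeak's or Cog's actual encoding.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t oop_t;   /* hypothetical machine-word oop */

/* Assumed layout: two low tag bits, with 2r10 picked for characters. */
enum { TAG_BITS = 2, TAG_MASK = 3, CHAR_TAG = 2 };

/* Build an immediate character: shift the code point up and add the tag. */
static oop_t characterObjectOf(uint32_t codePoint) {
    return ((oop_t)codePoint << TAG_BITS) | CHAR_TAG;
}

/* The type test is a mask and a compare; no heap access is needed. */
static int isCharacterObject(oop_t oop) {
    return (oop & TAG_MASK) == CHAR_TAG;
}

/* Extracting the code point is a single shift. */
static uint32_t characterValueOf(oop_t oop) {
    assert(isCharacterObject(oop));
    return (uint32_t)(oop >> TAG_BITS);
}
```

Two immediates built from the same code point are the same bit pattern, which is what gives identity comparison for every character, including those outside the byte range.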

Do we have evidence that Character allocation is an actual performance bottleneck?

The problem is not so much character allocation, because by far the most common character access in e.g. IDE usage is to byte characters. The problem is character indirection.  Assigning a character to a string in ByteString>>at:put: requires indirecting through the character to extract the character code.  Answering a character in ByteString>>at: involves indirecting through the specialObjectsOop to fetch the characterTable and then indexing the character table with the byte character code.  And because allocation is very slow in Squeak, the problem for at: is much worse for Unicode, where we also have to allocate the result box.

This is very slow compared to merely adding/removing a tag bit.  As for evidence of whether this is a bottleneck, it is obscured by plugin primitives that mitigate the effects, such as primitiveFindSubstring.  In general these are a bad idea because they're hard to debug, effectively impossible to change and not polymorphic.  But take a look at the following, which uses identityIndexOf: to avoid the primitive machinery (and is no slower, since invoking the primitive machinery is itself expensive):

| wa ws ba bs bc wc n |
ba := (120 to: 125) collect: [:cc| Character value: cc].
bs := ba asString.
bc := ba last.
wa := (12345 to: 12350) collect: [:cc| Character value: cc].
ws := wa asString.
wc := wa last.
n := 1000000.
{ Time millisecondsToRun: [1 to: n do: [:ign| ba identityIndexOf: bc ifAbsent: 0]].
  Time millisecondsToRun: [1 to: n do: [:ign| bs identityIndexOf: bc ifAbsent: 0]].
  Time millisecondsToRun: [1 to: n do: [:ign| wa identityIndexOf: wc ifAbsent: 0]].
  Time millisecondsToRun: [1 to: n do: [:ign| ws identityIndexOf: wc ifAbsent: 0]] }
Squeak 4.0 beta1 Closure: #(448 1058 451 8631)
Stack VM: #(581 1531 579 7303)
Cog: #(214 600 213 1970)

So string access is two to three times slower than array access when the result is fetched from the character table and an order of magnitude worse when the result must be boxed.  (I don't know why the Stack VM is slower than Squeak 4 for non-boxed access; I probably need to do a merge :) ).
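
The ratios behind that summary can be worked out directly from the figures posted above. The sketch below only restates the posted timings (in the order ba, bs, wa, ws used by the benchmark) and divides them; the array and function names are made up for the example.

```c
#include <assert.h>

/* Timings in milliseconds copied from the post, in the order ba, bs, wa, ws. */
static const int squeakTimes[4] = {448, 1058, 451, 8631};
static const int stackTimes[4]  = {581, 1531, 579, 7303};
static const int cogTimes[4]    = {214, 600, 213, 1970};

/* Slowdown of byte-string access relative to the array of the same characters. */
static double byteSlowdown(const int t[4]) { return (double)t[1] / t[0]; }

/* Slowdown of wide-string access, where the result has to be boxed. */
static double wideSlowdown(const int t[4]) { return (double)t[3] / t[2]; }
```

This gives roughly 2.4, 2.6 and 2.8 for the byte case and roughly 19, 13 and 9 for the wide case (Squeak, Stack VM and Cog respectively).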




- Bert -





Re: mantis http://bugs.squeak.org/view.php?id=7349

Eliot Miranda-2
In reply to this post by Yoshiki Ohshima-2
 


On Thu, May 7, 2009 at 10:45 AM, Yoshiki Ohshima <[hidden email]> wrote:

At Thu, 7 May 2009 09:04:54 -0700,
Eliot Miranda wrote:
>
> - remove stack access from the API, writing them as SmartSyntaxPlugins where arguments are passed in as parameters,
> returning the result on success and 0 (not SmallInteger 0) on
> failure

 These days nobody would care much about it, but wouldn't this
make it harder to simulate a platform-independent performance primitive in the image?

I don't think it makes any difference.  In the simulator the VM could e.g. use perform:withArguments: to invoke the primitive.  The real VM needs to do something similar and have glue to the platform's native calling convention, which can be as simple as a 32-element switch statement:
switch (numArgs) {
case 0: result = primitiveFunctionPointer(stackTop()); break;
case 1: result = primitiveFunctionPointer(stackValue(1), stackTop()); break;
...
or as sophisticated as machine code generated on the fly.

> - provide isImmediateObject: and use it in place of isIntegerObject: when the objective is to select heap objects. Use
> isCharacterObject: when the objective is to select a character. I
> intend to add immediate characters within the next few months.

 Are you going to use UTF-32 or UTF-16 for it?

Characters would be Unicode code points (WideString is UTF-32, right?).  UTF-16 is a variable-length string encoding.  Presumably there will be primitive converters between UTF-16 and WideString.
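
To illustrate why UTF-16 counts as variable-length, here is a small sketch of the standard UTF-16 encoding of a single code point. The function name and signature are invented for the example; only the surrogate arithmetic comes from the Unicode standard.

```c
#include <assert.h>
#include <stdint.h>

/* Encode one code point as UTF-16.
   Returns the number of 16-bit units written (1 or 2), or 0 if the
   value is not an encodable code point. */
static int utf16Encode(uint32_t codePoint, uint16_t out[2]) {
    if (codePoint > 0x10FFFF || (codePoint >= 0xD800 && codePoint <= 0xDFFF))
        return 0;                        /* out of range or a lone surrogate */
    if (codePoint < 0x10000) {
        out[0] = (uint16_t)codePoint;    /* one unit is enough */
        return 1;
    }
    codePoint -= 0x10000;                /* 20 bits remain, split 10/10 */
    out[0] = (uint16_t)(0xD800 | (codePoint >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (codePoint & 0x3FF));  /* low surrogate */
    return 2;
}
```

Code points below U+10000, such as the 12345 to 12350 range used in the benchmark earlier in the thread, take one unit; anything above takes two.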



-- Yoshiki



Re: mantis http://bugs.squeak.org/view.php?id=7349

Yoshiki Ohshima-2

At Thu, 7 May 2009 11:09:32 -0700,
Eliot Miranda wrote:

>
>     > - remove stack access from the API, writing them as SmartSyntaxPlugins where arguments are passed in as
>     parameters,
>     > returning the result on success and 0 (not SmallInteger 0) on
>     > failure
>    
>      In these days, nobody would care much about it, but this would
>     make it harder to simulate a platform independent performance primitive in the image?
>
> I don't think it makes any difference. In the simulator the VM could e.g. use perform:withArguments: to invoke the
> primitive. The real VM needs to do something similar and have glue to the platform's native calling convention, which
> can be as simple as a 32-element switch statement:
> switch (numArgs) {
> case 0: result = primitiveFunctionPointer(stackTop()); break;
> case 1: result = primitiveFunctionPointer(stackValue(1), stackTop()); break;
> ...
> or as sophisticated as machine code generated on the fly.

  What I meant was debugging the Slang-ish code in the Smalltalk
debugger.  Putting "halt" in the primitive code in Slang and doing
#doPrimitive: lets you do it, but code written in
SmartSyntaxInterpreter syntax doesn't do what it says, so the Smalltalk
debugger cannot handle it.  But again, this is a minor issue now.

>     > - provide isImmediateObject: and use it in place of isIntegerObject: when the objective is to select heap objects.
>     Use
>     > isCharacterObject: when the objective is to select a character. I
>     > intend to add immediate characters within the next few months.
>    
>     Are you going to use UTF-32 or UTF-16 for it?
>
> Characters would be Unicode code points (WideString is UTF-32 right?). UTF-16 is a variable-length string encoding.
> Presumably there will be primitive converters to/from UTF-16 to WideString.

  Yes, among these choices, my vote would be for UTF-32 (for 21-bit
space).  But variable-length-ness doesn't really go away even
when using UTF-32, as there are composition characters.
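
The point about composition characters can be shown with a short sketch: even in UTF-32, "e" followed by U+0301 (combining acute accent) is two elements but one displayed character. The code below is a deliberate simplification that only recognises the Combining Diacritical Marks block (U+0300 to U+036F); real segmentation rules are considerably more involved, and the function names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplification: treat only U+0300..U+036F as combining marks. */
static int isCombiningMark(uint32_t codePoint) {
    return codePoint >= 0x0300 && codePoint <= 0x036F;
}

/* Count displayed units: each base code point together with the marks
   that follow it counts once. A mark at index 0 counts on its own. */
static size_t countComposed(const uint32_t *codePoints, size_t length) {
    size_t count = 0;
    for (size_t i = 0; i < length; i++) {
        if (i == 0 || !isCombiningMark(codePoints[i]))
            count++;
    }
    return count;
}
```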

  Alternatively, we could go for all UTF-8 in the image representation of
Strings (as a data buffer) and, when you need a Character, create an
instance, or return the one in a table, that is in UTF-32.  And on the
image side, a displayable "String" should (almost) always be accompanied by
attributes, like Text.

-- Yoshiki


Re: mantis http://bugs.squeak.org/view.php?id=7349

Eliot Miranda-2
 


On Thu, May 7, 2009 at 11:29 AM, Yoshiki Ohshima <[hidden email]> wrote:

At Thu, 7 May 2009 11:09:32 -0700,
Eliot Miranda wrote:
>
>     > - remove stack access from the API, writing them as SmartSyntaxPlugins where arguments are passed in as
>     parameters,
>     > returning the result on success and 0 (not SmallInteger 0) on
>     > failure
>
>      In these days, nobody would care much about it, but this would
>     make it harder to simulate a platform independent performance primitive in the image?
>
> I don't think it makes any difference. In the simulator the VM could e.g. use perform:withArguments: to invoke the
> primitive. The real VM needs to do something similar and have glue to the platform's native calling convention, which
> can be as simple as a 32-element switch statement:
> switch (numArgs) {
> case 0: result = primitiveFunctionPointer(stackTop()); break;
> case 1: result = primitiveFunctionPointer(stackValue(1), stackTop()); break;
> ...
> or as sophisticated as machine code generated on the fly.

 What I mean was to debug the Slang-ish code in the Smalltalk
Debugger.  Putting "halt" in the primitive code in Slang and doing
#doPrimitive: lets you do it, but code written in
SmartSyntaxInterpreter syntax doesn't do what it says so Smalltalk
debugger cannot handle it.  But again, this is a minor issue now.

Ah, OK, now I get it.  I think we can fix this.  If the type information is moved into pragmas then I think the debug issue can be made to go away.  The simulator would have to read the pragma and type-convert before it called perform:, but I think this is straightforward.  The pragma could be e.g. performable by the VM to do the type conversion.
 
>     > - provide isImmediateObject: and use it in place of isIntegerObject: when the objective is to select heap objects.
>     Use
>     > isCharacterObject: when the objective is to select a character. I
>     > intend to add immediate characters within the next few months.
>
>     Are you going to use UTF-32 or UTF-16 for it?
>
> Characters would be Unicode code points (WideString is UTF-32 right?). UTF-16 is a variable-length string encoding.
> Presumably there will be primitive converters to/from UTF-16 to WideString.

 Yes, among these choices, my vote would be for UTF-32 (for 21-bit
space).  But variable-length-ness doesn't really go away even
when using UTF-32, as there are composition characters.

 Alternatively, we could go for all UTF-8 in image representation for
Strings (as a data buffer) and when you need a Character, create an
instance, or return the one in a table, that is in UTF-32.  And in the
image side, displayable "String" should (almost) always accompany the
attributes like Text.

I'm a bit out of my depth here.  I would have thought that you would want the basic string types to be fixed-width for fast access, simply because variable length doesn't scale to e.g. indexing 1 megabyte strings.  And for the platform interface one would want efficient conversion between fixed- and variable-length encodings.  But that's just my gut.  I expect I'll implement whatever y'all say makes sense.
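
The indexing cost being weighed here can be sketched as follows: with a fixed-width representation the i-th character is one array access, while with UTF-8 it has to be found by scanning from the start. The function names are invented for the example, and the UTF-8 scan assumes well-formed input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed width (UTF-32): constant-time access. */
static uint32_t utf32At(const uint32_t *string, size_t index) {
    return string[index];
}

/* Variable width (UTF-8): answer the byte offset at which the index-th
   code point starts, scanning linearly and skipping continuation bytes
   (those of the form 10xxxxxx). Answers length if index is out of range. */
static size_t utf8OffsetOf(const unsigned char *bytes, size_t length, size_t index) {
    size_t seen = 0;
    for (size_t i = 0; i < length; i++) {
        if ((bytes[i] & 0xC0) != 0x80) {   /* a lead byte starts a code point */
            if (seen == index)
                return i;
            seen++;
        }
    }
    return length;
}
```

The scan is linear in the offset, so indexing near the end of a 1 megabyte string touches close to a million bytes per access unless some side index is kept.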



-- Yoshiki



Re: mantis http://bugs.squeak.org/view.php?id=7349

Yoshiki Ohshima-2
 
At Thu, 7 May 2009 11:37:10 -0700,
Eliot Miranda wrote:

>
>     Yes, among these choices, my vote would be for UTF-32 (for 21-bit
>     space). But variable-length-ness doesn't really go away even
>     when using UTF-32, as there are composition characters.
>    
>     Alternatively, we could go for all UTF-8 in image representation for
>     Strings (as a data buffer) and when you need a Character, create an
>     instance, or return the one in a table, that is in UTF-32. And in the
>     image side, displayable "String" should (almost) always accompany the
>     attributes like Text.
>
> I'm a bit out of my depth here. I would have thought that you would want the basic string types to be fixed width for
> fast accessing, simply because variable length doesn't scale to e.g. indexing 1 megabyte strings. But that for the
> platform interface one would want efficient conversion to/from fixed and variable length encodings. But that's just my
> gut. I expect I'll implement whatever y'all say makes sense.

  Basically, I think UTF-32 is ok for the time being and requires very
little change to the code.

  With composition characters present, the situations where you
randomly access an element and expect it to be a meaningful value
by itself are rarer.

  My proposition is that for a String (as data), we would rather avoid
random access anyway and always access it via a Stream.  Then, the
actual representation can be different.
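
As a sketch of what stream-only access over a UTF-8 buffer could look like, the following C struct hands out one UTF-32 code point per call. The type and function names are hypothetical, and the decoder assumes well-formed UTF-8 and performs no validation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const unsigned char *bytes;
    size_t length;
    size_t position;              /* byte offset of the next code point */
} Utf8ReadStream;                 /* hypothetical name */

static int utf8AtEnd(const Utf8ReadStream *stream) {
    return stream->position >= stream->length;
}

/* Answer the next code point as a UTF-32 value and advance past it. */
static uint32_t utf8Next(Utf8ReadStream *stream) {
    unsigned char lead = stream->bytes[stream->position++];
    uint32_t codePoint;
    int extra;                    /* number of continuation bytes to read */
    if (lead < 0x80)      { codePoint = lead;        extra = 0; }
    else if (lead < 0xE0) { codePoint = lead & 0x1F; extra = 1; }
    else if (lead < 0xF0) { codePoint = lead & 0x0F; extra = 2; }
    else                  { codePoint = lead & 0x07; extra = 3; }
    while (extra-- > 0 && !utf8AtEnd(stream))
        codePoint = (codePoint << 6) | (stream->bytes[stream->position++] & 0x3F);
    return codePoint;
}
```

Because callers only ever see code points, the buffer behind the stream could be swapped for another encoding without changing them.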

-- Yoshiki

Re: Immediates

Jecel Assumpcao Jr
In reply to this post by Bert Freudenberg
 
Bert Freudenberg wrote:

> Yes.  Keep 31-bit SmallIntegers, provide e.g. 24-bit immediate characters.
> Andreas wrote a thorough sketch of this scheme in 2006. 

I would like to see this scheme get adopted. My own suggestion for the
spare encoding (spare if the GC is changed, that is) was for unboxed 30
bit floats, but that seems to be unpopular and I guess I can live with
float arrays instead.

Certainly immediate characters are very important as we move away from
ASCII, and I liked Andreas' suggestions of colors and short points as
well. But I would be particularly interested in having an immediate
encoding for symbols. You already have a global table for converting
to/from strings, so I see no need to store anything in the symbols
themselves. Having a trivial way to know that an object is a symbol
without chasing a class pointer could make serializing/restoring objects
a little faster.

-- Jecel


Re: Immediates

Bert Freudenberg
In reply to this post by Bert Freudenberg
 

On 08.05.2009, at 01:13, Jecel Assumpcao Jr wrote:

>
> Bert Freudenberg wrote:
>
>> Yes.  Keep 31-bit SmallIntegers, provide e.g. 24-bit immediate  
>> characters.
>> Andreas wrote a thorough sketch of this scheme in 2006.
>
> I would like to see this scheme get adopted. My own suggestion for the
> spare encoding (spare if the GC is changed, that is) was for unboxed  
> 30
> bit floats, but that seems to be unpopular and I guess I can live with
> float arrays instead.
>
> Certainly immediate characters are very important as we move away from
> ASCII, and I liked Andreas' suggestions of colors and short points as
> well. But I would be particularly interested in having an immediate
> encoding for symbols. You already have a global table for converting
> to/from strings, so I see no need to store anything in the symbols
> themselves. Having a trivial way to know that an object is a symbol
> without chasing a class pointer could make serializing/restoring  
> objects
> a little faster.
>
> -- Jecel

The quote above was Eliot's, not mine.

Having immediate Floats seems rather compelling to me, too. Or  
immediate doubles, which might be a good reason to want a 64-bit image.

- Bert -



Re: Immediates

Eliot Miranda-2
In reply to this post by Bert Freudenberg
 


On Thu, May 7, 2009 at 4:13 PM, Jecel Assumpcao Jr <[hidden email]> wrote:

Bert Freudenberg wrote:

> Yes.  Keep 31-bit SmallIntegers, provide e.g. 24-bit immediate characters.
> Andreas wrote a thorough sketch of this scheme in 2006. 

I would like to see this scheme get adopted. My own suggestion for the
spare encoding (spare if the GC is changed, that is) was for unboxed 30
bit floats, but that seems to be unpopular and I guess I can live with
float arrays instead.

Certainly immediate characters are very important as we move away from
ASCII, and I liked Andreas' suggestions of colors and short points as
well. But I would be particularly interested in having an immediate
encoding for symbols. You already have a global table for converting
to/from strings, so I see no need to store anything in the symbols
themselves. Having a trivial way to know that an object is a symbol
without chasing a class pointer could make serializing/restoring objects
a little faster.

This brings up an important point, which is what immediates are good for and what they're not (IMO).  They are good for computation, but using them for compactness does not come without other costs.

Yes, one can, especially in a 64-bit VM, imagine e.g. a 7-character immediate Symbol, but it won't save you that much, because by definition the symbols that become immediate are the smallest, and it will have additional costs: another class in the image, different code for creating immediate symbols from strings, and complications in all string comparison code (e.g. the string comparison primitives) to cope with immediate symbols.  So computationally immediate symbols are slower and add complication.

Compare that to e.g. 61-bit immediate floats in a 64-bit VM where the cost (an additional class in the image and a set of primitives) is repaid by substantially faster float arithmetic.

IMO, in general use immediates for computational objects where computation is performed on the immediate bit pattern and instantiation rates are high.  This applies to integers, characters and floats.  Do not use them for atoms such as nil, false, true, the symbols, etc.  These are perfectly fine as ordinary objects.  Adding them as immediates complicates (bloats and slows down) the immediate/non-immediate test, which is the highest dynamic frequency operation in the VM (e.g. one per send), as well as accessing the class etc.

Short points and compact colours might make sense in a very graphical environment, but the cost of the blits would I think far outweigh the cost of the point & colour manipulation so the advantages would get lost in the noise.

So for me I'm only interested in SmallInteger, SmallFloat & Character, and keeping the immediate test as lean as possible.

One thing I find difficult with Andreas' proposal is the use of 31-bit SmallIntegers instead of 30-bit SmallIntegers, because it complicates the immediate test.  One can't simply use oop bitAnd: 3 to determine the tag value because both 2r01 and 2r11 are SmallIntegers; instead one has to use  (oop bitAnd: 1) ~= 0 ifTrue: [1] ifFalse: [oop bitAnd: 3].

The isImmediate test is fine:
    isImmediate: oop ^(oop bitAnd: 3) ~= 0
or, if 0 is being used as the SmallInteger tag (a la V8), e.g.
    isImmediate: oop ^(oop bitAnd: 3) ~= 3
but the "does this oop have the immediate pattern foo?" test is slow, and this is the test that is in in-line caches on every send:
    oop: oop hasTag: pattern
        | tagAndLSB |
        tagAndLSB := oop bitAnd: 3.
        ^pattern = 2 ifTrue: [tagAndLSB = 2] ifFalse: [(tagAndLSB bitAnd: 1) ~= 0]
or some such.  It's much nicer if it is just
        oop: oop hasTag: pattern ^pattern = (oop bitAnd: 3)
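
For comparison, the two tag tests can be transliterated to C as below. This is only a restatement of the Smalltalk above under the same assumptions (pattern 2 is the non-integer immediate, pattern 1 stands for SmallInteger); the type and function names are invented.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t oop_t;   /* hypothetical machine-word oop */

/* 30-bit SmallIntegers: every tag is exactly two bits, so the in-line
   cache check is one mask and one compare. */
static int hasTag30(oop_t oop, unsigned pattern) {
    return (oop & 3) == pattern;
}

/* 31-bit SmallIntegers: any oop with the low bit set (2r01 or 2r11) is a
   SmallInteger, so the check has to branch on the pattern first. */
static int hasTag31(oop_t oop, unsigned pattern) {
    unsigned tagAndLSB = (unsigned)(oop & 3);
    return pattern == 2 ? tagAndLSB == 2 : (tagAndLSB & 1) != 0;
}
```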
 
I'm told that having 31-bit as opposed to 30-bit SmallIntegers is a bug advantage but I remain to be convinced; VW has always had 30-bit SmallIntegers and seems none the worse for it.


-- Jecel



Re: Immediates

Igor Stasenko

2009/5/8 Eliot Miranda <[hidden email]>:
> [...]
> I'm told that having 31-bit as opposed to 30-bit SmallIntegers is a bug advantage but I remain to be convinced; VW has always had 30-bit SmallIntegers and seems none the worse for it.

Eliot, I encourage you to write Cog in such a way that it would be
very easy to replace/change the tagging rules.
Then you can benchmark the performance and choose the best alternative.




--
Best regards,
Igor Stasenko AKA sig.

Re: Immediates

Andreas.Raab
In reply to this post by Eliot Miranda-2
 
Eliot Miranda wrote:
> I'm told that having 31-bit as opposed to 30-bit SmallIntegers is a bug
> advantage but I remain to be convinced; VW has always had 30-bit
> SmallIntegers and seems none the worse for it.

I was going to make a contrary argument, except when I did the
math it didn't go my way ;-) Here is why: Obviously, for
computations that can overflow into large ints the performance
difference is huge (factors of 30-100). However, most algorithms that we
care about (like crypto) are strictly 32 bit, where even with 31 bit
SmallIntegers we go LargeInt half the time. Consequently the statistical
difference between 31 and 30 bit for those algorithms should be in the
range of 50% which, although certainly not insignificant, is nothing
compared to when you can run the entire algorithm as SmallInteger
computations.
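
A small range check makes the overflow boundary explicit. The sketch assumes a signed two's-complement payload of the stated width, so a 31-bit SmallInteger tops out at 2^30 - 1 and a 30-bit one at 2^29 - 1; the function name is invented.

```c
#include <assert.h>
#include <stdint.h>

/* Does value fit in a signed two's-complement SmallInteger whose payload
   is `bits` wide? Anything outside the range needs a large integer. */
static int fitsSmallInteger(int64_t value, int bits) {
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    int64_t min = -max - 1;
    return value >= min && value <= max;
}
```

A full 32-bit word such as 0xFFFFFFFF fits in neither width, which is the case the 32 bit algorithms mentioned above keep hitting.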

Cheers,
   - Andreas

Re: Immediates

Eliot Miranda-2
In reply to this post by Igor Stasenko
 


On Thu, May 7, 2009 at 4:56 PM, Igor Stasenko <[hidden email]> wrote:

2009/5/8 Eliot Miranda <[hidden email]>:
> [...]
> I'm told that having 31-bit as opposed to 30-bit SmallIntegers is a bug advantage but I remain to be convinced; VW has always had 30-bit SmallIntegers and seems none the worse for it.

Eliot, I encourage you to write Cog in such a way that it would be
very easy to replace/change the tagging rules.
Then you can benchmark the performance and choose the best alternative.

I have done just this :)  There is an abstraction of the object representation "objectRepresentation" to which the JIT defers whenever accessing objects.  So e.g.

genPrimitiveSubtract
        "Stack looks like
                 receiver (also in ReceiverResultReg)
                 arg
                 return address"
        | jumpNotSI jumpOvfl |
        <var: #jumpNotSI type: #'AbstractInstruction *'>
        <var: #jumpOvfl type: #'AbstractInstruction *'>
        self MoveMw: BytesPerWord r: SPReg R: TempReg.
        self MoveR: TempReg R: ClassReg.
>>      jumpNotSI := objectRepresentation genJumpNotSmallIntegerInScratchReg: TempReg.
        self MoveR: ReceiverResultReg R: TempReg.
        self SubR: ClassReg R: TempReg.
        jumpOvfl := self JumpOverflow: 0.
>>      objectRepresentation genAddSmallIntegerTagsTo: TempReg.
        self MoveR: TempReg R: ReceiverResultReg.
        self flag: 'currently caller pushes result'.
        self RetN: BytesPerWord * 2.
        jumpOvfl jmpTarget: (jumpNotSI jmpTarget: self Label).
        ^0


objectRepresentation takes care of slot access, tagging/detagging, class access and in-line cache tag comparison.  In the current VM an in-line cache tag is either the SmallInteger tag bit for a SmallInteger, or the compact class index * 4 for an instance with a compact class, or the class itself.  The JIT knows nothing of this scheme.  So I should be able to create a new object representation without affecting it.  But nothing works until it's tested ;)
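
The idea of hiding the tagging rules behind an interface can be sketched in C as a table of function pointers. This is not Cog's code; the struct, the two representations and the client function are all invented for illustration. The sketch also assumes arithmetic right shift on signed values and omits overflow detection.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t oop_t;    /* hypothetical machine-word oop */

typedef struct {
    int      (*isSmallInteger)(oop_t oop);
    oop_t    (*tagInteger)(intptr_t value);
    intptr_t (*untagInteger)(oop_t oop);
} ObjectRepresentation;    /* hypothetical interface */

/* Representation 1: a single tag bit, as with 31-bit SmallIntegers. */
static int isIntOneBit(oop_t oop) { return (oop & 1) == 1; }
static oop_t tagOneBit(intptr_t value) { return (oop_t)(((uintptr_t)value << 1) | 1); }
static intptr_t untagOneBit(oop_t oop) { return oop >> 1; }

/* Representation 2: two tag bits with 2r01 for integers (30-bit SmallIntegers). */
static int isIntTwoBit(oop_t oop) { return (oop & 3) == 1; }
static oop_t tagTwoBit(intptr_t value) { return (oop_t)(((uintptr_t)value << 2) | 1); }
static intptr_t untagTwoBit(oop_t oop) { return oop >> 2; }

static const ObjectRepresentation oneBitTags = { isIntOneBit, tagOneBit, untagOneBit };
static const ObjectRepresentation twoBitTags = { isIntTwoBit, tagTwoBit, untagTwoBit };

/* Client code in the role of the JIT: it never looks at tag bits itself.
   Answers 0 on failure, which is not a tagged SmallInteger in either scheme. */
static oop_t subtract(const ObjectRepresentation *rep, oop_t receiver, oop_t arg) {
    if (!rep->isSmallInteger(receiver) || !rep->isSmallInteger(arg))
        return 0;
    return rep->tagInteger(rep->untagInteger(receiver) - rep->untagInteger(arg));
}
```

Swapping `oneBitTags` for `twoBitTags` changes the encoding without touching `subtract`, which is the property that makes benchmarking alternative tagging rules cheap.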


best
Eliot

Re: Immediates

David T. Lewis
In reply to this post by Eliot Miranda-2
 
On Thu, May 07, 2009 at 04:39:58PM -0700, Eliot Miranda wrote:
>
> I'm told that having 31-bit as opposed to 30-bit SmallIntegers is a bug
> advantage but I remain to be convinced

I presume a Freudian slip in the choice of adjectives here ;)

Dave


Re: mantis http://bugs.squeak.org/view.php?id=7349

David T. Lewis
In reply to this post by Andreas.Raab
 
On Wed, May 06, 2009 at 10:15:11PM -0700, Andreas Raab wrote:
>
> Second, there is a higher level notion of whether something is
> compatible or not - for example the return value from certain functions
> change in a 64 bit image accordingly (I don't even know how a 32 bit
> plugin is prevented from interacting with a 64 bit image today).

There is nothing to prevent this interaction. In order to confirm the
obvious, I copied a UnixOSProcessPlugin compiled for a 64-bit image
into the directory containing plugins for a 32-bit image. The plugin
loads and runs. A primitive that does not involve stack operations
(primitiveGetPid) works fine. Other primitives that need to access
the stack result in a VM crash as you might expect.

In practice I have never encountered a case in which I accidentally
mixed a plugin for 64-bit images with a VM for 32-bit images (or vice
versa). I'm sure it's possible, but it has never happened to me, so
this may be a problem analogous to 'Smalltalk become: nil', for which
the solution is "don't do that".

Dave


Re: Immediates

Eliot Miranda-2
In reply to this post by David T. Lewis
 


On Thu, May 7, 2009 at 7:27 PM, David T. Lewis <[hidden email]> wrote:

On Thu, May 07, 2009 at 04:39:58PM -0700, Eliot Miranda wrote:
>
> I'm told that having 31-bit as opposed to 30-bit SmallIntegers is a bug
> advantage but I remain to be convinced

I presume a Freudian slip in the choice of adjectives here ;)

<blush> :)
 


Dave



Re: mantis http://bugs.squeak.org/view.php?id=7349

David T. Lewis
In reply to this post by Igor Stasenko
 
On Thu, May 07, 2009 at 05:06:48PM +0300, Igor Stasenko wrote:

>
> 2009/5/7 Andreas Raab <[hidden email]>:
> >
> > Second, there is a higher level notion of whether something is compatible or
> > not - for example the return value from certain functions changes in a 64 bit
> > image accordingly (I don't even know how a 32 bit plugin is prevented from
> > interacting with a 64 bit image today). Sometimes you really need a
> > high-level bit that tells you that the world has changed even if the names
> > stay the same.
> >
> this can be solved simply: add a function with 32 bit value, which
> could answer whether the VM is 32 bit or 64 bit.

It's already there. The expression "self bytesPerWord" translates to either
4 or 8 for 32-bit and 64-bit images respectively.

The implementation is in CCodeGenerator>>generateBytesPerWord:on:indent:
in VMMaker since VMMaker-dtl.90 on SqS. The change set is on Mantis 7182
if you need to load it into a different VMMaker.

Dave
 

Re: mantis http://bugs.squeak.org/view.php?id=7349

David T. Lewis
 
On Thu, May 07, 2009 at 11:34:46PM -0400, David T. Lewis wrote:

>  
> On Thu, May 07, 2009 at 05:06:48PM +0300, Igor Stasenko wrote:
> >
> > 2009/5/7 Andreas Raab <[hidden email]>:
> > >
> > > Second, there is a higher level notion of whether something is compatible or
> > > not - for example the return value from certain functions changes in a 64 bit
> > > image accordingly (I don't even know how a 32 bit plugin is prevented from
> > > interacting with a 64 bit image today). Sometimes you really need a
> > > high-level bit that tells you that the world has changed even if the names
> > > stay the same.
> > >
> > this can be solved simply: add a function with 32 bit value, which
> > could answer whether the VM is 32 bit or 64 bit.
>
> It's already there. The expression "self bytesPerWord" translates to either
> 4 or 8 for 32-bit and 64-bit images respectively.
>
> The implementation is in CCodeGenerator>>generateBytesPerWord:on:indent:
> in VMMaker since VMMaker-dtl.90 on SqS. The change set is on Mantis 7182
> if you need to load it into a different VMMaker.

Apologies, I just realized that I missed your point entirely. For a plugin
to check if its compile-time view of "self bytesPerWord" is the same as
the bytes per word of the VM that loaded the plugin would presumably
require an entry in the interpreter proxy, which does not currently exist.
Sorry for the noise.
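
A minimal sketch of what such a check could look like, assuming a made-up proxy entry. Neither VmProxySketch nor its bytesPerWord member exists in the real interpreter proxy; both are hypothetical and only illustrate comparing the plugin's compile-time word size with the one the VM reports at load time:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Word size this plugin was generated for, i.e. what "self bytesPerWord"
   expanded to at translation time. Assumed to be 4 for this sketch. */
#define PLUGIN_BYTES_PER_WORD 4

/* HYPOTHETICAL proxy with one extra entry; not the real interpreter proxy. */
typedef struct {
    int (*bytesPerWord)(void);
} VmProxySketch;

/* Stand-ins for a VM running a 32-bit image and one running a 64-bit image. */
int vm32_bytes_per_word(void) { return 4; }
int vm64_bytes_per_word(void) { return 8; }

/* Answer true only if the VM's word size matches the plugin's. */
bool plugin_matches_vm(const VmProxySketch *proxy)
{
    assert(proxy != NULL && proxy->bytesPerWord != NULL);
    return proxy->bytesPerWord() == PLUGIN_BYTES_PER_WORD;
}
```

A plugin could run a check like this once when it is initialized and refuse to load on a mismatch, rather than crashing later in a primitive that touches the stack.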

Dave


Re: Immediates

Jecel Assumpcao Jr
In reply to this post by Eliot Miranda-2
 
Sorry about the wrong attribution - Celeste makes a big mess of all of
Eliot's emails and most replies to those emails. Unfortunately, fixing
this is not currently near the top of my "to do" list so I'll just have
to deal with this for a few more months.

I totally agree about the value of immediates being to speed up
computations by avoiding allocations. My idea for symbols was not to
avoid the costly mapping of strings to new instances but rather speed up
class lookup a little bit. This wouldn't help us now, but for a future
modular Squeak that would be loading and unloading object graphs all the
time, this could make a difference.

Like VisualWorks, Self uses 30 bit integers with a 00 tag (which makes
detagging/retagging unnecessary for addition, subtraction and bitwise
logical operations). The other tag values represent 30 bit floats,
object pointers (you always use a constant offset with these anyway, so
the detagging can be built into that constant) and object headers. The
memory is divided into segments (generations) and each segment stores
tagged data from the bottom and binary data from the top. ByteVectors
are normal tagged objects with a SmallInteger pointing to the actual
bytes.
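
To make the point about the 00 tag concrete, here is a rough sketch in C. The type and function names are invented for illustration and are not taken from Self or any other VM; overflow checks are left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int32_t oop;               /* a tagged 32-bit word */

#define TAG_MASK      3            /* the low two bits hold the tag */
#define INT_TAG       0            /* 00 marks a 30-bit integer */
#define SMALL_INT_MIN (-(1 << 29))
#define SMALL_INT_MAX ((1 << 29) - 1)

bool is_small_int(oop o) { return (o & TAG_MASK) == INT_TAG; }

oop tag_int(int32_t v)
{
    assert(v >= SMALL_INT_MIN && v <= SMALL_INT_MAX);
    return v * 4;                  /* same effect as shifting left by two */
}

int32_t untag_int(oop o)
{
    assert(is_small_int(o));
    return o / 4;                  /* exact, because the low two bits are zero */
}

/* No detagging or retagging: 4a + 4b == 4(a + b), and the same holds
   for subtraction and bitwise AND. Overflow checks are omitted. */
oop add_tagged(oop a, oop b) { return a + b; }
oop sub_tagged(oop a, oop b) { return a - b; }
oop and_tagged(oop a, oop b) { return a & b; }
```

With any nonzero integer tag, each of these operations would need at least one extra instruction to strip or restore the tag.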

The idea of a tag pattern for object headers is that you can "flatten"
the memory scanning operations. You just scan from top to the limit
until you find what you were looking for and then back up to the
previous header to see what object contains that oop. This can be many
times faster than a objects do: [ :obj | obj fieldsDo: [ :oop | ....]]
nested loop.
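
A rough sketch of the flat scan, assuming the tagged part of a segment can be treated as a plain array of 32-bit words. The tag value 3 for headers, the scan direction and the function names are assumptions made for the example, not Self's actual layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_MASK   3u
#define HEADER_TAG 3u   /* assumed tag value for object header words */

static int is_header(uint32_t w) { return (w & TAG_MASK) == HEADER_TAG; }

/* Scan the words linearly for target. On a hit, back up to the nearest
   preceding header word and return its index. Returns -1 if target is
   absent or no header precedes it. */
ptrdiff_t find_owner(const uint32_t *words, size_t count, uint32_t target)
{
    assert(words != NULL || count == 0);
    assert(!is_header(target));
    for (size_t i = 0; i < count; i++) {
        if (words[i] != target)
            continue;
        for (size_t j = i; j-- > 0; )
            if (is_header(words[j]))
                return (ptrdiff_t)j;
        return -1;
    }
    return -1;
}
```

The inner loop only runs after a hit, so the common case is a single tight comparison loop over memory with no per-object bookkeeping.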

For my old RISC42 design I came up with the idea of having the top two
bits the same to indicate SmallIntegers. This is hard to check in
software, but in hardware is just a two input XOR gate. This allows you
to avoid detagging not only for the operations I mentioned above, but
also for multiplies, divides, left shift and signed right shift too. A
few weeks ago I found out that the old Swamp Smalltalk computer from
1986 used exactly the same scheme (the two patterns where the top two
bits were different were used for oops and for pointers to context
objects).
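
In software the check looks roughly like the sketch below (function names are mine, invented for the example). A word whose top two bits agree is simply an ordinary two's complement value in the range -2^30 to 2^30 - 1, which is why multiplication needs no detagging, only a range check on the result:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A word is a SmallInteger when bits 31 and 30 are equal. */
bool top_bits_agree(uint32_t w)
{
    return (((w >> 31) ^ (w >> 30)) & 1u) == 0;
}

/* Interpret a 32-bit word as a signed two's complement value. */
static int64_t to_signed(uint32_t w)
{
    return (w & 0x80000000u) ? (int64_t)w - INT64_C(4294967296) : (int64_t)w;
}

/* Multiply two SmallInteger words with no detagging or retagging.
   The product is formed in 64 bits so that leaving the SmallInteger
   range can be detected; returns false in that case. */
bool mul_small(uint32_t a, uint32_t b, uint32_t *result)
{
    assert(top_bits_agree(a) && top_bits_agree(b) && result != NULL);
    int64_t product = to_signed(a) * to_signed(b);
    if (product < -(INT64_C(1) << 30) || product >= (INT64_C(1) << 30))
        return false;
    *result = (uint32_t)product;
    return true;
}
```

The shift, XOR and mask in top_bits_agree are what the single gate replaces in hardware.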

-- Jecel


Re: Immediates

Eliot Miranda-2
In reply to this post by Eliot Miranda-2
 


On Fri, May 8, 2009 at 2:26 PM, Jecel Assumpcao Jr <[hidden email]> wrote:

Sorry about the wrong attribution - Celeste makes a big mess of all of
Eliot's emails and most replies to those emails. Unfortunately, fixing
this is not currently near the top of my "to do" list so I'll just have
to deal with this for a few more months.

I totally agree about the value of immediates being to speed up
computations by avoiding allocations. My idea for symbols was not to
avoid the costly mapping of strings to new instances but rather speed up
class lookup a little bit. This wouldn't help us now, but for a future
modular Squeak that would be loading and unloading object graphs all the
time, this could make a difference.

One of the things I think would really help here is to have a way of assigning the id-hash of a Symbol based on its string hash.  Then MethodDictionaries and the like don't have to be rehashed on load.  I could imagine a new:withHash: primitive that creates an object with a specified hash atomically, which is safer than adding a separate setIdHash: primitive; VW has the latter.

Like VisualWorks, Self uses 30 bit integers with a 00 tag (which makes
detagging/retagging unnecessary for addition, subtraction and bitwise
logical operations).

FYI, VW does not use 00 for SmallIntegers; it uses 00 for objects.  So it does have to detag for certain operations.  But of course it optimizes addition/subtraction by only detagging one of the two values so it doesn't have to retag.

Self, Strongtalk and V8 all do use 00.
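
A sketch of the one-sided detagging trick in C. The tag value 3 is purely illustrative and is not a statement about VisualWorks' actual encoding; the function names are invented and overflow checks are left out:

```c
#include <assert.h>
#include <stdint.h>

#define INT_TAG 3   /* assumed nonzero SmallInteger tag; illustrative only */

int32_t box_int(int32_t v)
{
    assert(v >= -(1 << 29) && v <= (1 << 29) - 1);
    return v * 4 + INT_TAG;
}

int32_t unbox_int(int32_t o)
{
    assert((o & 3) == INT_TAG);
    return (o - INT_TAG) / 4;
}

/* ((4a + t) - t) + (4b + t) == 4(a + b) + t: strip the tag from one
   operand only and the sum comes out correctly tagged. */
int32_t add_boxed(int32_t a, int32_t b) { return (a - INT_TAG) + b; }

/* (4a + t) - ((4b + t) - t) == 4(a - b) + t: the same trick for subtraction. */
int32_t sub_boxed(int32_t a, int32_t b) { return a - (b - INT_TAG); }
```

Compared with a 00 tag this costs one extra subtraction per operation, which can often be folded into a constant.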

The other tag values represent 30 bit floats,
object pointers (you always use a constant offset with these anyway, so
the detagging can be built into that constant) and object headers. The
memory is divided into segments (generations) and each segment stores
tagged data from the bottom and binary data from the top. ByteVectors
are normal tagged objects with a SmallInteger pointing to the actual
bytes.

The idea of a tag pattern for object headers is that you can "flatten"
the memory scanning operations. You just scan from top to the limit
until you find what you were looking for and then back up to the
previous header to see what object contains that oop. This can be many
times faster than a objects do: [ :obj | obj fieldsDo: [ :oop | ....]]
nested loop.

Yes, this is neat.  They use it in become operations, which are very common as slots are added and removed, right?
 
For my old RISC42 design I came up with the idea of having the top two
bits the same to indicate SmallIntegers. This is hard to check in
software, but in hardware is just a two input XOR gate. This allows you
to avoid detagging not only for the operations I mentioned above, but
also for multiplies, divides, left shift and signed right shift too. A
few weeks ago I found out that the old Swamp Smalltalk computer from
1986 used exactly the same scheme (the two patterns where the top two
bits were different were used for oops and for pointers to context
objects).

-- Jecel



Re: Immediates

Jecel Assumpcao Jr
 
Eliot Miranda wrote on Fri, 8 May 2009 15:18:32 -0700

> One of the things I think would really help here is to have a way of
> assigning the id-hash of a Symbol based on its string hash. Then
> MethodDictionaries and the like don't have to be rehashed on load.
> I could imagine a new:withHash: primitive that creates an object with
> a specified hash atomically, which is safer than adding a separate
> setIdHash: primitive; VW has the latter.

This would be great for merging binary packages with the current Symbols
that map 1 to 1 to Strings. I have to confess that this has not been a
priority for me in my designs since I have always been interested in schemes
which map a single Symbol to multiple Strings. The idea was to allow
Smalltalk to be written in languages other than English. I know that a
lot of people are very much against this idea, but in my experience it
will be an important factor in the future.

> > [header tag for quick memory scans]
>
> Yes, this is neat.  They use it in become operations, which are very
> common as slots are added and removed, right?

Yes, though adding and removing slots is not common when normal
applications are running, only during development. And only "data slots"
require this as constant slots and methods cause the object to get a new
"map" (hidden class, roughly) rather than changing the object's size.

-- Jecel
