Float hierarchy for 64-bit Spur

Eliot Miranda-2

Float hierarchy for 64-bit Spur

Hi All,

64-bit Spur can usefully provide an immediate float, a 61-bit subset of the ieee double precision float. The scheme steals bits from the mantissa to use for the immediate's 3-bit tag pattern. So values have the same precision as ieee doubles, but can only represent the subset with exponents between 10^-38 and 10^38, the single-precision range. The issue here is how to organize the class hierarchy.

The approach that looks best to me is to modify class Float to be an abstract class, and add two subclasses, BoxedFloat and SmallFloat, such that existing boxed instances of Float outside the SmallFloat range will become instances of BoxedFloat and instances within that range will be replaced by references to the relevant SmallFloat.

With this approach ...

- Float pi etc can still be used, even though they will answer instances of SmallFloat. But tests such as "self assert: result class == Float." will need to be rewritten to e.g. "self assert: result isFloat".

- BoxedFloat and SmallFloat will not be mentioned much at all since floats print themselves literally, and so the fact that the classes have changed won't be obvious.

- the boxed Float primitives (receiver is a boxed float) live in BoxedFloat and the immediate ones live in SmallFloat. Making SmallFloat a subclass of Float poses problems for all the primitives that do a super send to retry, since the boxed Float prims will be above the unboxed ones and so the boxed ones would have to test for an immediate receiver.

An alternative, that VW took (because it has both Float and Double) is to add a superclass, e.g. LimitedPrecisionReal, move most of the methods into it, and keep Float as Float, and add SmallFloat as a subclass of LimitedPrecisionReal. Then while class-side methods such as pi would likely be implemented in LimitedPrecisionReal class, sends to Float to access them find them via inheritance. An automatic reorganization which moves only primitives out of LimitedPrecisionReal is easy to write.

Thoughts?

best,

Eliot

David T. Lewis

Re: Float hierarchy for 64-bit Spur

On Thu, Nov 20, 2014 at 05:51:42PM -0800, Eliot Miranda wrote:

> Hi All,
>
> 64-bit Spur can usefully provide an immediate float, a 61-bit subset of
> the ieee double precision float. The scheme steals bits from the mantissa
> to use for the immediate's 3-bit tag pattern. So values have the same
> precision as ieee doubles, but can only represent the subset with exponents
> between 10^-38 and 10^38, the single-precision range. The issue here is
> how to organize the class hierarchy.
>
> The approach that looks best to me is to modify class Float to be an
> abstract class, and add two subclasses, BoxedFloat and SmallFloat, such
> that existing boxed instances of Float outside the SmallFloat range will
> become instances of BoxedFloat and instances within that range will be
> replaced by references to the relevant SmallFloat.
>
> With this approach ...
>
> - Float pi etc can still be used, even though they will answer instances of
> SmallFloat. But tests such as "self assert: result class == Float." will
> need to be rewritten to e.g. "self assert: result isFloat".
>
> - BoxedFloat and SmallFloat will not be mentioned much at all since floats
> print themselves literally, and so the fact that the classes have changed
> won't be obvious.
>
> - the boxed Float primitives (receiver is a boxed float) live in BoxedFloat
> and the immediate ones live in SmallFloat. Making SmallFloat a subclass of
> Float poses problems for all the primitives that do a super send to retry,
> since the boxed Float prims will be above the unboxed ones and so the boxed
> ones would have to test for an immediate receiver.
>
>
> An alternative, that VW took (because it has both Float and Double) is to
> add a superclass, e.g. LimitedPrecisionReal, move most of the methods into
> it, and keep Float as Float, and add SmallFloat as a subclass of
> LimitedPrecisionReal. Then while class-side methods such as pi would
> likely be implemented in LimitedPrecisionReal class, sends to Float to
> access them find them via inheritance. An automatic reorganization which
> moves only primitives out of LimitedPrecisionReal is easy to write.
>
> Thoughts?

I have always felt that the mapping of Float to 64-bit double and FloatArray
to 32-bit float is awkward. It may be that 32-bit floats are becoming less
relevant nowadays, but if short float values are still important, then it
would be nice to be able to represent them directly. I like the idea of having
a Float class and a Double class to represent the two most common representations.
A class hierarchy that could potentially support this sounds like a good idea to me.

I have no experience with VW, but a LimitedPrecisionReal hierarchy sounds like a
reasonable approach.

Dave

Bert Freudenberg

Re: Float hierarchy for 64-bit Spur

On 21.11.2014, at 04:19, David T. Lewis <[hidden email]> wrote:

> On Thu, Nov 20, 2014 at 05:51:42PM -0800, Eliot Miranda wrote:
>> Hi All,
>>
>> 64-bit Spur can usefully provide an immediate float, a 61-bit subset of
>> the ieee double precision float. The scheme steals bits from the mantissa
>> to use for the immediate's 3-bit tag pattern. So values have the same
>> precision as ieee doubles, but can only represent the subset with exponents
>> between 10^-38 and 10^38, the single-precision range. The issue here is
>> how to organize the class hierarchy.
>>
>> The approach that looks best to me is to modify class Float to be an
>> abstract class, and add two subclasses, BoxedFloat and SmallFloat, such
>> that existing boxed instances of Float outside the SmallFloat range will
>> become instances of BoxedFloat and instances within that range will be
>> replaced by references to the relevant SmallFloat.
>>
>> With this approach ...
>>
>> - Float pi etc can still be used, even though they will answer instances of
>> SmallFloat. But tests such as "self assert: result class == Float." will
>> need to be rewritten to e.g. "self assert: result isFloat".
>>
>> - BoxedFloat and SmallFloat will not be mentioned much at all since floats
>> print themselves literally, and so the fact that the classes have changed
>> won't be obvious.
>>
>> - the boxed Float primitives (receiver is a boxed float) live in BoxedFloat
>> and the immediate ones live in SmallFloat. Making SmallFloat a subclass of
>> Float poses problems for all the primitives that do a super send to retry,
>> since the boxed Float prims will be above the unboxed ones and so the boxed
>> ones would have to test for an immediate receiver.
>>
>>
>> An alternative, that VW took (because it has both Float and Double) is to
>> add a superclass, e.g. LimitedPrecisionReal, move most of the methods into
>> it, and keep Float as Float, and add SmallFloat as a subclass of
>> LimitedPrecisionReal. Then while class-side methods such as pi would
>> likely be implemented in LimitedPrecisionReal class, sends to Float to
>> access them find them via inheritance. An automatic reorganization which
>> moves only primitives out of LimitedPrecisionReal is easy to write.
>>
>> Thoughts?
>
> I have always felt that the mapping of Float to 64-bit double and FloatArray
> to 32-bit float is awkward. It may be that 32-bit floats are becoming less
> relevant nowadays, but if short float values are still important, then it
> would be nice to be able to represent them directly. I like the idea of having
> a Float class and a Double class to represent the two most common representations.
> A class hierarchy that could potentially support this sounds like a good idea to me.
>
> I have no experience with VW, but a LimitedPrecisionReal hierarchy sounds like a
> reasonable approach.
>
> Dave

I'd suggest BoxedDouble and ImmediateDouble as names for the concrete subclasses (*). Names do mean something. (**)

You're right about the FloatArray confusion. However, note that the IEEE standard calls it single and double. It's only C using "float" to mean "single precision".

I'd name the abstract superclass Float, for readability, and for the isFloat test etc. Also: "Float pi" reads a lot nicer than anything else. I don't see the need for a deep LimitedPrecisionReal - Float - BoxedDouble/ImmediateDouble hierarchy now.

If we ever add single-precision floats, we should name them BoxedSingle and ImmediateSingle. At that point we might want a Single superclass and a LimitedPrecisionReal supersuperclass, but we can cross that bridge when we get there.

- Bert -

(*) Since we're not going to see the class names often, we could even spell it out as BoxedDoublePrecisionFloat and ImmediateDoublePrecisionFloat. Only half joking. It would make the relation to the abstract Float very clear.

(**) We could also try to make the names googleable. I was surprised to not get a good hit for "boxed immediate". Only "boxed unboxed" finds it. Maybe there are two better words?

J. Vuletich (mail lists)

Re: Float hierarchy for 64-bit Spur

Quoting Bert Freudenberg <[hidden email]>:

> On 21.11.2014, at 04:19, David T. Lewis <[hidden email]> wrote:
>
>> On Thu, Nov 20, 2014 at 05:51:42PM -0800, Eliot Miranda wrote:
>>> Hi All,
>>>
>>> 64-bit Spur can usefully provide an immediate float, a 61-bit subset of
>>> the ieee double precision float. The scheme steals bits from the mantissa
>>> to use for the immediate's 3-bit tag pattern. So values have the same
>>> precision as ieee doubles, but can only represent the subset with exponents
>>> between 10^-38 and 10^38, the single-precision range. The issue here is
>>> how to organize the class hierarchy.
>>>
>>> The approach that looks best to me is to modify class Float to be an
>>> abstract class, and add two subclasses, BoxedFloat and SmallFloat, such
>>> that existing boxed instances of Float outside the SmallFloat range will
>>> become instances of BoxedFloat and instances within that range will be
>>> replaced by references to the relevant SmallFloat.
>>>
>>> With this approach ...
>>>
>>> - Float pi etc can still be used, even though they will answer instances of
>>> SmallFloat. But tests such as "self assert: result class == Float." will
>>> need to be rewritten to e.g. "self assert: result isFloat".
>>>
>>> - BoxedFloat and SmallFloat will not be mentioned much at all since floats
>>> print themselves literally, and so the fact that the classes have changed
>>> won't be obvious.
>>>
>>> - the boxed Float primitives (receiver is a boxed float) live in BoxedFloat
>>> and the immediate ones live in SmallFloat. Making SmallFloat a subclass of
>>> Float poses problems for all the primitives that do a super send to retry,
>>> since the boxed Float prims will be above the unboxed ones and so the boxed
>>> ones would have to test for an immediate receiver.
>>>
>>>
>>> An alternative, that VW took (because it has both Float and Double) is to
>>> add a superclass, e.g. LimitedPrecisionReal, move most of the methods into
>>> it, and keep Float as Float, and add SmallFloat as a subclass of
>>> LimitedPrecisionReal. Then while class-side methods such as pi would
>>> likely be implemented in LimitedPrecisionReal class, sends to Float to
>>> access them find them via inheritance. An automatic reorganization which
>>> moves only primitives out of LimitedPrecisionReal is easy to write.
>>>
>>> Thoughts?
>>
>> I have always felt that the mapping of Float to 64-bit double and FloatArray
>> to 32-bit float is awkward. It may be that 32-bit floats are becoming less
>> relevant nowadays, but if short float values are still important, then it
>> would be nice to be able to represent them directly. I like the
>> idea of having
>> a Float class and a Double class to represent the two most common
>> representations.
>> A class hierarchy that could potentially support this sounds like a
>> good idea to me.
>>
>> I have no experience with VW, but a LimitedPrecisionReal hierarchy
>> sounds like a
>> reasonable approach.
>>
>> Dave
>
> I'd suggest BoxedDouble and ImmediateDouble as names for the
> concrete subclasses (*). Names do mean something. (**)
>
> You're right about the FloatArray confusion. However, note that the
> IEEE standard calls it single and double. It's only C using "float"
> to mean "single precision".
>
> I'd name the abstract superclass Float, for readability, and the
> isFloat test etc. Also: "Float pi" reads a lot nicer than anything
> else. I don't see the need for having a deep LimitedPrecisionReal -
> Float - BoxedDouble/ImmediateDouble deep hierarchy now.
>
> If we ever add single-precision floats, we should name them
> BoxedSingle and ImmediateSingle. At that point we might want a
> Single superclass and a LimitedPrecisionReal supersuperclass, but we
> can cross that bridge when we get there.
>
> - Bert -
>
> (*) Since we're not going to see the class names often, we could
> even spell it out as BoxedDoublePrecisionFloat and
> ImmediateDoublePrecisionFloat. Only half joking. It would make the
> relation to the abstract Float very clear.
>
> (**) We could also try to make the names googleable. I was surprised
> to not get a good hit for "boxed immediate". Only "boxed unboxed"
> finds it. Maybe there are two better words?

I very much agree with Bert. But I'd suggest SmallDouble instead of
ImmediateDouble for consistency with SmallInteger.

Cheers,
Juan Vuletich

Ben Coman

Re: [Vm-dev] Re: [squeak-dev] Float hierarchy for 64-bit Spur

In reply to this post by Bert Freudenberg

Bert Freudenberg wrote:

> On 21.11.2014, at 04:19, David T. Lewis <[hidden email]> wrote:
>
>> On Thu, Nov 20, 2014 at 05:51:42PM -0800, Eliot Miranda wrote:
>>> Hi All,
>>>
>>> 64-bit Spur can usefully provide an immediate float, a 61-bit subset of
>>> the ieee double precision float. The scheme steals bits from the mantissa
>>> to use for the immediate's 3-bit tag pattern. So values have the same
>>> precision as ieee doubles, but can only represent the subset with exponents
>>> between 10^-38 and 10^38, the single-precision range. The issue here is
>>> how to organize the class hierarchy.
>>>
>>> The approach that looks best to me is to modify class Float to be an
>>> abstract class, and add two subclasses, BoxedFloat and SmallFloat, such
>>> that existing boxed instances of Float outside the SmallFloat range will
>>> become instances of BoxedFloat and instances within that range will be
>>> replaced by references to the relevant SmallFloat.
>>>
>>> With this approach ...
>>>
>>> - Float pi etc can still be used, even though they will answer instances of
>>> SmallFloat. But tests such as "self assert: result class == Float." will
>>> need to be rewritten to e.g. "self assert: result isFloat".
>>>
>>> - BoxedFloat and SmallFloat will not be mentioned much at all since floats
>>> print themselves literally, and so the fact that the classes have changed
>>> won't be obvious.
>>>
>>> - the boxed Float primitives (receiver is a boxed float) live in BoxedFloat
>>> and the immediate ones live in SmallFloat. Making SmallFloat a subclass of
>>> Float poses problems for all the primitives that do a super send to retry,
>>> since the boxed Float prims will be above the unboxed ones and so the boxed
>>> ones would have to test for an immediate receiver.
>>>
>>>
>>> An alternative, that VW took (because it has both Float and Double) is to
>>> add a superclass, e.g. LimitedPrecisionReal, move most of the methods into
>>> it, and keep Float as Float, and add SmallFloat as a subclass of
>>> LimitedPrecisionReal. Then while class-side methods such as pi would
>>> likely be implemented in LimitedPrecisionReal class, sends to Float to
>>> access them find them via inheritance. An automatic reorganization which
>>> moves only primitives out of LimitedPrecisionReal is easy to write.
>>>
>>> Thoughts?
>> I have always felt that the mapping of Float to 64-bit double and FloatArray
>> to 32-bit float is awkward. It may be that 32-bit floats are becoming less
>> relevant nowadays, but if short float values are still important, then it
>> would be nice to be able to represent them directly. I like the idea of having
>> a Float class and a Double class to represent the two most common representations.
>> A class hierarchy that could potentially support this sounds like a good idea to me.
>>
>> I have no experience with VW, but a LimitedPrecisionReal hierarchy sounds like a
>> reasonable approach.
>>
>> Dave
>
> I'd suggest BoxedDouble and ImmediateDouble as names for the concrete subclasses (*). Names do mean something. (**)
>

This is a nice idea, except we have the legacy of SmallInteger and
LargeInteger, and I don't like the inconsistency of Float not following
the same rule. The boxing/unboxing can be covered in the class comment.
Unless you want to change to BoxedInteger and ImmediateInteger ?

cheers -ben


Bert Freudenberg

Re: Float hierarchy for 64-bit Spur

In reply to this post by J. Vuletich (mail lists)

On 21.11.2014, at 13:29, J. Vuletich (mail lists) <[hidden email]> wrote:

> Quoting Bert Freudenberg <[hidden email]>:
>>
>> I'd suggest BoxedDouble and ImmediateDouble as names for the concrete subclasses (*). Names do mean something. (**)
>>
>> You're right about the FloatArray confusion. However, note that the IEEE standard calls it single and double. It's only C using "float" to mean "single precision".
>>
>> I'd name the abstract superclass Float, for readability, and the isFloat test etc. Also: "Float pi" reads a lot nicer than anything else. I don't see the need for having a deep LimitedPrecisionReal - Float - BoxedDouble/ImmediateDouble deep hierarchy now.
>>
>> If we ever add single-precision floats, we should name them BoxedSingle and ImmediateSingle. At that point we might want a Single superclass and a LimitedPrecisionReal supersuperclass, but we can cross that bridge when we get there.
>>
>> - Bert -
>>
>> (*) Since we're not going to see the class names often, we could even spell it out as BoxedDoublePrecisionFloat and ImmediateDoublePrecisionFloat. Only half joking. It would make the relation to the abstract Float very clear.
>>
>> (**) We could also try to make the names googleable. I was surprised to not get a good hit for "boxed immediate". Only "boxed unboxed" finds it. Maybe there are two better words?
>
> I very much agree with Bert. But I'd suggest SmallDouble instead of ImmediateDouble for consistency with SmallInteger.

Then it would have to be LargeDouble for consistency with LargeInteger, too. Which I don't find compelling.

Also, with the 64 bit format we get many more immediate objects. There already are immediate integers and characters, floats will be the third, there could be more, like immediate points. For those, the small/large distinction does not make sense.

Maybe Eliot's idea of keeping "Float" in the name was best, but instead of "small" use "immediate":

Float - BoxedFloat - ImmediateFloat

A Float is either a BoxedFloat or an ImmediateFloat, depending on the magnitude of its exponent.

- Bert -

Bert Freudenberg

Re: Float hierarchy for 64-bit Spur

In reply to this post by Eliot Miranda-2

On 21.11.2014, at 02:51, Eliot Miranda <[hidden email]> wrote:

> Hi All,
>
> 64-bit Spur can usefully provide an immediate float, a 61-bit subset of the ieee double precision float. The scheme steals bits from the mantissa to use for the immediate's 3-bit tag pattern. So values have the same precision as ieee doubles, but can only represent the subset with exponents between 10^-38 and 10^38, the single-precision range.

This is worded confusingly. It sounds like the mantissa has 3 bits less, which would make it less precise.

Here is how I understood it: The mantissa is stored with its full 52 bits of precision (*). But only the lower 8 bits of the 11-bit exponent are stored. If the upper 3 bits of the exponent are needed, then a boxed float is created.

I guess I know what you meant, that it is the 3 lowest significant bits in an oop which are used for tagging immediate objects, and in an IEEE double that is part of the mantissa. But these 3 bits are not lost, but moved elsewhere (namely where the 3 highest significant bits of the exponent used to be stored).

Did I understand correctly? You haven't pushed the code yet so I couldn't verify.

- Bert -

(*) http://en.wikipedia.org/wiki/Double-precision_floating-point_format

Tobias Pape

Re: Float hierarchy for 64-bit Spur

In reply to this post by Bert Freudenberg

Hi,

On 21.11.2014, at 13:44, Bert Freudenberg <[hidden email]> wrote:

> On 21.11.2014, at 13:29, J. Vuletich (mail lists) <[hidden email]> wrote:
>
>> Quoting Bert Freudenberg <[hidden email]>:
>>>
>>> I'd suggest BoxedDouble and ImmediateDouble as names for the concrete subclasses (*). Names do mean something. (**)
>>>
>>> You're right about the FloatArray confusion. However, note that the IEEE standard calls it single and double. It's only C using "float" to mean "single precision".
>>>
>>> I'd name the abstract superclass Float, for readability, and the isFloat test etc. Also: "Float pi" reads a lot nicer than anything else. I don't see the need for having a deep LimitedPrecisionReal - Float - BoxedDouble/ImmediateDouble deep hierarchy now.
>>>
>>> If we ever add single-precision floats, we should name them BoxedSingle and ImmediateSingle. At that point we might want a Single superclass and a LimitedPrecisionReal supersuperclass, but we can cross that bridge when we get there.
>>>
>>> - Bert -
>>>
>>> (*) Since we're not going to see the class names often, we could even spell it out as BoxedDoublePrecisionFloat and ImmediateDoublePrecisionFloat. Only half joking. It would make the relation to the abstract Float very clear.
>>>
>>> (**) We could also try to make the names googleable. I was surprised to not get a good hit for "boxed immediate". Only "boxed unboxed" finds it. Maybe there are two better words?
>>
>> I very much agree with Bert. But I'd suggest SmallDouble instead of ImmediateDouble for consistency with SmallInteger.
>
> Then it would have to be LargeDouble for consistency with LargeInteger, too. Which I don't find compelling.
>
> Also, with the 64 bit format we get many more immediate objects. There already are immediate integers and characters, floats will be the third, there could be more, like immediate points. For those, the small/large distinction does not make sense.
>
> Maybe Eliot's idea of keeping "Float" in the name was best, but instead of "small" use "immediate":
>
> Float - BoxedFloat - ImmediateFloat
>
> A Float is either a BoxedFloat or an ImmediateFloat, depending on the magnitude of its exponent.

I don't like the idea of putting a VM/Storage detail into the Class name.
The running system itself does not care about whether Floats or Integers are
boxed or immediate.
For example in RSqueakVM (aka SPy) there is no immediate
Integer whatsoever. Yes, tagged ints are read during image startup but they
aren't subsequently represented as immediates or tagged ints after that.

Just as input, in the Racket language and other Schemes,
the equivalent to our SmallInteger/LargeInteger is fixnum/bignum
and for floats they have flonums and "extflonums" (80bit).

Best
-Tobias

J. Vuletich (mail lists)

Re: Float hierarchy for 64-bit Spur

In reply to this post by Bert Freudenberg

Quoting Bert Freudenberg <[hidden email]>:

> On 21.11.2014, at 13:29, J. Vuletich (mail lists)
> <[hidden email]> wrote:
>
>> Quoting Bert Freudenberg <[hidden email]>:
>>>
>>> I'd suggest BoxedDouble and ImmediateDouble as names for the
>>> concrete subclasses (*). Names do mean something. (**)
>>>
>>> You're right about the FloatArray confusion. However, note that
>>> the IEEE standard calls it single and double. It's only C using
>>> "float" to mean "single precision".
>>>
>>> I'd name the abstract superclass Float, for readability, and the
>>> isFloat test etc. Also: "Float pi" reads a lot nicer than anything
>>> else. I don't see the need for having a deep LimitedPrecisionReal
>>> - Float - BoxedDouble/ImmediateDouble deep hierarchy now.
>>>
>>> If we ever add single-precision floats, we should name them
>>> BoxedSingle and ImmediateSingle. At that point we might want a
>>> Single superclass and a LimitedPrecisionReal supersuperclass, but
>>> we can cross that bridge when we get there.
>>>
>>> - Bert -
>>>
>>> (*) Since we're not going to see the class names often, we could
>>> even spell it out as BoxedDoublePrecisionFloat and
>>> ImmediateDoublePrecisionFloat. Only half joking. It would make the
>>> relation to the abstract Float very clear.
>>>
>>> (**) We could also try to make the names googleable. I was
>>> surprised to not get a good hit for "boxed immediate". Only "boxed
>>> unboxed" finds it. Maybe there are two better words?
>>
>> I very much agree with Bert. But I'd suggest SmallDouble instead of
>> ImmediateDouble for consistency with SmallInteger.
>
> Then it would have to be LargeDouble for consistency with
> LargeInteger, too. Which I don't find compelling.

Please no. 'Large' in LargeInteger means unlimited or at least
extended range. These won't be 'extended' doubles (like, for example,
C 'long double'). They would be plain standard ieee Double. A
LargeDouble could perhaps be an arbitrary precision Double or such,
some day.

> Also, with the 64 bit format we get many more immediate objects.
> There already are immediate integers and characters, floats will be
> the third, there could be more, like immediate points. For those,
> the small/large distinction does not make sense.

That's a point, sure. But the parallels between SmallInteger and
SmallDouble should be explicit.

> Maybe Eliot's idea of keeping "Float" in the name was best, but
> instead of "small" use "immediate":
>
> Float - BoxedFloat - ImmediateFloat
>
> A Float is either a BoxedFloat or an ImmediateFloat, depending on
> the magnitude of its exponent.
>
> - Bert -

Again, please no. Float means 32 bit single precision for too many
people out there. It means that in our own FloatArrays. Doubles are
Doubles.

To me the best option is SmallDouble and BoxedDouble or simply Double.

Cheers,
Juan Vuletich

Bert Freudenberg

Re: Float hierarchy for 64-bit Spur

To be abstract, or to be concrete, that is the question.

Coming back to Eliot's proposal:

> modify class Float to be an abstract class, and add two subclasses, BoxedFloat and SmallFloat, such that existing boxed instances of Float outside the SmallFloat range will become instances of BoxedFloat and instances within that range will be replaced by references to the relevant SmallFloat.
> [...]
> An alternative [...] is to add a superclass, e.g. LimitedPrecisionReal, move most of the methods into it, and keep Float as Float, and add SmallFloat as a subclass of LimitedPrecisionReal.

Float
|
+------- BoxedFloat
|
+------- SmallFloat

LimitedPrecisionReal
|
+------- Float
|
+------- SmallFloat

The actual question was if the class named "Float" (as used in expressions like "Float pi") should be concrete or abstract.

I strongly agree with Eliot's assessment that making Float the abstract superclass is best. What we name the two concrete subclasses is bikeshedding, and I trust Eliot to pick something not too unreasonable.

- Bert -

Bert Freudenberg

Re: Float hierarchy for 64-bit Spur

In reply to this post by Tobias Pape

On 21.11.2014, at 13:53, Tobias Pape <[hidden email]> wrote:

> On 21.11.2014, at 13:44, Bert Freudenberg <[hidden email]> wrote:
>> Also, with the 64 bit format we get many more immediate objects. There already are immediate integers and characters, floats will be the third, there could be more, like immediate points. For those, the small/large distinction does not make sense.
>>
>> Maybe Eliot's idea of keeping "Float" in the name was best, but instead of "small" use "immediate":
>>
>> Float - BoxedFloat - ImmediateFloat
>>
>> A Float is either a BoxedFloat or an ImmediateFloat, depending on the magnitude of its exponent.
>
> I don't like the idea of putting a VM/Storage detail into the Class name.
> The running system itself does not care about whether Floats or Integers are
> boxed or immediate.

Good point. Do you have a suggestion for names reflecting that?

- Bert -

Tobias Pape

Re: Float hierarchy for 64-bit Spur

On 21.11.2014, at 15:30, Bert Freudenberg <[hidden email]> wrote:

>
> On 21.11.2014, at 13:53, Tobias Pape <[hidden email]> wrote:
>
>> On 21.11.2014, at 13:44, Bert Freudenberg <[hidden email]> wrote:
>>> Also, with the 64 bit format we get many more immediate objects. There already are immediate integers and characters, floats will be the third, there could be more, like immediate points. For those, the small/large distinction does not make sense.
>>>
>>> Maybe Eliot's idea of keeping "Float" in the name was best, but instead of "small" use "immediate":
>>>
>>> Float - BoxedFloat - ImmediateFloat
>>>
>>> A Float is either a BoxedFloat or an ImmediateFloat, depending on the magnitude of its exponent.
>>
>> I don't like the idea of putting a VM/Storage detail into the Class name.
>> The running system itself does not care about whether Floats or Integers are
>> boxed or immediate.
>
> Good point. Do you have a suggestion for names reflecting that?

First: I think it is possible to have both SmallInteger/Large*Integer as well
as all Float stuff combined such that we only have
- Integer
- Float
and the VM has to deal with internal stuff, i.e. representing small enough numbers
tagged and larger ones as boxed (which could, for example, mean to not be able
to access the boxed values from the image side…).
However, this is “Zukunftsmusik” or “ungelegte Eier” (Things to come or not even
considered).

Second: I think the small/large stuff is semantically correct, because that is what
it is, whether immediate or not:
- Integer: SmallInteger, LargeInteger
- Float: SmallFloat, LargeFloat
I don't think there's confusion about the single=float thing when you don't have
the name double somewhere.
Rationale against immediate in the name: Immediate/Non-Immediate is a means to
an end, which is, speed for small or “few” things: ints, floats, chars. When you
make something different immediate — just for fun: very short ascii strings like
"hello" stored as 0x000068656C6C6F04 and 04 being the tag — you shouldn't name it
ImmediateString but TinyString, because that's why it is there, an optimization
for very tiny things.

HTH

Best
-Tobias


Eliot Miranda-2

Re: Float hierarchy for 64-bit Spur

In reply to this post by Bert Freudenberg

On Fri, Nov 21, 2014 at 4:47 AM, Bert Freudenberg <[hidden email]> wrote:

On 21.11.2014, at 02:51, Eliot Miranda <[hidden email]> wrote:

> Hi All,
>
> 64-bit Spur can usefully provide an immediate float, a 61-bit subset of the ieee double precision float. The scheme steals bits from the mantissa to use for the immediate's 3-bit tag pattern. So values have the same precision as ieee doubles, but can only represent the subset with exponents between 10^-38 and 10^38, the single-precision range.

This is worded confusingly. It sounds like the mantissa has 3 bits less, which would make it less precise.

It's not worded confusingly, it's just plain wrong :-/. Let me try again...

64-bit Spur can usefully provide an immediate float, a 61-bit subset of the ieee double precision float. The scheme steals 3 bits from the exponent to use for the immediate's 3-bit tag pattern. So values have the same precision as ieee doubles, but can only represent the subset with exponents between 10^-38 and 10^38, the single-precision range.

Here's the representation:

[8 bit exponent][52 bit mantissa][1 bit sign][3 bit tag]

This has the advantage that +/- zero are the only immediate float values that are less than or equal to fifteen. So to convert to a float:

- shift away tags

[000][8 bit exponent][52 bit mantissa][1 bit sign]

- if > 1 (i.e. non-zero)

add exponent offset:

[11 bit exponent][52 bit mantissa][sign bit]

- rotate by -1

[sign bit][11 bit exponent][52 bit mantissa]

And to encode:

- rotate by 1

[11 bit exponent][52 bit mantissa][sign bit]

- if > 1

subtract exponent offset

fail if <= 0

test against max value, fail if too big

- shift by 3 and add tag bits:

[8 bit exponent][52 bit mantissa][1 bit sign][3 bit tag]
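
The decode and encode steps above can be sketched in Python. This is only an illustration of the described bit shuffling, not the VM's code: the tag pattern 0b100, the exponent offset of 1023 - 127 = 896, and all function names are assumptions that the thread does not state. The sketch also rejects a rotated value of exactly 1 after the subtraction, because that pattern is already taken by negative zero; the steps above only say "fail if <= 0".

```python
import struct

MASK64 = (1 << 64) - 1
TAG_BITS = 3
SMALL_FLOAT_TAG = 0b100                 # assumed tag pattern, not given in the thread
EXPONENT_OFFSET = (1023 - 127) << 53    # assumed offset; exponent starts at bit 53 once rotated


def float_to_bits(x):
    """Raw 64-bit IEEE pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_float(bits):
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def encode_small_float(x):
    """Return the tagged immediate for x, or None if x needs a boxed float."""
    bits = float_to_bits(x)
    # rotate left by 1: [11 bit exponent][52 bit mantissa][sign bit]
    rot = ((bits << 1) & MASK64) | (bits >> 63)
    if rot > 1:                         # anything other than +/- zero
        rot -= EXPONENT_OFFSET
        if rot <= 1:                    # too small (or would collide with +/- zero)
            return None
        if rot >> 61:                   # exponent no longer fits in 8 bits
            return None
    # shift by 3 and add tag bits
    return (rot << TAG_BITS) | SMALL_FLOAT_TAG


def decode_small_float(oop):
    """Inverse of encode_small_float for a valid immediate."""
    v = oop >> TAG_BITS                 # shift away tags
    if v > 1:                           # non-zero: restore the full exponent
        v += EXPONENT_OFFSET
    bits = (v >> 1) | ((v & 1) << 63)   # rotate by -1: sign back to the top
    return bits_to_float(bits)
```

With these assumptions, values such as 1.0, 3.14 and 1e38 round-trip as immediates, while 1e300, 1e-300 and infinity make `encode_small_float` answer None, which is where a boxed float would be allocated instead.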

Here is how I understood it: The mantissa is stored with its full 52 bits of precision (*). But only the lower 8 bits of the 11-bit exponent are stored. If the upper 3 bits of the exponent are needed, then a boxed float is created.

I guess I know what you meant, that it is the 3 lowest significant bits in an oop which are used for tagging immediate objects, and in an IEEE double that is part of the mantissa. But these 3 bits are not lost, but moved elsewhere (namely where the 3 highest significant bits of the exponent used to be stored).

Did I understand correctly? You haven't pushed the code yet so I couldn't verify.

Yes, of course :-)

- Bert -

(*) http://en.wikipedia.org/wiki/Double-precision_floating-point_format

best,

Eliot

Eliot Miranda-2

Re: Float hierarchy for 64-bit Spur

In reply to this post by Tobias Pape

Hi Tobias,

On Fri, Nov 21, 2014 at 4:51 AM, Tobias Pape <[hidden email]> wrote:

Hi,

On 21.11.2014, at 13:44, Bert Freudenberg <[hidden email]> wrote:

> On 21.11.2014, at 13:29, J. Vuletich (mail lists) <[hidden email]> wrote:
>
>> Quoting Bert Freudenberg <[hidden email]>:
>>>
>>> I'd suggest BoxedDouble and ImmediateDouble as names for the concrete subclasses (*). Names do mean something. (**)
>>>
>>> You're right about the FloatArray confusion. However, note that the IEEE standard calls it single and double. It's only C using "float" to mean "single precision".
>>>
>>> I'd name the abstract superclass Float, for readability, and the isFloat test etc. Also: "Float pi" reads a lot nicer than anything else. I don't see the need for having a deep LimitedPrecisionReal - Float - BoxedDouble/ImmediateDouble hierarchy now.
>>>
>>> If we ever add single-precision floats, we should name them BoxedSingle and ImmediateSingle. At that point we might want a Single superclass and a LimitedPrecisionReal supersuperclass, but we can cross that bridge when we get there.
>>>
>>> - Bert -
>>>
>>> (*) Since we're not going to see the class names often, we could even spell it out as BoxedDoublePrecisionFloat and ImmediateDoublePrecisionFloat. Only half joking. It would make the relation to the abstract Float very clear.
>>>
>>> (**) We could also try to make the names googleable. I was surprised to not get a good hit for "boxed immediate". Only "boxed unboxed" finds it. Maybe there are two better words?
>>
>> I very much agree with Bert. But I'd suggest SmallDouble instead of ImmediateDouble for consistency with SmallInteger.
>
> Then it would have to be LargeDouble for consistency with LargeInteger, too. Which I don't find compelling.
>
> Also, with the 64 bit format we get many more immediate objects. There already are immediate integers and characters, floats will be the third, there could be more, like immediate points. For those, the small/large distinction does not make sense.
>
> Maybe Eliot's idea of keeping "Float" in the name was best, but instead of "small" use "immediate":
>
> Float - BoxedFloat - ImmediateFloat
>
> A Float is either a BoxedFloat or an ImmediateFloat, depending on the magnitude of its exponent.

I don't like the idea of putting a VM/Storage detail into the Class name.
The running system itself does not care about whether Floats or Integers are
boxed or immediate.

I disagree. I think at least Smalltalk-80 has a philosophy of lifting as much as possible out of the VM into the system, and hiding it from clients via encapsulation. So unlike many other VMs the compiler is in the system, the system explicitly separates SmallInteger, LargePositiveInteger and LargeNegativeInteger and implements large integer arithmetic with Smalltalk code that uses SmallIntegers. Note that the primitives are an optional extra optimization that the VM does not need to implement. So for me it is in keeping with the current system to use BoxedFloat and SmallFloat or BoxedDouble and SmallDouble.

This lifting things up provides us with an extremely malleable system. Pushing things down into the VM does the opposite.

For example in RSqueakVM (aka SPy) there is no immediate
Integer whatsoever. Yes, tagged ints are read during image startup but they
aren't subsequently represented as immediates or tagged ints after that.

Well, because it's implemented above RPython I guess it is using Python's bignum code directly. That's fine but it's a bit of a cheat.

Just as input, in the Racket language and other Schemes,
the equivalent to our SmallInteger/LargeInteger is fixnum/bignum
and for floats they have flonums and "extflonums" (80bit).

Best
-Tobias

thanks

best,

Eliot

Eliot Miranda-2

Re: Float hierarchy for 64-bit Spur

In reply to this post by Bert Freudenberg

On Fri, Nov 21, 2014 at 5:30 AM, Bert Freudenberg <[hidden email]> wrote:

To be abstract, or to be concrete, that is the question.

Coming back to Eliot's proposal:

> modify class Float to be an abstract class, and add two subclasses, BoxedFloat and SmallFloat, such that existing boxed instances of Float outside the SmallFloat range will become instances of BoxedFloat and instances within that range will be replaced by references to the relevant SmallFloat.
> [...]
> An alternative [...] is to add a superclass, e.g. LimitedPrecisionReal, move most of the methods into it, and keep Float as Float, and add SmallFloat as a subclass of LimitedPrecisionReal.

Float
|
+------- BoxedFloat
|
+------- SmallFloat

LimitedPrecisionReal
|
+------- Float
|
+------- SmallFloat

The actual question was if the class named "Float" (as used in expressions like "Float pi") should be concrete or abstract.

I strongly agree with Eliot's assessment that making Float the abstract superclass is best. What we name the two concrete subclasses is bikeshedding, and I trust Eliot to pick something not too unreasonable.

Good. I think I'll go with

Float
|
+------- BoxedDouble
|
+------- SmallDouble

ImmediateDouble is fine too, but I like the symmetry with SmallInteger.
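
To make the consequence of an abstract Float concrete, here is a rough Python analogy (not Squeak code). The factory `Float.of`, the helper `fits_small_double`, and the exponent window of 896 < e <= 896 + 255 are assumptions made up for the illustration; only the class names come from the thread. It shows why a test on the exact class stops working while an isFloat-style test keeps working.

```python
import struct


class Float:
    """Stands in for the abstract superclass; instances come from Float.of()."""

    def __init__(self, value):
        self.value = value

    @staticmethod
    def of(value):
        # Pick the concrete subclass from the value's exponent range.
        cls = SmallDouble if fits_small_double(value) else BoxedDouble
        return cls(value)

    def is_float(self):
        return True


class BoxedDouble(Float):
    """Heap-allocated double, used when the exponent is out of range."""


class SmallDouble(Float):
    """Immediate double, used when the exponent fits in 8 bits."""


def fits_small_double(value):
    """Assumed range test: +/- zero, or a biased exponent in (896, 1151]."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    if exponent == 0 and mantissa == 0:
        return True
    return 896 < exponent <= 896 + 255
```

Here `type(Float.of(3.14)) is Float` is false, whereas `isinstance(Float.of(3.14), Float)` and `Float.of(3.14).is_float()` are true, which mirrors rewriting "result class == Float" as "result isFloat".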

best,

Eliot

Eliot Miranda-2

Re: Float hierarchy for 64-bit Spur

In reply to this post by Tobias Pape

Hi Tobias,

On Fri, Nov 21, 2014 at 8:01 AM, Tobias Pape <[hidden email]> wrote:

On 21.11.2014, at 15:30, Bert Freudenberg <[hidden email]> wrote:

>
> On 21.11.2014, at 13:53, Tobias Pape <[hidden email]> wrote:
>
>> On 21.11.2014, at 13:44, Bert Freudenberg <[hidden email]> wrote:
>>> Also, with the 64 bit format we get many more immediate objects. There already are immediate integers and characters, floats will be the third, there could be more, like immediate points. For those, the small/large distinction does not make sense.
>>>
>>> Maybe Eliot's idea of keeping "Float" in the name was best, but instead of "small" use "immediate":
>>>
>>> Float - BoxedFloat - ImmediateFloat
>>>
>>> A Float is either a BoxedFloat or an ImmediateFloat, depending on the magnitude of its exponent.
>>
>> I don't like the idea of putting a VM/Storage detail into the Class name.
>> The running system itself does not care about whether Floats or Integers are
>> boxed or immediate.
>
> Good point. Do you have a suggestion for names reflecting that?

First: I think it is possible to have both SmallInteger/Large*Integer as well
as all Float stuff combined such that we only have
- Integer
- Float
and the VM has to deal with internal stuff, i.e. representing small enough numbers
tagged and larger ones as boxed (which could, for example, mean to not be able
to access the boxed values from the image side…).
However, this is “Zukunftsmusik” or “ungelegte Eier” (Things to come or not even
considered).

I don't find this compelling for reasons I've expressed earlier in the thread. Personally I think the VM shouldn't be in the business of hiding much. There are advantages to it hiding the machinery that connects contexts to stack frames and methods to machine code because that allows us to use the same system with very different VMs and that's hugely advantageous (see the Stack VM and SqueakJS for examples). But that doesn't for example hide contexts, it just optimizes their use.

Second: I think the small/large stuff is semantically correct, because that is what
it is, whether immediate or not:
- Integer: SmallInteger, LargeInteger
- Float: SmallFloat, LargeFloat
I don't think there's confusion about the single=float thing when you don't have
the name double somewhere.

Agreed.

Rationale against immediate in the name: Immediate/Non-Immediate is a means to
an end, which is, speed for small or “few” things: ints, floats, chars. When you
make something different immediate — just for fun: very short ascii strings like
"hello" stored as 0x000068656C6C6F04 and 04 being the tag — you shouldn't name it
ImmediateString but TinyString, because that's why it is there, an optimization
for very tiny things.

Agreed. But note that I will /not/ be pursuing things like immediate strings. IMO this is a bad idea. Whereas there are really compelling arguments for immediate integers, characters and floats, there aren't for strings or symbols. Most strings and most symbols are longer than 7 bytes:

(ByteSymbol allInstances collect: [:ea | ea size]) sum asFloat / ByteSymbol allInstances size
"=> 17.905990063082676"

(ByteString allInstances collect: [:ea | ea size]) sum asFloat / ByteString allInstances size
"=> 192.12565808504485"

So choosing this representation doesn't save much space and loses time because the more complex mixed representation is involved in many operations (e.g. replaceFrom:to:with:startingAt: is now way more complex).

In fact, I'm thinking that a 2 bit tag is probably better. AFAIA, since I implemented 64-bit VisualWorks with a 3 bit tag no one has added any new immediate types. Points don't have the necessary dynamic frequency and indeed points with floats may be very common in newer UI architectures. Making nil, true and false immediates doesn't have much benefit either; they're unique values, and unique addresses work just as well as immediates. Essentially, expanding the number of tagged types, and especially making the tagged type organization non-uniform (see e.g. Eliot Moss's VMs where nil, true, false have one organization, character has another and SmallInteger another one still) makes the decode bloat, which slows down message send. So I think for the moment I'll go with a 2 bit tag, giving us an even larger range for SmallDouble and SmallInteger, and keep the simple representation:

immediates

[62 bit value][2 bit tag]

non-immediates

[64 bit pointer (least 3 bits 0)] -> [8 bit slot count][2 gc bits][22 bit hash][3 gc bits][5 bit format][2 flag bits][22 bit class index]
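
As an illustration of the two layouts above, here is a small Python sketch that tags a 62-bit value and packs or unpacks the 64-bit object header. It assumes the header fields are listed most significant first (as in the float diagram earlier in the thread, where the tag comes last) and that a zero tag marks a pointer; the field names and the helper functions are invented for the example.

```python
# Header fields, most significant first, as (name, width in bits).
HEADER_FIELDS = [
    ("slot_count", 8),
    ("gc_bits_hi", 2),
    ("hash", 22),
    ("gc_bits_lo", 3),
    ("format", 5),
    ("flags", 2),
    ("class_index", 22),
]
assert sum(width for _, width in HEADER_FIELDS) == 64

TAG_BITS = 2
TAG_MASK = (1 << TAG_BITS) - 1


def pack_header(**values):
    """Build a 64-bit header word; fields that are not given default to 0."""
    header = 0
    for name, width in HEADER_FIELDS:
        value = values.get(name, 0)
        if value >> width:
            raise ValueError(f"{name} does not fit in {width} bits")
        header = (header << width) | value
    return header


def unpack_header(header):
    """Split a 64-bit header word into a dict of its fields."""
    fields = {}
    shift = 64
    for name, width in HEADER_FIELDS:
        shift -= width
        fields[name] = (header >> shift) & ((1 << width) - 1)
    return fields


def tag_immediate(value, tag):
    """[62 bit value][2 bit tag]; value is an unsigned 62-bit pattern."""
    if value >> 62 or tag >> TAG_BITS:
        raise ValueError("value or tag out of range")
    return (value << TAG_BITS) | tag


def untag_immediate(oop):
    return oop >> TAG_BITS, oop & TAG_MASK


def is_immediate(oop):
    # Assumption: aligned pointers have zero low bits, so a non-zero tag means immediate.
    return (oop & TAG_MASK) != 0
```

Because every oop is classified by masking the same two low bits, the decode stays uniform, which is the property the paragraph above argues for.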

best,

Eliot

Bert Freudenberg

Re: Float hierarchy for 64-bit Spur

On 21.11.2014, at 19:25, Eliot Miranda <[hidden email]> wrote:

In fact, I'm thinking that a 2 bit tag is probably better. AFAIA, since I implemented 64-bit VisualWorks with a 3 bit tag no one has added any new immediate types. Points don't have the necessary dynamic frequency and indeed points with floats may be very common in newer UI architectures. Making nil, true and false immediates doesn't have much benefit either; they're unique values, and unique addresses work just as well as immediates. Essentially, expanding the number of tagged types, and especially making the tagged type organization non-uniform (see e.g. Eliot Moss's VMs where nil, true, false have one organization, character has another and SmallInteger another one still) makes the decode bloat, which slows down message send. So I think for the moment I'll go with a 2 bit tag, giving us an even larger range for SmallDouble and SmallInteger, and keep the simple representation:

immediates
[62 bit value][2 bit tag]
non-immediates
[64 bit pointer (least 3 bits 0)] -> [8 bit slot count][2 gc bits][22 bit hash][3 gc bits][5 bit format][2 flag bits][22 bit class index]

I don't think that one additional bit will be helpful to either SmallInts or SmallDoubles. But having it can make for nice VM experiments. I'd reserve it.

- Bert -


Eliot Miranda-2

Re: Float hierarchy for 64-bit Spur

On Fri, Nov 21, 2014 at 10:36 AM, Bert Freudenberg <[hidden email]> wrote:

On 21.11.2014, at 19:25, Eliot Miranda <[hidden email]> wrote:

In fact, I'm thinking that a 2 bit tag is probably better. AFAIA, since I implemented 64-bit VisualWorks with a 3 bit tag no one has added any new immediate types. Points don't have the necessary dynamic frequency and indeed points with floats may be very common in newer UI architectures. Making nil, true and false immediates doesn't have much benefit either; they're unique values, and unique addresses work just as well as immediates. Essentially, expanding the number of tagged types, and especially making the tagged type organization non-uniform (see e.g. Eliot Moss's VMs where nil, true, false have one organization, character has another and SmallInteger another one still) makes the decode bloat, which slows down message send. So I think for the moment I'll go with a 2 bit tag, giving us an even larger range for SmallDouble and SmallInteger, and keep the simple representation:

immediates
[62 bit value][2 bit tag]
non-immediates
[64 bit pointer (least 3 bits 0)] -> [8 bit slot count][2 gc bits][22 bit hash][3 gc bits][5 bit format][2 flag bits][22 bit class index]

I don't think that one additional bit will be helpful to either SmallInts or SmallDoubles. But having it can make for nice VM experiments. I'd reserve it.

OK, less work too ;-)

best,

Eliot

Tobias Pape

Re: [Vm-dev] Re: [squeak-dev] Float hierarchy for 64-bit Spur

In reply to this post by Eliot Miranda-2

Hi Eliot

On 21.11.2014, at 19:06, Eliot Miranda <[hidden email]> wrote:

>
> Hi Tobias,
>
> On Fri, Nov 21, 2014 at 4:51 AM, Tobias Pape <[hidden email]> wrote:
>
> > Hi,
> >
> > On 21.11.2014, at 13:44, Bert Freudenberg <[hidden email]> wrote:
> >
> > > On 21.11.2014, at 13:29, J. Vuletich (mail lists) <[hidden email]> wrote:
> > >
> > >> Quoting Bert Freudenberg <[hidden email]>:
> > >>>
> > >>> I'd suggest BoxedDouble and ImmediateDouble as names for the concrete subclasses (*). Names do mean something. (**)
> > >>>
> > >>> You're right about the FloatArray confusion. However, note that the IEEE standard calls it single and double. It's only C using "float" to mean "single precision".
> > >>>
> > >>> I'd name the abstract superclass Float, for readability, and the isFloat test etc. Also: "Float pi" reads a lot nicer than anything else. I don't see the need for having a deep LimitedPrecisionReal - Float - BoxedDouble/ImmediateDouble hierarchy now.
> > >>>
> > >>> If we ever add single-precision floats, we should name them BoxedSingle and ImmediateSingle. At that point we might want a Single superclass and a LimitedPrecisionReal supersuperclass, but we can cross that bridge when we get there.
> > >>>
> > >>> - Bert -
> > >>>
> > >>> (*) Since we're not going to see the class names often, we could even spell it out as BoxedDoublePrecisionFloat and ImmediateDoublePrecisionFloat. Only half joking. It would make the relation to the abstract Float very clear.
> > >>>
> > >>> (**) We could also try to make the names googleable. I was surprised to not get a good hit for "boxed immediate". Only "boxed unboxed" finds it. Maybe there are two better words?
> > >>
> > >> I very much agree with Bert. But I'd suggest SmallDouble instead of ImmediateDouble for consistency with SmallInteger.
> > >
> > > Then it would have to be LargeDouble for consistency with LargeInteger, too. Which I don't find compelling.
> > >
> > > Also, with the 64 bit format we get many more immediate objects. There already are immediate integers and characters, floats will be the third, there could be more, like immediate points. For those, the small/large distinction does not make sense.
> > >
> > > Maybe Eliot's idea of keeping "Float" in the name was best, but instead of "small" use "immediate":
> > >
> > > Float - BoxedFloat - ImmediateFloat
> > >
> > > A Float is either a BoxedFloat or an ImmediateFloat, depending on the magnitude of its exponent.
> >
> > I don't like the idea of putting a VM/Storage detail into the Class name.
> > The running system itself does not care about whether Floats or Integers are
> > boxed or immediate.
> >
>
> I disagree. I think at least Smalltalk-80 has a philosophy of lifting as
> much as possible out of the VM into the system, and hiding it from clients via
> encapsulation. So unlike many other VMs the compiler is in the system, the
> system explicitly separates SmallInteger, LargePositiveInteger and
> LargeNegativeInteger and implements large integer arithmetic with Smalltalk
> code that uses SmallIntegers. Note that the primitives are an optional
> extra optimization that the VM does not need to implement. So for me it is
> in keeping with the current system to use BoxedFloat and SmallFloat or
> BoxedDouble and SmallDouble.
>
> This lifting things up provides us with an extremely malleable system.
> Pushing things down into the VM does the opposite.
>

I can understand. It is, however, a tradeoff between abstraction and malleability.
I'd rather see everything done in Smalltalk, yet without primitives exposing their
innards.

I think I just can't have my cake and eat it, too.

>
> > For example in RSqueakVM (aka SPy) there is no immediate
> > Integer whatsoever. Yes, tagged ints are read during image startup but they
> > aren't subsequently represented as immediates or tagged ints after that.
> >
>
> Well because it's implemented above RPython I guess it is using Python's
> bignum code directly. That's fine but it's a bit of a cheat.

I was talking of SmallInts here. There is no bignum code involved.
On the RPython side, SmallIntegers are just objects that have a field
that is not accessible from Smalltalk but keeps a machine-word integer.

>
>
> >
> > Just as input, in the Racket language and other Schemes,
> > the equivalent to our SmallInteger/LargeInteger is fixnum/bignum
> > and for floats they have flonums and "extflonums" (80bit).
> >
> > Best
> > -Tobias
>
>