The Trunk: Multilingual-ar.87.mcz


commits-2
Andreas Raab uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-ar.87.mcz

==================== Summary ====================

Name: Multilingual-ar.87
Author: ar
Time: 11 February 2010, 11:48:33.736 pm
UUID: ca61be1a-7e30-2840-9b77-9cc7ecfc2d3f
Ancestors: Multilingual-ar.86

Change EncodedCharSet>>digitValue: to EncodedCharSet>>digitValueOf:. Part 3/3.

=============== Diff against Multilingual-ar.86 ===============

Item was changed:
  ----- Method: EncodedCharSet class>>digitValueOf: (in category 'class methods') -----
  digitValueOf: char
  "Answer 0-9 if the receiver is $0-$9, 10-35 if it is $A-$Z, and < 0
  otherwise. This is used to parse literal numbers of radix 2-36."
 
  | value |
  value := char charCode.
+ value <= $9 asciiValue ifTrue:
+ [^value - $0 asciiValue].
+ value >= $A asciiValue ifTrue:
+ [value <= $Z asciiValue ifTrue: [^value - $A asciiValue + 10].
+ (value >= $a asciiValue and: [value <= $z asciiValue]) ifTrue:
+ [^value - $a asciiValue + 10]].
- value <= $9 asciiValue
- ifTrue: [^value - $0 asciiValue].
- value >= $A asciiValue
- ifTrue: [value <= $Z asciiValue ifTrue: [^value - $A asciiValue + 10]].
  ^ -1
  !

Item was changed:
  ----- Method: Unicode class>>digitValueOf: (in category 'class methods') -----
  digitValueOf: char
  "Answer 0-9 if the receiver is $0-$9, 10-35 if it is $A-$Z, and < 0
  otherwise. This is used to parse literal numbers of radix 2-36."
 
  | value |
  value := char charCode.
+ value <= $9 asciiValue ifTrue:
+ [^value - $0 asciiValue].
+ value >= $A asciiValue ifTrue:
+ [value <= $Z asciiValue ifTrue: [^value - $A asciiValue + 10].
+ (value >= $a asciiValue and: [value <= $z asciiValue]) ifTrue:
+ [^value - $a asciiValue + 10]].
- value <= $9 asciiValue
- ifTrue: [^value - $0 asciiValue].
- value >= $A asciiValue
- ifTrue: [value <= $Z asciiValue ifTrue: [^value - $A asciiValue + 10]].
 
  value > (DecimalProperty size - 1) ifTrue: [^ -1].
  ^ (DecimalProperty at: value+1)
  !

Item was removed:
- ----- Method: LanguageEnvironment class>>digitValue: (in category 'accessing') -----
- digitValue: char
-
- ^ Unicode digitValue: char.
- !

Item was removed:
- ----- Method: EncodedCharSet class>>digitValue: (in category 'class methods') -----
- digitValue: char
- "Answer 0-9 if the receiver is $0-$9, 10-35 if it is $A-$Z, and < 0
- otherwise. This is used to parse literal numbers of radix 2-36."
-
- | value |
- value := char charCode.
- value <= $9 asciiValue
- ifTrue: [^value - $0 asciiValue].
- value >= $A asciiValue
- ifTrue: [value <= $Z asciiValue ifTrue: [^value - $A asciiValue + 10]].
- ^ -1
- !

Item was removed:
- ----- Method: Unicode class>>digitValue: (in category 'class methods') -----
- digitValue: char
-
- | value |
- value := char charCode.
- value <= $9 asciiValue
- ifTrue: [^value - $0 asciiValue].
- value >= $A asciiValue
- ifTrue: [value <= $Z asciiValue ifTrue: [^value - $A asciiValue + 10]].
-
- value > (DecimalProperty size - 1) ifTrue: [^ -1].
- ^ (DecimalProperty at: value+1)
- !
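
As a rough illustration of what the renamed method answers after this change, here is a workspace sketch. The values in the comments are traced by hand from the EncodedCharSet version in the diff above, assuming that char charCode answers the ASCII code for these characters; they have not been run against an image, and the Unicode version may differ for non-alphanumeric characters because it consults DecimalProperty.

```smalltalk
"Workspace sketch: trace EncodedCharSet class>>digitValueOf: as shown in the diff.
 Evaluate each line with 'print it'."
| cs |
cs := EncodedCharSet.
(cs digitValueOf: $7).               "7: 55 <= $9 asciiValue, so 55 - 48"
(cs digitValueOf: $Z).               "35: upper-case branch, 90 - 65 + 10"
(cs digitValueOf: $z).               "35: lower-case branch added by this commit, 122 - 97 + 10"
(cs digitValueOf: $[).               "-1: lies between $Z and $a, falls through to ^ -1"
(cs digitValueOf: Character space).  "-16: below $0, still negative, so not a digit"
```

Callers therefore have to treat any negative answer, not only -1, as "not a digit".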



Re: The Trunk: Multilingual-ar.87.mcz

Nicolas Cellier
Of course, as Bert pointed out, this makes some tests fail.
If we follow this road, then we must also remove support for
reading/printing Floats in non-decimal bases.
I personally agree that this form is mostly educational and has no
major industrial utility.
As Squeak is also an educational tool, I will let others debate whether
this is a major feature or not...

I'm not at all in favour of decomposing the operation as you suggested
earlier, because this introduces additional round-off errors:
| a b |
a := SqNumberParser parse: '3r1.1e55'.
b := (SqNumberParser parse: '3r1.1') * (3 raisedTo: 55).
a - b
-> 3.4359738368e10

If you use current Squeak (Number>>#readFrom:), then the above forms are equivalent.
This is because Number>>#readFrom: uses exactly the broken
decomposition you suggested.
This decomposition is broken because it does not answer the nearest
floating-point value to the literal expression.
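
To make the loss of accuracy visible, the decomposed form can be spelled out step by step. This is only a sketch: the variable names are illustrative, and the final comparison should answer true only if SqNumberParser really answers the nearest Float, as claimed in this message.

```smalltalk
"Sketch: where the decomposed form loses accuracy. Variable names are illustrative."
| exact mantissa scale decomposed direct |
exact := 3r11 * (3 raisedTo: 54).             "exact Integer value of 3r1.1e55"
mantissa := SqNumberParser parse: '3r1.1'.    "rounding 1: 4/3 has no finite binary expansion"
scale := (3 raisedTo: 55) asFloat.            "rounding 2: 3^55 is odd and needs 88 bits"
decomposed := mantissa * scale.               "rounding 3: the Float product is rounded again"
direct := SqNumberParser parse: '3r1.1e55'.   "a single conversion of the whole literal"
"Compare both results against the exact value using exact Fractions."
(direct asTrueFraction - exact) abs <= (decomposed asTrueFraction - exact) abs
```

The comparison is done on asTrueFraction values so that the check itself introduces no further rounding.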

Proof:

3r1.1e55 is literally the same as 3r11e54, that is (3+1)*(3 raisedTo: 54).
Of course, this number cannot be represented exactly as a Float, it
has too many digits:
| a |
a := (3+1)*(3 raisedTo: 54).
a highBit - a lowBit
-> 85, which means an 85-bit mantissa is required (plus the implied leading
1), whereas Float has only 52 bits (plus the implied leading 1).

SqNumberParser answers a Float nearer to the exact literal than
Number>>readFrom: does:
((SqNumberParser parse: '3r1.1e55') asTrueFraction - ((3r11)*(3 raisedTo: 54))) abs
< ((Number readFrom: '3r1.1e55') asTrueFraction - ((3r11)*(3 raisedTo: 54))) abs
-> true

In fact, it answers the nearest Float, as you can check with predecessor and successor:
| a exact |
a := SqNumberParser parse: '3r1.1e55'.
exact := (3r11)*(3 raisedTo: 54).
self assert: (a asTrueFraction - exact) abs <= (a successor asTrueFraction - exact) abs.
self assert: (a asTrueFraction - exact) abs <= (a predecessor asTrueFraction - exact) abs.

This of course fails with Number>>readFrom:

You can also simply try
3r1.1e55 - 3r11.0e54
and the SqNumberParser form...
(SqNumberParser parse: '3r1.1e55') - (SqNumberParser parse: '3r11.0e54')

Maybe it's time to push the readFrom: changes I already pushed in Pharo...

Nicolas



Re: The Trunk: Multilingual-ar.87.mcz

Andreas.Raab
Nicolas Cellier wrote:
> This of course fail with Number>>readFrom:
>
> You can also simply try
> 3r1.1e55 - 3r11.0e54
> and the SqNumberParser form...
> (SqNumberParser parse: '3r1.1e55') - (SqNumberParser parse: '3r11.0e54')
>
> Maybe it's time to push the readFrom: changes i already pushed in Pharo...

Go for it. You're the best expert we have in this area, and I trust your
  judgment.

Cheers,
   - Andreas


Re: The Trunk: Multilingual-ar.87.mcz

johnmci
Also see
http://code.google.com/p/pharo/issues/detail?id=1258

'.1' asNumber raises an error in Pharo 1.1 11043
and

'1.0e+2' asNumber
returns 1.0

yet others, whom I think more knowledgeable, say
http://www.wolframalpha.com/input/?i=1.0e%2B2
100

I'd just say any expectations of converting a string to a number in most flavours of Squeak should be viewed with suspicion.


--
===========================================================================
John M. McIntosh <[hidden email]>   Twitter:  squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================






Re: The Trunk: Multilingual-ar.87.mcz

Nicolas Cellier
2010/2/12 John M McIntosh <[hidden email]>:
> Also see
> http://code.google.com/p/pharo/issues/detail?id=1258
>
> '.1' asNumber  toss an error in  Pharo 1.1 11043
> and
>
> '1.0e+2'  asNumber
> returns 1.0
>

If it's a matter of traditional Smalltalk syntax, we could check
the results with these Smalltalks:
- gst
- VW
- Dolphin
- vast
- stx
- ...

If it's a matter of extending the syntax, we could also try current
squeak version for:
'+1' asNumber.
'1,964' asNumber.

Where to stop?
We have to write some EBNF rules and/or simply provide a few TestCases
documenting the expectations.
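
Before writing expectations down, one could first record what a given image actually answers for the inputs mentioned in this thread. The following workspace probe is only a sketch; it makes no claim about what the results should be, and the variable names are illustrative.

```smalltalk
"Workspace probe: record what this image answers for the inputs discussed in this thread."
#('1.0e2' '1.0e+2' '.1' '+1' '1,964') do: [:source |
	| outcome |
	"Catch parse errors so that one failing input does not stop the survey."
	outcome := [source asNumber printString]
		on: Error
		do: [:ex | 'error: ', ex description].
	Transcript show: source printString; show: ' -> '; show: outcome; cr].
Transcript endEntry.
```

Running the same probe in the other dialects listed above (with their own transcript and error-handling idioms) would give a table of current behaviour to base the EBNF rules or tests on.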

In Pharo, I rejected undocumented non-Smalltalk syntax.
The reason is that (Number>>readFrom:) is used by Parser, and 1.e2  or
.1e2 are ambiguous in Smalltalk syntax.

Stupid example:
| e2 y |
e2:=1.
y:=1.e2 storeOn: Transcript.Transcript endEntry.
^y

According to ST-80 and the documented Squeak EBNF syntax, it should print
1, not 100.
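
The two possible readings of that statement can be written out explicitly. This is a sketch with illustrative block names; the comments about what is printed are traced by hand and assume the usual storeOn: output for a SmallInteger and a Float.

```smalltalk
| readingA readingB |
"Reading A (ST-80 syntax): the period after 1 ends the statement,
 so e2 is the temporary variable and receives storeOn:."
readingA := [| e2 y |
	e2 := 1.
	y := 1.
	e2 storeOn: Transcript.	"prints 1"
	y].
"Reading B (extended syntax): 1.e2 is taken as one Float literal,
 written here unambiguously as 1.0e2; the variable e2 plays no part."
readingB := [| y |
	y := 1.0e2 storeOn: Transcript.	"prints the Float 100.0"
	y].
Transcript show: 'A: '. readingA value. Transcript cr.
Transcript show: 'B: '. readingB value. Transcript cr; endEntry.
```

Since the same source text yields different programs under the two readings, accepting 1.e2 as a number in readFrom: also changes what the Parser compiles.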

This does not need to be set in stone: we can evolve the Smalltalk
syntax, or we can easily dissociate String>>asNumber from Parser usage
with two subclasses of NumberParser...
Some applications would also prefer a localized decimal separator, for example...
But this has to be decided on a rational basis, not just by adding
incomplete, undocumented hacks to readFrom: for specific application
usage.

Nicolas



Re: The Trunk: Multilingual-ar.87.mcz

K. K. Subramaniam
In reply to this post by johnmci
On Friday 12 February 2010 02:40:36 pm John M McIntosh wrote:
> '1.0e+2'  asNumber
> returns 1.0
Squeak (Trunk) shows:
'1.0e2' asNumber 100
'1.0e+2' asNumber 1.0

:-/  .. Subbu