As stated in NewCompiler's code, the ANSI syntax:

ClosureCompiler evaluate: '- 1'

is WEIRD!
It answers -1 as a literal negative number, the space not being significant. BEWARE: a tab or cr is significant in the current implementation (ANSI?).

This is more confusing than useful.
It makes people think of a prefix operator, as in other languages.
Also, as already said, inside the literal array #(- 1) the space is significant.

And what about the sign of the exponent?
ClosureCompiler evaluate: '-1.0e- 1'. Message not understood: e
((-1.0) e) - (1), so the space is significant here.

Besides, as NewCompiler accepts minus as the last character of a multi-character binary selector, this causes further ambiguity.

ClosureCompiler evaluate: '0--1'. is 1 (0-(-1)); the last minus is attached to the digit because there is no space.
ClosureCompiler evaluate: '0-- 1'. Message not understood: --

2 contradictory rules:
- either space is significant, thus the selector is #--
- or space is not significant (like '- 1')
The first rule wins, apparently.

Weak, weak ANSI. What was in their mind?

Nicolas
Yes I agree we should not allow the space between the minus and the one.
By the way you should post this to the NewCompiler mailing list.
http://lists.squeakfoundation.org/mailman/listinfo/newcompiler

Mth

On May 24, 2007, at 12:12 AM, nicolas cellier wrote:
> As stated in NewCompiler's code, the ANSI syntax:
> ClosureCompiler evaluate: '- 1'
> is WEIRD!
> [...]
Nicolas,

Thanks. Please note that Marcus is not reading the squeak-dev mailing list anymore. Please cross-post to the NewCompiler mailing list.

Stef

On 24 May 07, at 00:12, nicolas cellier wrote:
> As stated in NewCompiler's code, the ANSI syntax:
> ClosureCompiler evaluate: '- 1'
> is WEIRD!
> [...]
nicolas cellier <[hidden email]> writes:
> As stated in NewCompiler's code, the ANSI syntax:

These are fun challenges, Nicolas! They will help us get a compiler that does precisely what we want from it.

In deciding the behavior we want, I would propose two principles, the first being stronger than the last:

1. Be compatible on the trivial stuff. Save Squeak-isms for places where there is a real advantage.

2. Lean towards accepting more rather than less. Especially we should try to accept things that other implementations accept.

Here are my attempts to figure out what ANSI actually wants in these cases. Be aware it is not always what the compiler is currently doing.

> ClosureCompiler evaluate: '- 1'
>
> is WEIRD!
> It answers -1 as a literal negative number, space not being
> significant. BEWARE a tab or cr are significant in current
> implementation (ANSI?)
>
> This is more confusing than useful.
> It makes people think of a prefixed operator like other languages.

The standard is clear. A unary - is allowed to dangle way ahead of the number literal that it modifies.

Section 3.4.6.1 includes the appropriate rule. Note the last sentence.

    <number literal> ::= ['-'] <number>
    <number> ::= integer | float | scaledDecimal

    If the preceding '-' is not present the value of the numeric
    object is a positive number. If the '-' is present the value of
    the numeric object is the negative number that is the negation of
    the positive number defined by the <number> clause. White space is
    allowed between the '-' and the <number>.

As best as I can tell, the standard is factored this way so that you can divide your parser in the standard way into a tokenizer and a parser. When the tokenizer sees "-1", it should divide it into two tokens, "-" and "1". Then, the parser is free to interpret this as either a literal, as in "x := -1", or a subtraction of 1, as in "y := x-1".

Since it's handled at the level of parsing, white space is allowed for consistency. You can even put comments in there if you like.

That's my rationalization, anyway. :) The rationale doesn't say, but the spec is clear that spaces are allowed there.

> Also, as already said, inside literal array #(- 1) space is
> significant.

In this case, it should probably be the same as #(-1). The standard is generally too quiet about array literals, but in this case it does define a parse, so I guess we should use the standard parse.

If you want to parse a two-element array out of the above, you can write it as: #(#- 1) .

> And what about the sign of exponent?
> ClosureCompiler evaluate: '-1.0e- 1'. Message not understood e
> ((-1.0) e) - (1), so space is significant here.

According to the standard, the only place you can insert a space inside a number literal is if the whole literal starts with a "-". This (ugly) exception only applies at the beginning, and not in the example you give.

> Besides, as NewCompiler accepts minus as last character of a
> multi-character binary selector, this causes further ambiguity.
>
> ClosureCompiler evaluate: '0--1'. is 1 (0-(-1)) last minus is attached
> to digit because there is no space.

This code is incorrect by the standard, and I'd be happy with rejecting it. ANSI uses normal old longest match. Section 3.5 says:

    Unless otherwise specified, white space or another separator must
    appear between any two tokens if the initial characters of the
    second token would be a valid extension of the first token.

Thus, 0--1 should tokenize as "0", "--", "1", which then does not parse.

Lex Spoon
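To make the longest-match reading of Section 3.5 concrete, here is a minimal sketch in Python. It is not Smalltalk, not NewCompiler code, and not the full ANSI lexical grammar: the function name `tokenize` and the choice of binary-selector characters are assumptions made for illustration. It only recognizes unsigned integers and runs of binary-selector characters, and it follows Lex's reading in which "-1" is split into "-" and "1" by the tokenizer.

```python
import re

# Hypothetical, simplified token classes (an assumption, not the ANSI grammar):
# an unsigned integer, or a run of binary-selector characters.
# The greedy quantifiers give longest match within each class.
TOKEN = re.compile(r"\s*([0-9]+|[-+*/\\<>=~@%&?,|]+)")


def tokenize(source):
    """Split source into tokens, taking the longest match at each position."""
    source = source.rstrip()
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at {pos}: {source[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


print(tokenize("0--1"))    # "--" is consumed as one binary selector
print(tokenize("0 - -1"))  # white space keeps the two minus signs apart
print(tokenize("- 1"))     # same tokens as "-1"; the parser decides the meaning
```

Under this sketch, '0--1' and '0-- 1' produce the same three tokens, so neither parses as a subtraction of a negative literal; writing '0 - -1' is what separates the selector from the sign.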
Lex Spoon wrote:
> nicolas cellier <[hidden email]> writes:
>> As stated in NewCompiler's code, the ANSI syntax:
>
> These are fun challenges, Nicolas! They will help us get a compiler
> that does precisely what we want from it.
>
> In deciding the behavior we want, I would propose two principles, the
> first being stronger than the last:
>
> 1. Be compatible on the trivial stuff. Save Squeak-isms for places
> where there is a real advantage.
>
> 2. Lean towards accepting more rather than less. Especially we
> should try to accept things that other implementations accept.

Very reasonable. But which dialect interprets '- 1' the ANSI way?
Not VW, nor gst, nor stx... (I didn't check Dolphin or VA.)

> Here are my attempts to figure out what ANSI actually wants in these
> cases. Be aware it is not always what the compiler is currently
> doing.
>
>> ClosureCompiler evaluate: '- 1'
>>
>> is WEIRD!
>> It answers -1 as a literal negative number, space not being
>> significant. BEWARE a tab or cr are significant in current
>> implementation (ANSI?)
>>
>> This is more confusing than useful.
>> It makes people think of a prefixed operator like other languages.
>
> The standard is clear. A unary - is allowed to dangle way ahead of
> the number literal that it modifies.
>
> Section 3.4.6.1 includes the appropriate rule. Note the last
> sentence.
>
>     <number literal> ::= ['-'] <number>
>     <number> ::= integer | float | scaledDecimal
>
>     If the preceding '-' is not present the value of the numeric
>     object is a positive number. If the '-' is present the value of
>     the numeric object is the negative number that is the negation of
>     the positive number defined by the <number> clause. White space is
>     allowed between the '-' and the <number>.
>
> As best as I can tell, the standard is factored this way so that you
> can divide your parser in the standard way into a tokenizer and a
> parser. When the tokenizer sees "-1", it should divide it into two
> tokens, "-" and "1". Then, the parser is free to interpret this as
> either a literal, as in "x := -1", or a subtraction of 1, as in
> "y := x-1".
>
> Since it's handled at the level of parsing, white space is allowed
> for consistency. You can even put comments in there if you like.
>
> That's my rationalization, anyway. :) The rationale doesn't say, but
> the spec is clear that spaces are allowed there.

This makes some sense, though it is not implemented that way in NewCompiler (only space characters are allowed, not white space in general).
It is more the grammar rule that is questionable. It is not based on any dialect's customs, nor on historical roots. Maybe people coming from other languages may appreciate it...

>> Also, as already said, inside literal array #(- 1) space is
>> significant.
>
> In this case, it should probably be the same as #(-1). The standard is
> generally too quiet about array literals, but in this case it does
> define a parse, so I guess we should use the standard parse.
>
> If you want to parse a two-element array out of the above, you can
> write it as: #(#- 1) .

I would not like it. I prefer to understand the rule as:
1) a space separates two tokens
2) -1 is a single token, not two, while - 1 is two tokens
3) a literal array is an array of literal tokens

My guess is that this was the rule in the mind of the Smalltalk gods. Only a guess...
Anyway, it's the actual behaviour of most Smalltalks.

>> And what about the sign of exponent?
>> ClosureCompiler evaluate: '-1.0e- 1'. Message not understood e
>> ((-1.0) e) - (1), so space is significant here.
>
> According to the standard, the only place you can insert a space
> inside a number literal is if the whole literal starts with a "-".
> This (ugly) exception only applies at the beginning, and not in the
> example you give.

Not a brilliant example anyway, forget it.

>> Besides, as NewCompiler accepts minus as last character of a
>> multi-character binary selector, this causes further ambiguity.
>>
>> ClosureCompiler evaluate: '0--1'. is 1 (0-(-1)) last minus is attached
>> to digit because there is no space.
>
> This code is incorrect by the standard, and I'd be happy with
> rejecting it. ANSI uses normal old longest match. Section 3.5 says:
>
>     Unless otherwise specified, white space or another separator must
>     appear between any two tokens if the initial characters of the
>     second token would be a valid extension of the first token.
>
> Thus, 0--1 should tokenize as "0", "--", "1", which then does not
> parse.
>
> Lex Spoon

Yes, just like #||, #-- is not standard. It is an extension.
It is non-standard precisely to avoid the ambiguity caused when it is mixed with negative literal numbers, if I understood it well.

It would be easy to have a pedantic interactive compiler that forces the user to disambiguate (at least a warning, or a menu proposing the various interpretations with an action that auto-inserts the space).

See:
http://lists.squeakfoundation.org/pipermail/squeak-dev/2006-May/thread.html#103895
http://lists.squeakfoundation.org/pipermail/squeak-dev/2006-May/104088.html
http://lists.squeakfoundation.org/pipermail/squeak-dev/2006-May/103907.html
etc... (binary selectors ambiguity and space)
http://bugs.squeak.org/view.php?id=3616
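The two readings of #(- 1) discussed in this thread can be compared side by side. The following is a rough sketch in Python, not Smalltalk and not taken from any compiler: the function names `elements_whitespace_rule` and `elements_ansi_rule` are hypothetical, only integers and symbol-like tokens are handled, and the input is the text between the parentheses of the literal array.

```python
import re


def elements_whitespace_rule(body):
    """Nicolas's reading: white space separates tokens, and '-1' is one token."""
    return [int(tok) if re.fullmatch(r"-?[0-9]+", tok) else tok
            for tok in body.split()]


def elements_ansi_rule(body):
    """Lex's reading of ANSI: a '-' token followed by a number negates it."""
    # '#...' symbols first, then unsigned integers, a lone '-', anything else.
    tokens = re.findall(r"#\S+|[0-9]+|-|[^\s0-9-]+", body)
    result = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if tok == "-" and re.fullmatch(r"[0-9]+", nxt):
            result.append(-int(nxt))  # fold the sign into the literal
            i += 2
        elif re.fullmatch(r"[0-9]+", tok):
            result.append(int(tok))
            i += 1
        else:
            result.append(tok)  # kept as a symbol-like string
            i += 1
    return result


print(elements_whitespace_rule("- 1"))  # two elements: a minus symbol and 1
print(elements_ansi_rule("- 1"))        # one element: the number -1
print(elements_ansi_rule("#- 1"))       # Lex's workaround gives two elements
```

Both functions agree on #(-1); they differ only when white space follows the minus, which is exactly the case the thread disagrees about.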