Hi,

where can I find a good reference about which characters are allowed as binary selectors (from the old syntax definition) and what is nowadays allowed by the implementations?

And does the current set of allowed binary selector characters include some additions on purpose, or is this just a bug in the parser?

From what I found out (Blue Book and some other Smalltalk syntax definitions), the current set of allowed characters includes the "special characters":

    $! $% $& $* $+ $, $- $/ $< $= $> $? $@ $\ $| $~

(some implementations do not allow $@, and some do not call $- a special character but allow it as a binary selector character)

And this is what String>>#numArgs uses. Therefore

    '-' numArgs "->1".
    '!' numArgs "->1".

And for example:

    '§' numArgs "-> -1 (the -1 is indicating "not even a valid selector")"

But I am interested in the characters that are not called "special characters" and are not even in the range 0-126.

The scanner allows many more characters to be used as a selector name (from the scanner's typeTable):

    {Character value: 1 . Character value: 2 . Character value: 3 . Character value: 4 . Character value: 5 . Character value: 6 . Character value: 7 . Character backspace . Character value: 11 . Character value: 14 . Character value: 15 . Character value: 16 . Character value: 17 . Character value: 18 . Character value: 19 . Character value: 20 . Character value: 21 . Character value: 22 . Character value: 23 . Character value: 24 . Character value: 25 . Character value: 26 . Character escape . Character value: 28 . Character value: 29 . Character value: 30 . Character value: 31 . $! . $% . $& . $* . $+ . $, . $- . $/ . $< . $= . $> . $? . $@ . $\ . $` . $~ . Character delete . $€ . Character value: 129 . $‚ . $ƒ . $„ . $… . $† . $‡ . $ˆ . $‰ . $Š . $‹ . $Œ . Character value: 141 . $Ž . Character value: 143 . Character value: 144 . $‘ . $’ . $“ . $” . $• . $– . $— . $˜ . $™ . $š . $› . $œ . Character value: 157 . $ž . $Ÿ . Character value: 160 . $¡ . $¢ . $£ . $¤ . $¥ . $¦ . $§ . $¨ . $© . $« . $¬ . Character value: 173 . $® . $¯ . $° . $± . $² . $³ . $´ . $¶ . $· . $¸ . $¹ . $» . $¼ . $½ . $¾ . $¿ . $× . $÷}

This means you can define a method with, for example, the name "÷".

So, the question I want to ask: what do we want to allow as a binary selector (character)? Everything that is nowadays "parseable" as a binary selector, or only the set of "special characters", or something in between? And where should we put this information, the "this is an allowed binary selector character" information?

Thanks
Nicolai
See RBParserTest>>#testBinarySelectors
It's based on the draft ANSI Smalltalk-80 standard. You integrated it. It tests the RBParser's parsing of binary method definitions and message sends for all binary selectors from 1 char up to 3 chars. (The Blue Book is more restrictive than ANSI, limiting them to 2 chars max IIRC.) I wrote the test because of issues I had with the OldCompiler's handling of selectors containing "|" and issues on other platforms like GemStone, so that the behavior I need and think is correct won't get broken without warning.
In reply to this post by Nicolai Hess-3-2
This is a really nice and important question.
I would really like to have a clear answer, because it will make the system more stable. If you can build an analysis and let us know, that would be really great.

Something related, but not on the same topic: I would love to have a syntax for nested comments. It is really annoying to have to uncomment parts when we want to comment out a large section. We discussed this back in 2007-2009 but we never did it.

Stef

On 28/8/16 at 12:17, Nicolai Hess wrote:
> Hi,
>
> where can I find a good reference about which characters are allowed as
> binary selectors (from the old syntax definition) and what is nowadays
> allowed by the implementations?
>
> And does the current set of allowed binary selector characters include
> some additions on purpose, or is this just a bug in the parser?
>
> From what I found out (Blue Book and some other Smalltalk syntax
> definitions), the current set of allowed characters includes the
> "special characters":
> $! $% $& $* $+ $, $- $/ $< $= $> $? $@ $\ $| $~
> (some implementations do not allow $@, and some do not call $- a special
> character but allow it as a binary selector character)
>
> And this is what String>>#numArgs uses. Therefore
>
> '-' numArgs "->1".
> '!' numArgs "->1".
> And for example:
> '§' numArgs "-> -1 (the -1 is indicating "not even a valid selector")"
>
> But I am interested in the characters that are not called "special
> characters" and are not even in the range 0-126.
>
> The scanner allows many more characters to be used as a selector name
> (from the scanner's typeTable):
>
> {Character value: 1 . Character value: 2 . Character value: 3 .
> Character value: 4 . Character value: 5 . Character value: 6 .
> Character value: 7 . Character backspace . Character value: 11 .
> Character value: 14 . Character value: 15 . Character value: 16 .
> Character value: 17 . Character value: 18 . Character value: 19 .
> Character value: 20 . Character value: 21 . Character value: 22 .
> Character value: 23 . Character value: 24 . Character value: 25 .
> Character value: 26 . Character escape . Character value: 28 .
> Character value: 29 . Character value: 30 . Character value: 31 . $! .
> $% . $& . $* . $+ . $, . $- . $/ . $< . $= . $> . $? . $@ . $\ . $` .
> $~ . Character delete . $€ . Character value: 129 . $‚ . $ƒ . $„ . $… .
> $† . $‡ . $ˆ . $‰ . $Š . $‹ . $Œ . Character value: 141 . $Ž .
> Character value: 143 . Character value: 144 . $‘ . $’ . $“ . $” . $• .
> $– . $— . $˜ . $™ . $š . $› . $œ . Character value: 157 . $ž . $Ÿ .
> Character value: 160 . $¡ . $¢ . $£ . $¤ . $¥ . $¦ . $§ . $¨ . $© .
> $« . $¬ . Character value: 173 . $® . $¯ . $° . $± . $² . $³ . $´ .
> $¶ . $· . $¸ . $¹ . $» . $¼ . $½ . $¾ . $¿ . $× . $÷}
>
> This means you can define a method with, for example, the name "÷".
>
> So, the question I want to ask: what do we want to allow as a binary
> selector (character)? Everything that is nowadays "parseable" as a
> binary selector, or only the set of "special characters", or something
> in between? And where should we put this information, the "this is an
> allowed binary selector character" information?
>
> Thanks
> Nicolai
Hi Stef,
2016-08-29 8:35 GMT+02:00 stepharo <[hidden email]>:
> This is a really nice and important question.

If that is your need (uncommenting while commenting, and the reverse), then the answer is not a syntax change but a better comment/uncomment command in the editor. Now that you mention it, I have the same issue. I'll try in AltBrowser to see whether I can get the behavior you'd like.

Thierry
In reply to this post by monty-3
2016-08-28 13:41 GMT+02:00 monty <[hidden email]>:
> See RBParserTest>>#

Hi Monty,

yes, but I am just wondering why the scanner interprets some characters as binary selector tokens even though they are not allowed as binary selectors. In the old scanner, the initialization of the type table just sets "binary token" as the default for all characters and then changes some of them explicitly, for example:

    ($0 asciiValue to: $9 asciiValue) -> digit tokens
    #(9 10 12 13 32) -> delimiter tokens
    ...

RBScanner, on the other hand, explicitly marks some non-ASCII characters as "binary tokens":

    classificationTable at: 177 put: #binary. "plus-or-minus"
    classificationTable at: 183 put: #binary. "centered dot"
    classificationTable at: 215 put: #binary. "times"
    classificationTable at: 247 put: #binary. "divide"

It looks as if someone (or somewhere) intended these characters, "#($± $· $× $÷)", to be usable as binary selectors, although later, at the parsing step, using these tokens as binary message selectors isn't allowed.

I think I will exclude these characters as binary selector tokens.
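The two table-initialization strategies described in the message above can be modelled roughly as follows. This is an illustrative Python sketch, not the Pharo source: the function names `old_scanner_table` and `rb_scanner_table` are hypothetical, only the entries quoted in the message are filled in (the overrides hidden behind the "..." are left out), and treating the RBScanner table as opt-in for everything else is an assumption made for the illustration.

```python
# Hypothetical model of the two type-table strategies; not the Pharo source.

def old_scanner_table(size=256):
    """Default-to-binary style: every code point starts as 'binary',
    then the classes quoted in the message are overridden explicitly.
    Other overrides (letters, etc.) are elided in the message and here."""
    table = ["binary"] * size
    for code in range(ord("0"), ord("9") + 1):
        table[code] = "digit"
    for code in (9, 10, 12, 13, 32):
        table[code] = "delimiter"
    return table


def rb_scanner_table(size=256):
    """Opt-in style (assumption): a code point is 'binary' only if it is
    listed. Only the four non-ASCII entries from the message are modelled."""
    table = [None] * size
    for code in (177, 183, 215, 247):  # plus-or-minus, centered dot, times, divide
        table[code] = "binary"
    return table


old = old_scanner_table()
rb = rb_scanner_table()
print(old[ord("5")], old[32], old[ord("§")])  # digit delimiter binary
print(rb[ord("÷")], rb[ord("§")])             # binary None
```

The sketch shows why a default-to-binary table ends up classifying characters such as "§" as binary tokens without anyone having decided that they should be, whereas an opt-in table only contains deliberate entries.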
In reply to this post by Thierry Goubier
2016-08-29 10:58 GMT+02:00 Thierry Goubier <[hidden email]>:
That works, but:

- Single-click select doesn't work very well (it stops at the next double quote instead of the closing double quote).
- The formatter likes to split double quotes (adding returns).
- Backporting that to a standard editor is a mess, because the #enclose: method needs to change.

Syntax-wise, one could consider "" to be inside a comment (i.e. do not split into two comments if encountered inside a comment, as is done now).

Thierry
Thierry
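If you have a better editor control, even better :)

> Syntax wise, one could consider "" to be inside a comment (i.e. do not
> split into two comments if encountered inside a comment, as it is done now).

This one could be nice too :)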
On Mon, Aug 29, 2016 at 11:42 AM, stepharo <[hidden email]> wrote:
This would also be consistent with how single quotes are handled in a string, which would make it doublenice. :)

Peter
In reply to this post by stepharo
2016-08-29 11:42 GMT+02:00 stepharo <[hidden email]>:
What I did is yank that sort of behavior into independent commands that are keymap-bound to an editor. A lot easier to customize " independently of ', (, [.
And simple: probably just a line or two inside RBScanner.

Thierry
In reply to this post by stepharo
Hi Stef,
2016-08-29 11:42 GMT+02:00 stepharo <[hidden email]>:
https://pharo.fogbugz.com/f/cases/19011/Integrate-two-double-quotes-inside-comments

I'll have the slice ready soon. Any comments on what that would mean regarding the commonly accepted Smalltalk syntax if that feature is integrated?

Thierry
In reply to this post by Peter Uhnak
2016-08-29 13:24 GMT+02:00 Peter Uhnák <[hidden email]>:
Implementing it means looking at how strings are scanned :) Thierry
In reply to this post by Thierry Goubier
On 29/8/16 at 17:45, Thierry Goubier wrote:
It will break compatibility for people using it now, so we should raise the topic and give people a chance to discuss it. We could check before publishing whether code contains nested comments. I think that I would use them only when developing.

Stef
On 29/08/2016 at 21:28, stepharo wrote:
> On 29/8/16 at 17:45, Thierry Goubier wrote:
>> Hi Stef,
>>
>> 2016-08-29 11:42 GMT+02:00 stepharo <[hidden email]>:
>>
>>     Thierry
>>
>>     If you have a better editor control even better :)
>>>
>>> Syntax wise, one could consider "" to be inside a comment (i.e.
>>> do not split into two comments if encountered inside a comment,
>>> as it is done now).
>>     This one could be nice too :)
>>
>> https://pharo.fogbugz.com/f/cases/19011/Integrate-two-double-quotes-inside-comments
>>
>> I'll have the slice ready soon. Any comments on what that would mean
>> regarding the Smalltalk commonly accepted syntax if that feature is
>> integrated?
> It will break compatibility for people using it now, so we should raise
> the topic and give people a chance to discuss it. We could check
> before publishing whether code contains nested comments.

Hum. The slice should parse anything that is legal Smalltalk; it may just show fewer comment intervals (because it will in fact coalesce adjacent comments).

For example, the standard parse will say that:

    '"this ""test"' is a token with two comments, intervals 1 to: 7 and 8 to: 13.

The slice makes that a single comment:

    '"this ""test"' is a token with one comment, interval 1 to: 13.

Now, this probably has no impact on parsing Smalltalk code. But it changes the language definition a bit, which is why I'd like comments on it.

> I think that I would use them only when developing.

Up to you :) The most interesting part is having the correct comment/uncomment behavior in an editor... that one works independently and is quite cool.

Thierry
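The difference between the standard parse and the slice, as described in the message above, can be illustrated with a small sketch. It is written in Python rather than Smalltalk so that it runs standalone, and it is not the RBScanner code: the function name `comment_intervals` and the `coalesce` flag are hypothetical, and the sketch assumes the input contains no string literals (a real scanner would have to skip double quotes inside strings).

```python
def comment_intervals(source, coalesce=False):
    """Return 1-based, inclusive (start, stop) intervals of "..." comments.

    With coalesce=True, a comment that starts immediately after the previous
    one ends is merged into it, which is the effect described for the slice.
    Assumption: the source contains no string literals.
    """
    intervals = []
    i = 0
    while i < len(source):
        if source[i] != '"':
            i += 1
            continue
        end = source.find('"', i + 1)  # position of the closing double quote
        if end == -1:
            raise ValueError("unterminated comment")
        start_1, end_1 = i + 1, end + 1  # convert to 1-based positions
        if coalesce and intervals and intervals[-1][1] == start_1 - 1:
            # Directly adjacent to the previous comment: extend it.
            intervals[-1] = (intervals[-1][0], end_1)
        else:
            intervals.append((start_1, end_1))
        i = end + 1
    return intervals


print(comment_intervals('"this ""test"'))                 # [(1, 7), (8, 13)]
print(comment_intervals('"this ""test"', coalesce=True))  # [(1, 13)]
```

Both modes accept exactly the same input; only the number of reported comment intervals differs, which matches the point that the change affects how comments are grouped rather than what is legal.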
2016-08-29 21:38 GMT+02:00 Thierry Goubier <[hidden email]>:
> On 29/08/2016 at 21:28, stepharo wrote:

Yes, I think the change to RBScanner is fine. It does not change what kind of comments are accepted, only how they are assigned to the AST nodes (one vs. multiple comments).

(BTW, do we have a function that would do the coalescing of intervals: (1 to: 99) (100 to: 199) -> (1 to: 199)?)
On Tue, Aug 30, 2016 at 8:14 AM, Nicolai Hess <[hidden email]> wrote:
Specialize #, in Interval to check. Right now #, will answer an Array, but it would be easy to special-case.
_,,,^..^,,,_
best, Eliot
In reply to this post by Nicolai Hess-3-2
On Tue, Aug 30, 2016 at 8:14 AM, Nicolai Hess <[hidden email]> wrote:
Find attached something that works in Squeak 5
_,,,^..^,,,_
best, Eliot

Attachment: Interval-methods.st (788 bytes)
Oops. No need to add a step method; the increment method already exists:

On Wed, Aug 31, 2016 at 9:12 AM, Eliot Miranda <[hidden email]> wrote:
_,,,^..^,,,_
best, Eliot

Attachment: Interval-methods.st (618 bytes)
2016-08-31 10:14 GMT+02:00 Eliot Miranda <[hidden email]>:
Nice, but actually I wasn't clear about the requirements :-)

The purpose was to merge source code intervals after parsing code comments. The comments may be adjacent and could be merged into one comment. For this I would like to merge a collection of intervals into a smaller number of intervals, with adjacent intervals merged into one:

    { (30 to: 35) . (36 to: 40) . (50 to: 100) } -> { (30 to: 40) . (50 to: 100) }

But Thierry already changed the scanner to produce this smaller set of intervals/comments :-)
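The merge described in the message above is small enough to sketch. The following is an illustrative Python version, not code from Pharo or from the attachment: `merge_adjacent` is a hypothetical name, intervals are modelled as inclusive (first, last) integer pairs, and overlapping intervals are merged as well as adjacent ones, which goes slightly beyond what the message asks for.

```python
def merge_adjacent(intervals):
    """Merge inclusive (first, last) integer intervals that touch or overlap.

    Two intervals are adjacent when the second starts exactly one position
    after the first ends, e.g. (30, 35) and (36, 40).
    """
    merged = []
    for first, last in sorted(intervals):
        if merged and first <= merged[-1][1] + 1:
            # Touches or overlaps the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged


print(merge_adjacent([(30, 35), (36, 40), (50, 100)]))  # [(30, 40), (50, 100)]
```

Sorting first means the input does not have to arrive in source order; for comment intervals produced by a scanner it normally already is, so the sort is only a safeguard.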
In reply to this post by Nicolai Hess-3-2
2016-08-29 11:23 GMT+02:00 Nicolai Hess <[hidden email]>:
Now back on topic :)

Does anyone know why the Refactoring Browser's Smalltalk scanner (RBScanner) explicitly allows "#($± $· $× $÷)" to be binary selector characters? Is there any Smalltalk dialect that uses these characters?

I think I'll remove the support for this in Pharo. It isn't really supported anyway: although the scanner scans these characters as binary selector tokens, the parser ultimately does not allow them as binary selector symbols.
On 08/31/2016 08:46 AM, Nicolai Hess wrote:
> Does anyone know why the Refactoring Browser's Smalltalk scanner (RBScanner)
> explicitly allows "#($± $· $× $÷)" to be binary selector characters?

I do -- I added them about 20 years ago :)...

> Is there any Smalltalk dialect that uses these characters?

VW allows them. When possible, we made the scanner/parser be a superset of the VW & VA syntax.

John Brant