Hi,
just something to think about: one thing I always liked about Smalltalk is that it allows for nice DSLs. We have nice things like a unit framework in Pharo, ...

In the simplest case one can easily implement one's own units just by providing unary messages:

    1 m
    1 second
    1 px
    1 EUR

One can easily implement one's own Money class with a currency and then do polymorphic tricks like

    10 EUR + 20 EUR

But we currently cannot implement special unary selectors (including special unary selectors with Unicode) like:

    100 %
    20 $
    40 €
    12 ‰ (for per mille)

Especially things like 20 % would be nice for layout issues or others (Bloc comes to mind).

Maybe we should put that on the roadmap of Pharo, because IMHO it would be cool to support such things in the future. I don't know how much effort it currently means on the technical level, but maybe others can comment.

Thx
T.
|
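For readers who have not built such a DSL: the unit messages mentioned above are ordinary extension methods on Number. Here is a minimal Playground sketch, assuming a Pharo 6-era image. The Money class, its selectors (setAmount:currency:, amount, currency) and the package name are made up for illustration and do not come from any existing unit framework. Note that it compiles a method into Number, so try it in a throwaway image.

```smalltalk
"Playground sketch. Money, its selectors and the package name are hypothetical."
| moneyClass total |
moneyClass := Object subclass: #Money
    instanceVariableNames: 'amount currency'
    classVariableNames: ''
    package: 'Sketch-Money'.

moneyClass compile: 'amount ^ amount'.
moneyClass compile: 'currency ^ currency'.
moneyClass compile: 'setAmount: aNumber currency: aSymbol
    amount := aNumber.
    currency := aSymbol'.
moneyClass compile: '+ aMoney
    "Refuse to add across currencies instead of guessing a rate."
    currency = aMoney currency
        ifFalse: [ ^ self error: ''Currency mismatch'' ].
    ^ self class new setAmount: amount + aMoney amount currency: currency'.

"The unary unit message is just an extension method on Number."
Number compile: 'EUR ^ Money new setAmount: self currency: #EUR'.

"Unary sends bind tighter than binary ones, so this reads (10 EUR) + (20 EUR)."
total := 10 EUR + 20 EUR.
total amount   "30"
```

Raising an error on a currency mismatch is only one possible design choice; converting at some rate would be another.
|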
From one perspective, full names are clearer, at the cost of some typing:

    33 percent
    45 dollar
    100 kilometerPerHour
    4 newtonMeter
    200 bitsPerSecond

And just like there is a limited namespace for class names (and prefixes to separate them), the namespace of selectors is global and limited too. #percent and #second(s) already exist, for example. So there might be conflicts.

> On 17 Nov 2017, at 10:32, Torsten Bergmann <[hidden email]> wrote:
|
In reply to this post by Torsten Bergmann
Well, you would change the syntax of Pharo, which I don't think will be met with a lot of enthusiasm.

How would you distinguish it from something that should trigger a compilation error instead? ($ for characters)

Also, is 1 ml one milliliter or one mile?

Also, considering how the #, selector was shut down for Bloc vectors, this seems rather bold. :)

But I would imagine that people do and will do named unary selectors within their projects (which is something quite different from having all of it in Pharo by default).

Peter

On Fri, Nov 17, 2017 at 10:32 AM, Torsten Bergmann <[hidden email]> wrote:
|
> Well you would change the syntax of Pharo
I'm not sure this is a syntax change like others. I haven't thought about it too deeply yet.

Where is the difference between a unary method #m and #% from the caller's point of view? Either there is an argument object, in which case % is binary, or there is none, in which case it is unary. Maybe it is hard to parse or introduces new problems.

> But I would imagine that people do and will do named unary selectors within their projects (which is something quite
> different to having all of it in Pharo by default)

I don't want to have them by default in the base image, but people might be interested in implementing % (for percent), € (Euro) or ‰ (per mille) as their own unary messages in a custom external package. Or in having binary messages with Unicode.

Just an idea to discuss what it would mean to have such beasts, or whether it would be useful ;)

Bye
T.
|
In reply to this post by Torsten Bergmann
I would really like to see % removed as a binary selector and made available for use in unary or keyword ones. The only implementor in a Pharo 6 image is:

    % aNumber

        ^ self \\ aNumber

So it's just aliasing \\, and the most widespread usage of % in the real world is for percentages (the use in modular arithmetic is more a programming thing than math notation, I think).

On Fri, Nov 17, 2017 at 6:32 AM, Torsten Bergmann <[hidden email]> wrote:
|
2017-11-17 17:40 GMT+01:00 Gabriel Cotelli <[hidden email]>:
+1, such an alias has no business being in Kernel.

This could also be the case for punctuation like ! and ?

Torsten's idea is more generic: any combination of binary characters could be used.

From what I understand from https://en.wikipedia.org/wiki/LR_parser we would just have to scan one more token ahead to decide between unary and binary, and could preserve our simple shift/reduce steps... So it seems doable for resolving the send.

More problematic would be the declaration of a method: if we have both a unary + and a binary +, we will need new syntax for marking the difference.

Whether it's worth it or not is another matter...
|
On 17/11/2017 at 10:14, Nicolas Cellier wrote:
> 2017-11-17 17:40 GMT+01:00 Gabriel Cotelli <[hidden email]>:
>
>     I would really like to see % removed as a binary selector and
>     available to use in unary or keyword ones. The only implementor in a
>     Pharo 6 image is:
>
>     % aNumber
>
>         ^ self \\ aNumber
>
> +1, such alias has nothing to do in Kernel
>
>     So it's just aliasing \\ , and % most widespread usage in the real
>     world is for percentages (the use in modular arithmetic is more a
>     programming thing than math notation I think).
>
>     And for allowing more Unicode code points for selector names I'm
>     totally in for Symbols, Arrows, Math Symbols, etc... We just need to
>     analyse the ones that make sense as binary selectors, and the ones
>     allowed for unary and keyword ones. This will allow us to write
>     pretty cool DSLs.
>
>     Just my opinion.
>
> This could also be the case for punctuation like ! and ?
>
> The idea of Torsten is more generic, any combination of binary char
> could be used.
> From what I understand from https://en.wikipedia.org/wiki/LR_parser we
> would just have to scan one more token ahead for deciding if unary or
> binary, and could preserve our simple shift reduce steps...

The Smalltalk parsers being handwritten, there wouldn't be shift/reduces to contend with, and, anyway, the lexer doesn't shift/reduce; it would simply create a token up to the next separator (that is, gobble up to the next space/cr/end-of-line dot/closing parenthesis/etc.).

> So it seems doable for resolving the send.

Sort of. The parser difficulty would be this one:

    anObject % print

Is this a binary selector with a print argument, or two unary selectors?

Using the symbol table when you parse would solve it, but that is certainly not context free...

> More problematic would be the declaration of method, if we have both a
> unary + and a binary +, we will need new syntax for marking the difference.

In most cases, distinguishing between a unary + declaration and a binary + declaration would be doable:

    + whatever

is the start of a binary selector, while

    + ^ self

is the start (or the declaration) of a unary selector.

But writing

    + self

can be interpreted as either a unary plus doing self, or a binary + method definition...

> Whether it's worth or not is another matter...

Well, one should probably try to implement the various parsers for that (extend RB, extend the SmaCC Smalltalk parser, extend the Petit Parser) to see how much complexity it would bring.

Coming up with strange interpretations one could do with that syntax can be helpful as well.

Regards,

Thierry

> On Fri, Nov 17, 2017 at 6:32 AM, Torsten Bergmann <[hidden email]> wrote:
|
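To see what today's parser does with the `anObject % print` example, one can ask the Refactoring Browser parser directly. This is a small Playground sketch, assuming a stock Pharo image where RBParser is available. With the current grammar there is only the binary reading, so `print` comes back as a variable node rather than a unary send:

```smalltalk
"Playground sketch; assumes RBParser from the Refactoring Browser is loaded."
| node |
node := RBParser parseExpression: 'anObject % print'.
node isMessage.                "true"
node selector.                 "#%"
node receiver name.            "the variable named anObject"
node arguments first name.     "the variable named print"
```

A parser that also allowed a unary % would have to choose between this tree and `(anObject %) print` without knowing whether print is a variable or a selector.
|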
Parsing difficulty also means that it is harder for humans to understand, to explain to (new) users.
It would be pretty strange to have binary selectors that are unary, is my first reaction.

> On 17 Nov 2017, at 18:32, Thierry Goubier <[hidden email]> wrote:
|
In reply to this post by Thierry Goubier
2017-11-17 18:32 GMT+01:00 Thierry Goubier <[hidden email]>:
On 17/11/2017 at 10:14, Nicolas Cellier wrote:

I don't have an academic background, so I may be imprecise, but so far the manually written parsers just have to read a single token ahead and build the parse node linearly; to me that was equivalent.

Yes, that's a severe limitation. Context free => it's a binary... Or we have to use ( ). But then it's unfriendly to have different rules for unary symbols versus unary words... It devalues the idea...

Yes, I was thinking of similar issues...

    + b | c |

could be a binary plus with an unused temp and an implicit ^ self, or a unary + with binary | sent to b with a unary (c |) parameter, etc...

So we need a new syntax meaning that there is no parameter, like

    + ][

or anything yet unused...
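The "context free => it's a binary" rule can be written down as a one-token lookahead decision. The following is only a sketch of that idea, not code from any Pharo parser. The block names are invented, tokens are plain strings (nil meaning end of input), and the test for "can start an operand" is deliberately simplified: it ignores string literals and negative numbers, for instance.

```smalltalk
"Playground sketch of a one-token lookahead rule; all names are hypothetical."
| canStartOperand classify |
canStartOperand := [ :token |
    token notNil and: [
        token first isLetter
            or: [ token first isDigit
                or: [ '($#[{' includes: token first ] ] ] ].

"If something that looks like an operand follows, read the symbol as binary."
classify := [ :nextToken |
    (canStartOperand value: nextToken)
        ifTrue: [ #binary ]
        ifFalse: [ #unary ] ].

classify value: 'print'.   "#binary, so anObject % print stays a binary send"
classify value: '7'.       "#binary"
classify value: '.'.       "#unary"
classify value: nil.       "#unary, end of input"
```

Under this rule `anObject % print` can never mean two unary sends unless parentheses are used, which is exactly the limitation discussed above.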
|
In reply to this post by Sven Van Caekenberghe-2
On 17/11/2017 at 11:08, Sven Van Caekenberghe wrote:
> Parsing difficulty also means that it is harder for humans to understand, to explain to (new) users.

Agreed. This is a significant issue in some programming languages.

> It would be pretty strange to have binary selectors that are unary, is my first reaction.

Or that it could be both... A unary +, then a binary +, maybe not in the same class...

Thierry
|
In reply to this post by Nicolas Cellier
If a valid Smalltalk method identifier contains only...

    [a-zA-Z][a-zA-Z0-9]*

then one option could be that unary symbols must touch the previous identifier, i.e. no intervening whitespace,

    100%
    20$
    40€
    12‰
    portion%
    someMoney$

or pick a new "unary separator/binder"

    100'%        or   100 '%
    20'$         or   20 '$
    40'€         or   40 '€
    12'‰         or   12 '‰
    portion'%
    someMoney'$

or re-use the existing colon, which indicates an argument to the right, to indicate an argument to the left....

    100 :%
    20 :$
    40 :€
    12 :‰ (for per mille)
    portion :%
    someMoney :$

which is unambiguous regarding block variable definitions, since no messages are valid in the block variable definition area, but this may complicate precedence semantics due to its similarity to a keyword selector, and would complicate things if the space was missing.

On 18 November 2017 at 04:02, Nicolas Cellier <[hidden email]> wrote:

Less ambiguous...

    anObject'% print
    anObject '% print

a "unary binder" could distinguish them...

binary method definition...

    + self

unary method definition...

    '+ self

    + b | c |       binary plus, unused temporary c
    '+ b | c |      unary plus, first | is binary, second | is unary

or maybe it would need to be...

    '+ b | c'|

without '

    + || b | c |    empty local variable definition, next | is binary, second | is unary

also with local variable...

    + | lv1 | b | c |
    + | lv1 | b | c'|

btw, did the recently announced QA Release Tests add enforcement that all selectors start with a lowercase letter? I felt that one was a bit overly restrictive, and it would break such a currency DSL.

cheers -ben
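The first option above (a unary symbol must touch what precedes it) boils down to a whitespace check in the scanner. Here is a rough Playground sketch of just that check. The block name is invented, and a real scanner would work on tokens rather than raw string indexes:

```smalltalk
"Playground sketch of the touching rule; touchesPrevious is a hypothetical name."
| touchesPrevious |
touchesPrevious := [ :source :index |
    "True when the character at index directly follows a non-separator."
    index > 1 and: [ (source at: index - 1) isSeparator not ] ].

touchesPrevious value: '100%' value: 4.       "true, % would be read as unary"
touchesPrevious value: '100 % 7' value: 5.    "false, % stays binary"
touchesPrevious value: '3+4' value: 2.        "true"
```

The last case suggests one thing to watch for: existing code that writes binary sends without spaces, such as 3+4, would also satisfy a pure touching rule, so the rule would presumably need to look at what follows the symbol as well.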
|
In reply to this post by Nicolas Cellier
On 17 Nov 2017 at 1:03 PM, "Nicolas Cellier" <[hidden email]> wrote:
What I meant is that at the tokenizer stage, the parser already combines characters into a single token. So yes, the parser uses only a single token of lookahead, but a token can be multiple characters.
I agree.
I haven't seen this one coming.
It starts to look very convoluted. Yuck. Thierry
|
In reply to this post by Ben Coman
Personally, I am against adding those special symbols. They add close to nothing (except complexity in the parser) to what we can actually do!

Besides, what does 30$ + 17$ add up to? Oh! Did I tell you it was actually $30USD + $17CAN ? :)

-----------------
Benoît St-Jean
Yahoo! Messenger: bstjean
Twitter: @BenLeChialeux
Pinterest: benoitstjean
Instagram: Chef_Benito
IRC: lamneth
Blogue: endormitoire.wordpress.com
"A standpoint is an intellectual horizon of radius zero". (A. Einstein)
On Friday, November 17, 2017, 6:41:11 PM EST, Ben Coman <[hidden email]> wrote:
|
I just randomly ran into this project, https://github.com/ba-st/aconcagua, which seems to be designed around such unary selectors.

On Sat, Nov 18, 2017 at 12:55 AM, Benoit St-Jean via Pharo-dev <[hidden email]> wrote:
|