Pharo and special unary selectors


Torsten Bergmann
Hi,

just something to think about: one thing I always liked about Smalltalk is that it allows for nice DSLs. We have nice things
like a unit framework in Pharo, ...

In the simplest case one can easily implement one's own units just by providing unary messages:

 1 m
 1 second
 1 px
 1 EUR  

One can easily implement a custom Money class with a currency and then do polymorphic tricks like

  10 EUR + 20 EUR
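
A minimal sketch of what such a Money class could look like. The class name, package name and all method names below are illustrative assumptions, not taken from an existing Pharo unit framework. Methods are shown with the usual `Class >> selector` header; each one would be entered as a separate method in the browser.

```smalltalk
"Hypothetical class, for illustration only."
Object subclass: #Money
    instanceVariableNames: 'amount currency'
    classVariableNames: ''
    package: 'MoneySketch'.

Money class >> amount: aNumber currency: aSymbol
    "Answer a new instance holding the given amount and currency symbol."
    ^ self new setAmount: aNumber currency: aSymbol

Money >> setAmount: aNumber currency: aSymbol
    amount := aNumber.
    currency := aSymbol

Money >> amount
    ^ amount

Money >> currency
    ^ currency

Money >> + aMoney
    "Only add amounts that share the same currency."
    currency = aMoney currency
        ifFalse: [ ^ self error: 'Currency mismatch' ].
    ^ self class amount: amount + aMoney amount currency: currency

"Extension method on Number that provides the unary unit message."
Number >> EUR
    ^ Money amount: self currency: #EUR
```

With these definitions, 10 EUR + 20 EUR is read as (10 EUR) + (20 EUR), because unary messages bind more tightly than binary ones, and it answers a Money with amount 30 and currency #EUR.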

But we currently cannot implement special unary selectors (including special unary selectors with Unicode) like:

  100 %
  20 $
  40 €
  12 ‰  (for per mille)

Things like 20 % would be especially nice for layout and similar uses (Bloc comes to mind).

Maybe we should put that on the Pharo roadmap, because IMHO it would be cool to support such things in the
future. I don't know how much effort it would take at the technical level, but maybe others can comment.

Thx
T.


Re: Pharo and special unary selectors

Sven Van Caekenberghe-2
From one perspective, full names are clearer, at the cost of some typing:

33 percent
45 dollar
100 kilometerPerHour
4 newtonMeter
200 bitsPerSecond

And just like there is a limited namespace for class names (and prefixes to separate them), the namespace of selectors is global and limited too. #percent and #second(s) already exist, for example. So there might be conflicts.
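
One way to check for such conflicts before claiming a selector is to ask the image which classes already implement it. This is only a sketch to evaluate in a Playground; what it lists depends on the image and the packages loaded.

```smalltalk
"Print every existing implementor of #percent to the Transcript."
(SystemNavigation default allImplementorsOf: #percent)
    do: [ :each | Transcript show: each printString; cr ]
```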




Re: Pharo and special unary selectors

Peter Uhnak
In reply to this post by Torsten Bergmann
Well you would change the syntax of Pharo, which I don't think will be met with a lot of enthusiasm.

How would you distinguish it from unary messages?
How would you distinguish it from something that should trigger a compilation error instead? ($ for characters)

Also, is 1 ml one milliliter or one mile?

Also, considering how the #, selector was shut down for Bloc vectors, this seems rather bold. :)

But I would imagine that people do and will define named unary selectors within their own projects (which is quite different from having all of it in Pharo by default).

Peter




Re: Pharo and special unary selectors

Torsten Bergmann
> Well you would change the syntax of Pharo

I'm not sure this is a syntax change like others. I haven't yet thought about it too deeply.
What is the difference between a unary method #m and #% from the caller's point of view?

Either there is an argument object, in which case % is binary, or there is none, in which case it is unary. Maybe it is hard to parse
or introduces new problems.

>But I would imagine that people do and will do named unary selectors within their projects (which is something quite
>different to having all of it in Pharo by default)

I don't want to have them by default in the base image, but people might be interested in implementing
% (for percent), € (Euro) or ‰ (per mille) as their own unary messages in a custom external package. Or in having binary
messages with Unicode.

Just an idea, to discuss what it would mean to have such beasts and whether they would be useful ;)

Bye
T.



 


Re: Pharo and special unary selectors

gcotelli
In reply to this post by Torsten Bergmann
I would really like to see % removed as a binary selector and available to use in unary or keyword ones. The only implementor in a Pharo 6 image is:

 % aNumber
  
    ^ self \\ aNumber

So it's just aliasing \\ , and the most widespread use of % in the real world is for percentages (its use in modular arithmetic is more a programming thing than math notation, I think).
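
To make the aliasing concrete, here are a few expressions to try in a Playground. This assumes an image in which Number>>% is still defined as quoted above; in an image without it, the first line raises a doesNotUnderstand: error.

```smalltalk
10 % 3.       "same result as 10 \\ 3, since % just forwards to \\"
10 \\ 3.      "1"
-7 \\ 3.      "2, because \\ is the floored modulo"
-7 rem: 3.    "-1, because rem: truncates toward zero"
```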

And as for allowing more Unicode code points in selector names, I'm all in for Symbols, Arrows, Math Symbols, etc. We just need to analyse which ones make sense as binary selectors and which ones should be allowed in unary and keyword ones. This will allow us to write pretty cool DSLs.

Just my opinion.




Re: Pharo and special unary selectors

Nicolas Cellier


2017-11-17 17:40 GMT+01:00 Gabriel Cotelli <[hidden email]>:
> I would really like to see % removed as a binary selector and available to use in unary or keyword ones. The only implementor in a Pharo 6 image is:
>
>  % aNumber
>
>     ^ self \\ aNumber

+1, such an alias has no place in the Kernel.

> So it's just aliasing \\ , and the most widespread use of % in the real world is for percentages (its use in modular arithmetic is more a programming thing than math notation, I think).
>
> And as for allowing more Unicode code points in selector names, I'm all in for Symbols, Arrows, Math Symbols, etc. We just need to analyse which ones make sense as binary selectors and which ones should be allowed in unary and keyword ones. This will allow us to write pretty cool DSLs.
>
> Just my opinion.

This could also be the case for punctuation like ! and ?

Torsten's idea is more generic: any combination of binary characters could be used.
From what I understand from https://en.wikipedia.org/wiki/LR_parser we would just have to scan one more token ahead to decide between unary and binary, and could preserve our simple shift-reduce steps...
So it seems doable for resolving the send.
More problematic would be the method declaration: if we have both a unary + and a binary +, we will need new syntax to mark the difference.

Whether it's worth it or not is another matter...





Re: Pharo and special unary selectors

Thierry Goubier
On 17/11/2017 at 10:14, Nicolas Cellier wrote:

> The idea of Torsten is more generic, any combination of binary char
> could be used.
>  From what I understand from https://en.wikipedia.org/wiki/LR_parser we
> would just have to scan one more token ahead for deciding if unary or
> binary, and could preserve our simple shift reduce steps...

The Smalltalk parsers being handwritten, there wouldn't be shift/reduces
to contend with, and, anyway, the lexer doesn't shift/reduce; it would
simply create a token up to the next separator (that is, gobble up everything
until the next space, CR, end-of-statement dot, closing parenthesis, etc.)

> So it seems doable for resolving the send.

Sort of. The parser difficulty would be this one:

anObject % print

Is this a binary selector with a print argument or two unary selectors?

Using the symbol table when you parse would solve it, but that is
certainly not context free...
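
For reference, this is how today's grammar settles the question. The sketch below asks the RB parser for the AST of that expression; it assumes RBParser class>>parseExpression: is available, as it is in a Pharo 6 image.

```smalltalk
| node |
node := RBParser parseExpression: 'anObject % print'.
node isMessage.                   "true, the whole expression is a single message send"
node selector.                    "#%, read as a binary selector"
node arguments first isVariable.  "true, print is read as a variable and not as a unary send"
```

A grammar that also allowed a unary % would have to pick between this tree and a chain of two unary sends, (anObject %) print, without any help from the source text.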

> More problematic would be the declaration of method, if we have both a
> unary + and a binary +, we will need new syntax for marking the difference.

In most cases, distinguishing between unary + declaration and binary +
declaration would be doable:

+ whatever

is the start of a binary selector

+ ^ self

is the start (or the declaration of) a unary selector.

But writing

+ self

can be interpreted either as a unary + whose body is self, or as the start of a binary +
method definition...

> Whether it's worth or not is another matter...

Well, one should probably try to implement the various parsers for that
(extend RB, extend the SmaCC Smalltalk parser, extend the Petit Parser)
to see how much complexity it would bring.

Coming up with strange interpretations that such a syntax would allow can
be helpful as well.

Regards,

Thierry




Re: Pharo and special unary selectors

Sven Van Caekenberghe-2
Parsing difficulty also means that it is harder for humans to understand and to explain to (new) users.

It would be pretty strange to have binary selectors that are unary, is my first reaction.




Re: Pharo and special unary selectors

Nicolas Cellier
In reply to this post by Thierry Goubier


2017-11-17 18:32 GMT+01:00 Thierry Goubier <[hidden email]>:
> On 17/11/2017 at 10:14, Nicolas Cellier wrote:
>> From what I understand from https://en.wikipedia.org/wiki/LR_parser we would just have to scan one more token ahead for deciding if unary or binary, and could preserve our simple shift reduce steps...
>
> The Smalltalk parsers being handwritten, there wouldn't be shift/reduces to contend with, and, anyway, the lexer doesn't shift/reduce; it would simply create a token up to the next separator (that is, gobble up everything until the next space, CR, end-of-statement dot, closing parenthesis, etc.)

I don't have an academic background, so I may be imprecise, but so far the hand-written parsers only have to read a single token ahead and build the parse node linearly; to me that was equivalent.

>> So it seems doable for resolving the send.
>
> Sort of. The parser difficulty would be this one:
>
> anObject % print

Yes, that's a severe limitation. Context free => it's a binary... Or we have to use ( ).
But then it's unfriendly to have different rules for unary symbols versus unary words...
It devalues the idea...

> Is this a binary selector with a print argument or two unary selectors?
>
> Using the symbol table when you parse would solve it, but that is certainly not context free...
>
>> More problematic would be the declaration of method, if we have both a unary + and a binary +, we will need new syntax for marking the difference.
>
> In most cases, distinguishing between unary + declaration and binary + declaration would be doable:
>
> + whatever
>
> is the start of a binary selector
>
> + ^ self
>
> is the start (or the declaration of) a unary selector.
>
> But writing
>
> + self
>
> can be interpreted either as a unary + whose body is self, or as the start of a binary + method definition...

Yes, I was thinking of similar issues...

+ b | c |

a binary plus with an unused temp and an implicit ^ self, or a unary + with a binary | sent to b with the unary (c |) as parameter, etc...
So we need a new syntax meaning that there is no parameter, like + ][ or anything yet unused...
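
As a sketch of how to check which reading the current grammar picks, one can hand that source to the RB parser and inspect the resulting method node. I would expect the first reading (binary + with argument b and one unused temporary), but that is an assumption to verify in the image; RBParser class>>parseMethod: and the accessors used here are the standard AST ones.

```smalltalk
| method |
method := RBParser parseMethod: '+ b | c |'.
method selector.                  "the selector recognised by today's grammar"
method arguments size.            "how many arguments the method pattern declares"
method body temporaries size.     "how many temporaries were declared"
method body statements isEmpty.   "whether anything is left over as a statement"
```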










Re: Pharo and special unary selectors

Thierry Goubier
In reply to this post by Sven Van Caekenberghe-2
On 17/11/2017 at 11:08, Sven Van Caekenberghe wrote:
> Parsing difficulty also means that it is harder for humans to understand, to explain to (new) users.

Agreed. This is a significant issue in some programming languages.

> It would be pretty strange to have binary selectors that are unary, is my first reaction.

Or that it could be both...

A unary +, then a binary +, maybe not in the same class...

Thierry




Re: Pharo and special unary selectors

Ben Coman
In reply to this post by Nicolas Cellier
If a valid Smalltalk method identifier contains only... [a-zA-Z][a-zA-Z0-9]*

then one option could be that unary symbols must touch the previous identifier, i.e. no intervening whitespace,

  100%   
  20$
  40€
  12‰ 
  portion%
  someMoney$


or pick a new "unary separator/binder" 

  100'%    or  100 '%
  20'$       or  20 '$
  40'€       or  40 '€
  12'‰     or  12 '‰ 
  portion'%
  someMoney'$


or re-use the existing colon which indicates an argument to the right,
to indicate an argument to the left....

  100 :%   
  20 :$
  40 :€
  12 :‰  (for promille)
  portion :%
  someMoney :$

which is unambiguous regarding block variable definitions since no messages are valid in the block variable definition area,
but this may complicate precedence semantics due to its similarity to a keyword selector, 
and would complicate things if the space was missing. 



On 18 November 2017 at 04:02, Nicolas Cellier <[hidden email]> wrote:


2017-11-17 18:32 GMT+01:00 Thierry Goubier <[hidden email]>:
Le 17/11/2017 à 10:14, Nicolas Cellier a écrit :


2017-11-17 17:40 GMT+01:00 Gabriel Cotelli <[hidden email] <mailto:[hidden email]>>:

    I would really like to see % removed as a binary selector and
    available to use in unary or keyword ones. The only implementor in a
    Pharo 6 image is:

      % aNumber

         ^ self \\ aNumber


+1, such alias has nothing to do in Kernel

    So it's juts aliasing \\ , and % most widespread usage in the real
    world is por percentages (the use in modular arithmetic is more a
    programming thing that math notation I think).

    And for allowing more Unicode code points for selector names I'm
    totally in for Symbols, Arrows, Math Symbols, etc... We just need to
    analyse the ones that makes sense as binary selectors, and the ones
    allowed for unary and keyword ones. This will allow us to write
    pretty cool DSLs.

    Just my opinion.

This could also be the case for punctuation like ! and ?

The idea of Torsten is more generic, any combination of binary char could be used.
 From what I understand from https://en.wikipedia.org/wiki/LR_parser we would just have to scan one more token ahead for deciding if unary or binary, and could preserve our simple shift reduce steps...

The Smalltalk parsers being handwritten, there wouldn't be shift/reduces to contend with, and, anyway, the lexer doesn't shift/reduce; it would simply creates a token up to the next separator (that is goble up the next space/cr/end of line dot/closing parenthesis/etc...)

I don't have academical cursus, so I may be approximate, but the manually written parsers just have to read a single token ahead so far, and linearly build the parseNode, to me it was equivalent.


So it seems doable for resolving the send.

Sort of. The parser difficulty would be this one:

anObject % print

Yes, that's a severe limitation. Context free => it's a binary... Or we have to use ( ). 
But then it's unfriendly to have different rules for unary symbols versus unary words...
It devaluates the idea... 


Is this a binary selector with a print argument or two unary selectors?

Less ambiguous...
anObject'% print
anObject '% print
 
 

Using the symbol table when you parse would solve it, but that is certainly not context free...

More problematic would be the declaration of method, if we have both a unary + and a binary +, we will need new syntax for marking the difference.


In most cases, distinguishing between unary + declaration and binary + declaration would be doable:

+ whatever

is the start of a binary selector

+ ^ self

is the start (or the declaration of) an unary selector.

But writing

+ self

Can be interpreted as either unary plus doing self, or binary + method definition...

a "unary binder" could distinguish them...

binary method definition...
+ self

unary method definition...
'+ self

  
 issues...

+ b | c |

a binary plus with unused temp and implicit ^self, or unary + with binary | sent to b with unary (c |) parameter etc...

+ b | c |
binary plus, unused temporary c

'+ b | c |
unary plus, first | is binary, second | is unary   
or maybe it would need to be...   '+ b | c'|

 
So we need a new syntax meaning that there is no parameter, like + ][ or anything yet unused...

without ' 

+ || b | c |
empty local variable definition,  next | is binary, second | is unary   

also with local variable...
+ | lv1 | b | c |
+ | lv1 | b | c'|

 



Whether it's worth it or not is another matter...

Well, one should probably try to implement the various parsers for that (extend RB, extend the SmaCC Smalltalk parser, extend the Petit Parser) to see how much complexity it would bring.

Coming up with strange interpretations one could make with that syntax can be helpful as well.

Regards,

Thierry

    On Fri, Nov 17, 2017 at 6:32 AM, Torsten Bergmann <[hidden email]
    <mailto:[hidden email]>> wrote:

        Hi,

        just something to think about: one thing I always liked about
        Smalltalk is that it allows for nice DSL's. We have nice things
        like a unit framework in Pharo, ...

        In the most simple case one can easily implement own units just
        by providing a unary messages:

          1 m
          1 second
          1 px
          1 EUR

        One can easily implement an own Money class with a currency and
        then do polymorphic tricks like

           10 EUR + 20 EUR

btw, did the recently announced QA Release Tests add enforcement that all selectors start with a lowercase letter?
I felt that one was a bit overly restrictive, and it would break such a currency DSL.

cheers -ben





Re: Pharo and special unary selectors

Thierry Goubier
In reply to this post by Nicolas Cellier


On 17 Nov 2017 1:03 PM, "Nicolas Cellier" <[hidden email]> wrote:


2017-11-17 18:32 GMT+01:00 Thierry Goubier <[hidden email]>:
On 17/11/2017 at 10:14, Nicolas Cellier wrote:


2017-11-17 17:40 GMT+01:00 Gabriel Cotelli <[hidden email] <mailto:[hidden email]>>:

    I would really like to see % removed as a binary selector and
    available to use in unary or keyword ones. The only implementor in a
    Pharo 6 image is:

      % aNumber

         ^ self \\ aNumber


+1, such an alias has no business being in Kernel

    So it's just aliasing \\ , and the most widespread usage of % in the real
    world is for percentages (its use in modular arithmetic is more a
    programming thing than math notation, I think).

    And for allowing more Unicode code points for selector names I'm
    totally in for Symbols, Arrows, Math Symbols, etc... We just need to
    analyse the ones that make sense as binary selectors, and the ones
    allowed for unary and keyword ones. This will allow us to write
    pretty cool DSLs.

    Just my opinion.
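
A small sketch of what the quoted alias does, assuming an image in which Number>>% is still defined as shown above (in an image without it, the second line would not be understood):

```smalltalk
"% simply forwards to \\, which is modulo with floored division."
10 \\ 3.     "1"
10 % 3.      "1, only because % forwards to \\"
-7 \\ 2.     "1: the result takes the sign of the divisor"
-7 rem: 2.   "-1: rem: truncates toward zero instead"
```

Removing the alias would therefore lose no behaviour; existing senders could be rewritten to use \\ directly.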

This could also be the case for punctuation like ! and ?

The idea of Torsten is more generic, any combination of binary char could be used.
 From what I understand from https://en.wikipedia.org/wiki/LR_parser we would just have to scan one more token ahead for deciding if unary or binary, and could preserve our simple shift reduce steps...

The Smalltalk parsers being handwritten, there wouldn't be shift/reduces to contend with, and, anyway, the lexer doesn't shift/reduce; it would simply create a token up to the next separator (that is, gobble up everything until the next space/cr/end-of-line dot/closing parenthesis/etc...)

I don't have an academic background, so I may be imprecise, but the manually written parsers have only had to read a single token ahead so far and build the parse node linearly; to me that was equivalent.

What I meant is that at the tokenizer stage, the parser already combines characters into a single token. So yes, the parser uses only a single-token lookahead, but a token can span multiple characters.


So it seems doable for resolving the send.

Sort of. The parser difficulty would be this one:

anObject % print

Yes, that's a severe limitation. Context free => it's a binary... Or we have to use ( ).
But then it's unfriendly to have different rules for unary symbols versus unary words...
It devalues the idea...

I agree.



Is this a binary selector with a print argument or two unary selectors?

Using the symbol table when you parse would solve it, but that is certainly not context free...

More problematic would be the declaration of method, if we have both a unary + and a binary +, we will need new syntax for marking the difference.

In most cases, distinguishing between unary + declaration and binary + declaration would be doable:

+ whatever

is the start of a binary selector

+ ^ self

is the start (or the declaration of) a unary selector.

But writing

+ self

Can be interpreted as either unary plus doing self, or binary + method definition...

Yes I was thinking of similar issues...

+ b | c |

a binary plus with unused temp and implicit ^self, or unary + with binary | sent to b with unary (c |) parameter etc...

I haven't seen this one coming.

So we need a new syntax meaning that there is no parameter, like + ][ or anything yet unused...

It starts to look very convoluted. Yuck.

Thierry





Re: Pharo and special unary selectors

Pharo Smalltalk Developers mailing list
In reply to this post by Ben Coman
Personally, I am against adding those special symbols.  They add close to nothing (except complexity in the parser) to what we can actually do!

Besides, what does 30$ + 17$ add up to?  Oh!  Did I tell you it was actually $30USD + $17CAN ? :)


-----------------
Benoît St-Jean
Yahoo! Messenger: bstjean
Twitter: @BenLeChialeux
Pinterest: benoitstjean
Instagram: Chef_Benito
IRC: lamneth
Blogue: endormitoire.wordpress.com
"A standpoint is an intellectual horizon of radius zero".  (A. Einstein)


On Friday, November 17, 2017, 6:41:11 PM EST, Ben Coman <[hidden email]> wrote:


If a valid Smalltalk method identifier contains only... [a-zA-Z][a-zA-Z0-9]*

then one option could be that unary symbols must touch the previous identifier, i.e. no intervening whitespace,

  100%   
  20$
  40€
  12‰ 
  portion%
  someMoney$


or pick a new "unary separator/binder" 

  100'%    or  100 '%
  20'$       or  20 '$
  40'€       or  40 '€
  12'‰     or  12 '‰ 
  portion'%
  someMoney'$


or re-use the existing colon which indicates an argument to the right,
to indicate an argument to the left....

  100 :%   
  20 :$
  40 :€
  12 :‰  (for promille)
  portion :%
  someMoney :$

which is unambiguous regarding block variable definitions since no messages are valid in the block variable definition area,
but this may complicate precedence semantics due to its similarity to a keyword selector, 
and would complicate things if the space was missing. 
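
The first option (a unary symbol must touch what precedes it) can be sketched as a scanner-level check. This is a hypothetical rule written as a Playground snippet, not code from any existing Pharo parser; the block name touchesPrevious is made up for illustration:

```smalltalk
"Hypothetical rule: a binary character that directly follows a
 non-separator character would be read as a postfix (unary) selector."
| touchesPrevious |
touchesPrevious := [ :source :index |
	index > 1 and: [ (source at: index - 1) isSeparator not ] ].

touchesPrevious value: '100%' value: 4.      "true: would be read as the unary send 100 %"
touchesPrevious value: '100 % 7' value: 5.   "false: stays a binary send"
touchesPrevious value: 'a%b' value: 2.       "true: binary sends written without spaces today would change meaning"
```

The last case shows the cost of this variant: existing code that omits the space before a binary selector would be read differently.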





Re: Pharo and special unary selectors

Peter Uhnak
I just randomly ran into this project https://github.com/ba-st/aconcagua which seems to be designed around such unary selectors.

On Sat, Nov 18, 2017 at 12:55 AM, Benoit St-Jean via Pharo-dev <[hidden email]> wrote:


---------- Forwarded message ----------
From: Benoit St-Jean <[hidden email]>
To: Pharo Development List <[hidden email]>
Date: Fri, 17 Nov 2017 23:55:56 +0000 (UTC)
Subject: Re: [Pharo-dev] Pharo and special unary selectors
Personally, I am against adding those special symbols.  They add close to nothing (except complexity in the parser) to what we can actually do!

Besides, what does 30$ + 17$ add up to?  Oh!  Did I tell you it was actually $30USD + $17CAN ? :)

