Another case of RB Parser ~~ Dolphin parser

Chris Uppal-3
Apologies if this is already recorded as one of the known differences -- I
don't remember it being mentioned before.

The following expression is not recognised as valid syntax by the RB parser,
but is valid according to Dolphin, and (IIRC) ANSI.

    [:x ||| x]

For instance, try:

    SmalltalkParser parseExpression: '[:x ||| x]'

The actual problem is in SmalltalkParser>>parseBlockArgsInto: which has a hack
for the next token being #|| but does not look for #|||

    -- chris


Re: Another case of RB Parser ~~ Dolphin parser

Blair McGlashan
"Chris Uppal" <[hidden email]> wrote in message
news:[hidden email]...
> Apologies if this is already recorded as one of the known differences -- I
> don't remember it being mentioned before.
>
> The following expression is not recognised as valid syntax by the RB parser,
> but is valid according to Dolphin, and (IIRC) ANSI.
>
>     [:x ||| x]
>
> For instance, try:
>
>     SmalltalkParser parseExpression: '[:x ||| x]'
>
> The actual problem is in SmalltalkParser>>parseBlockArgsInto: which has a hack
> for the next token being #|| but does not look for #|||

Thanks Chris, you are correct that this is inconsistent between the two
parsers. As for the syntax, the standard states:

1) "... Each token is to be recognised as the longest string of characters
that is syntactically valid, except where otherwise specified. Unless
otherwise specified, white space or another separator must appear between
any two tokens if the initial characters of the second token would be a
valid extension of the first token." (Section 3.5 Lexical Grammar)
2) "A <temporary variable list> may be empty, containing no identifiers. In
this case the enclosing vertical bars may be immediately adjacent with no
intervening white space." (Section 3.4.2 Method Definition)
3) "If any block arguments are present, the final block argument is followed
by a vertical bar ("|"). If a <temporaries> clause is present then the first
temporary variable is preceded by a vertical bar. A vertical bar that
terminates a sequence of block arguments may be immediately adjacent (with
no intervening white space) to the vertical bar that initiates a
<temporaries> clause." (Section 3.4.4 blocks)

As I understand it, you are correct that this is syntactically valid
according to ANSI, although only because (2) and (3) make explicit
exceptions to (1). So it probably does need a hack in the RB parser to
accommodate this exceptional case.
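
To make the interaction between (1) and the exceptions in (2) and (3) concrete, here is a minimal sketch in Python. It is a toy model only: the regular expression, `tokenize` and `split_bars_after_block_args` are hypothetical names invented for illustration, and none of this is the RB or Dolphin scanner. It shows that longest-match scanning turns `|||` into a single binary-selector token, and that a special case after the last block argument is what recovers the three separate bars.

```python
import re

# Toy lexer (an assumption for illustration, not any real Smalltalk scanner):
# block argument names (":x"), identifiers, brackets, and runs of
# binary-selector characters, each matched as long as possible.
TOKEN = re.compile(
    r"\s*(:[A-Za-z_]\w*|[A-Za-z_]\w*|[\[\]]|[|&+\-*/<>=~@%,?!\\]+)"
)


def tokenize(source):
    """Split source into tokens using longest match, as rule (1) requires."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def split_bars_after_block_args(tokens):
    """Model the exceptions in (2) and (3).

    A token made only of vertical bars that directly follows a block
    argument is split into single bars; everywhere else it is left alone
    and stays an ordinary binary selector.
    """
    result = []
    for token in tokens:
        follows_block_arg = bool(result) and result[-1].startswith(":")
        if follows_block_arg and len(token) > 1 and set(token) == {"|"}:
            result.extend("|" * len(token))  # one list entry per bar
        else:
            result.append(token)
    return result


if __name__ == "__main__":
    raw = tokenize("[:x ||| x]")
    print(raw)
    print(split_bars_after_block_args(raw))
    print(split_bars_after_block_args(tokenize("a ||| b")))
```

In this model the first bar would then close the block arguments and the remaining two would form an empty temporaries list, while `a ||| b` outside a block header is untouched.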

Here's another interesting case:
    [ | || ]        (i.e. an empty block argument list)

This is not regarded as valid by the compilers I checked, but does appear to
be valid according to the standard's BNF. I think this could be put down to
an error in the BNF though, i.e.
    <block body> ::= [<block argument>* '|'] [<temporaries>] [<statements>]
Should in fact be:
    <block body> ::= [<block argument>+ '|'] [<temporaries>] [<statements>]
since the existing BNF is somewhat at odds with (3) above.

Regards

Blair


Re: Another case of RB Parser ~~ Dolphin parser

Chris Uppal-3
Blair,

> Here's another interesting case:
>     [ | || ]        (i.e. an empty block argument list)
>
> This is not regarded as valid by the compilers I checked, but does appear
> to be valid according to the standard's BNF. I think this could be put
> down to an error in the BNF though, i.e.
>     <block body> ::= [<block argument>* '|'] [<temporaries>] [<statements>]
> Should in fact be:
>     <block body> ::= [<block argument>+ '|'] [<temporaries>] [<statements>]
> since the existing BNF is somewhat at odds with (3) above.

I think it /has/ to be an error in the BNF -- it's unparseable as it stands.

For instance:

    [ | one | two ]

is ambiguous. It's clearly a block with no arguments, but does it have a
temporary called "one" and the simple expression

    two

as its body, or is the first "|" just the end of an empty parameter list,
followed by the expression:

    one | two

;-)

    -- chris


Re: Another case of RB Parser ~~ Dolphin parser

John Brant
Blair McGlashan wrote:

> Here's another interesting case:
>     [ | || ]        (i.e. an empty block argument list)
>
> This is not regarded as valid by the compilers I checked, but does appear to
> be valid according to the standard's BNF. I think this could be put down to
> an error in the BNF though, i.e.
>     <block body> ::= [<block argument>* '|'] [<temporaries>] [<statements>]
> Should in fact be:
>     <block body> ::= [<block argument>+ '|'] [<temporaries>] [<statements>]
> since the existing BNF is somewhat at odds with (3) above.

I believe it is an error. Otherwise, how do we parse "[ | a | true ]"?
Is the first "|" the start of the temporaries, or is it the end of the block
arguments? If it is the start of the temporaries, then this is
equivalent to "[true]" (a is unused). If it is the end of the block
arguments, then it would be equivalent to "[a | true]".
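
The ambiguity can be checked mechanically with a small sketch in Python. Everything here is a toy model under stated assumptions: `parses`, `temporaries_options` and `is_expression` are hypothetical helpers, the input is the token list between the square brackets with adjacent bars already split, and "statements" is limited to identifiers joined by the binary selector `|`. The flag `allow_empty_args` switches between the published `*` rule and the proposed `+` rule.

```python
def is_expression(tokens):
    """True for an empty body or identifiers separated by the selector '|'."""
    if not tokens:
        return True
    if len(tokens) % 2 == 0:
        return False
    return all(
        tok.isidentifier() if k % 2 == 0 else tok == "|"
        for k, tok in enumerate(tokens)
    )


def temporaries_options(tokens, start):
    """Every way to read an optional '| names |' clause beginning at start."""
    options = [([], start)]  # the clause is simply absent
    if start < len(tokens) and tokens[start] == "|":
        end = start + 1
        while end < len(tokens) and tokens[end].isidentifier():
            end += 1
        if end < len(tokens) and tokens[end] == "|":
            options.append((tokens[start + 1:end], end + 1))
    return options


def parses(tokens, allow_empty_args):
    """Return all readings as (arguments, temporaries, statement tokens).

    allow_empty_args=True models  [<block argument>* '|'],
    allow_empty_args=False models [<block argument>+ '|'].
    """
    i, args = 0, []
    while i < len(tokens) and tokens[i].startswith(":"):
        args.append(tokens[i])
        i += 1

    candidates = []
    if not args:
        candidates.append(([], 0))  # argument clause omitted entirely
    if i < len(tokens) and tokens[i] == "|" and (args or allow_empty_args):
        candidates.append((args, i + 1))  # clause present, bar consumed

    readings = []
    for block_args, start in candidates:
        for temps, rest in temporaries_options(tokens, start):
            if is_expression(tokens[rest:]):
                readings.append((block_args, temps, tokens[rest:]))
    return readings


if __name__ == "__main__":
    body = ["|", "a", "|", "true"]  # the inside of [ | a | true ]
    for reading in parses(body, allow_empty_args=True):
        print("with *:", reading)
    for reading in parses(body, allow_empty_args=False):
        print("with +:", reading)
```

Under these assumptions the `*` rule gives two readings of `[ | a | true ]` (a temporary `a` with body `true`, or no temporaries with body `a | true`), while the `+` rule leaves only the first. The same model accepts `[ | || ]` under `*` and rejects it under `+`.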


John Brant


Re: Another case of RB Parser ~~ Dolphin parser

jas
Blair McGlashan wrote:

>
> Here's another interesting case:
>     [ | || ]        (i.e. an empty block argument list)
>
> This is not regarded as valid by the compilers I checked, but does appear to
> be valid according to the standard's BNF. I think this could be put down to
> an error in the BNF though, i.e.
>     <block body> ::= [<block argument>* '|'] [<temporaries>] [<statements>]

  The BNF would allow the above,
  but would also dis-allow [],
  and [ true ], among many other
  handy, and common, things.

  So the above BNF ...

> Should in fact be:
>     <block body> ::= [(<block argument>+ '|')?] [<temporaries>] [<statements>]


Regards,

-cstb


Never mind. (was Re: Another case of RB Parser ~~ Dolphin parser)

jas
Yes, I see now that the square brackets
are being used to signify [optional].

Whereas the rest of the world seems
to use {optional} or (optional)?

Silly me.

-cstb



jas wrote:

>
> Blair McGlashan wrote:
>
> >
> > Here's another interesting case:
> >     [ | || ]        (i.e. an empty block argument list)
> >
> > This is not regarded as valid by the compilers I checked, but does appear to
> > be valid according to the standard's BNF. I think this could be put down to
> > an error in the BNF though, i.e.
> >     <block body> ::= [<block argument>* '|'] [<temporaries>] [<statements>]
>
>   The BNF would allow the above,
>   but would also dis-allow [],
>   and [ true ], among many other
>   handy, and common, things.
>
>   So the above BNF ...
>
> > Should in fact be:
> >     <block body> ::= [(<block argument>+ '|')?] [<temporaries>] [<statements>]
>
> Regards,
>
> -cstb


Re: Never mind. (was Re: Another case of RB Parser ~~ Dolphin parser)

Blair McGlashan
"jas" <[hidden email]> wrote in message news:[hidden email]...
> Yes, I see now that the square brackets
> are being used to signify [optional].
>

Indeed, as popularised by Niklaus Wirth I think.

> Whereas the rest of the world seems
> to use {optional} or (optional)?

They do? I believe the ISO standard for Extended-BNF specifies square
brackets for optional symbols. In any case there are many variants of EBNF,
so the standard does describe the variant it uses. Although the published
standard has to be purchased, a copy of the final draft is downloadable from
http://www.object-arts.com/Lib/Downloads/AnsiSmalltalkDraft1-9.pdf

Regards

Blair