Catching EOF in SmaCC

Prof. Andrew P. Black
Is there a way, in SmaCC, of either

        - writing a grammar production that involves EOF (end of file), or
        - writing a scanner action that is executed when EOF is read?

        Andrew


Re: Catching EOF in SmaCC

Thierry Goubier
Hi Andrew,

There is an 'E O F' token generated by SmaCC; I haven't tried to use it
in a parser yet.

The second option (a scanner action at EOF) is used in the Python2 parser. See:

https://github.com/ThierryGoubier/SmaCC/blob/master/SmaCC-Python.package/PythonScanner2.class/instance/scannerError.st

Regards,

Thierry

On 17/11/2017 at 04:11, Prof. Andrew P. Black wrote:
> Is there a way, in SmaCC, of either
>
> - writing a grammar production that involves EOF (end of file), or
> - writing a scanner action that is executed when EOF is read?
>
> Andrew
>
>
>


Re: Catching EOF in SmaCC

Prof. Andrew P. Black

> On 17 Nov 2017, at 14:10, Thierry Goubier <[hidden email]> wrote:
>
>
> there is an 'E O F' token generated by SmaCC; I haven't tried to use it in a parser yet.

I tried patching the tokenActions table to trap on this, but the token id for E O F is outside of the range of the table.   The Python example that you pointed me to is a little different.  It overrides scannerError, and explicitly adds a newline token if there is an error at the end of the file.  It doesn’t actually use the E O F token, but it is probably a pattern that I can steal.
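
A minimal sketch of that pattern, written in Python rather than Smalltalk and not using any SmaCC API: the toy scanner below supplies a synthetic newline token when the input ends without a statement separator, so the grammar can keep requiring one. It is a simplification (it acts on reaching the end of input, not on a scanner error), and `Token`, `TOKEN_RE` and `scan` are hypothetical names made up for this illustration.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # "name", "number", "op", "newline" or "eof"
    text: str


# Hypothetical toy token set for illustration; not SmaCC's.
TOKEN_RE = re.compile(r"""
    (?P<newline>\n|;)
  | (?P<number>\d+)
  | (?P<name>[A-Za-z_]\w*)
  | (?P<op>[=+\-*/()])
  | (?P<skip>[ \t]+)
""", re.VERBOSE)


def scan(source: str) -> Iterator[Token]:
    """Yield tokens, adding a synthetic separator at end of input if needed."""
    pos = 0
    last_kind = "newline"  # empty input needs no synthetic separator
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind == "skip":
            continue  # whitespace between tokens is dropped
        last_kind = kind
        yield Token(kind, match.group())
    # End of input: supply the separator the grammar expects, if it is missing.
    if last_kind != "newline":
        yield Token("newline", "")
    yield Token("eof", "")


for token in scan("x = 1"):
    print(token)
```

With this in place, `scan("x = 1")` and `scan("x = 1\n")` produce the same sequence of token kinds, so the parser never sees an unterminated final statement.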

In the meantime, I made the final StatementSeparator (<newline> or ";") optional in all the productions.  The grammar is a bit ugly, but the parser is cleaner.

I also gave up trying to eliminate intermediate parseTree nodes.  Instead, I eliminated intermediate productions from the grammar.  This makes the grammar uglier (it has several repetitions where I inlined the intermediate productions), but the tree construction is a lot more straightforward.

        Andrew


Re: Catching EOF in SmaCC

Thierry Goubier
Hi Andrew,

On 17/11/2017 at 12:26, Prof. Andrew P. Black wrote:
>
>> On 17 Nov 2017, at 14:10, Thierry Goubier <[hidden email]> wrote:
>>
>>
>> there is an 'E O F' token generated by SmaCC; I haven't tried to use it in a parser yet.
>
> I tried patching the tokenActions table to trap on this, but the token id for E O F is outside of the range of the table.   The Python example that you pointed me to is a little different.  It overrides scannerError, and explicitly adds a newline token if there is an error at the end of the file.  It doesn’t actually use the E O F token, but it is probably a pattern that I can steal.

In all honesty, I wasn't thinking of that, but rather of being able to
write '<eof>' in the grammar itself to terminate statements.

The Python approach is necessary because you may have to emit additional
dedent tokens at the end of a file (a typical issue with languages where
indentation whitespace is meaningful: an idea used at the very
beginning of programming languages, then considered harmful, and now
coming back again...).
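
For illustration, CPython's own tokenizer shows the same end-of-file behaviour. The sketch below uses the standard `tokenize` module (not SmaCC, and not the PythonScanner2 code linked earlier in the thread) on a source string that stops inside two nested blocks with no trailing newline. The helper name `token_names` is made up for this example.

```python
import io
import tokenize


def token_names(source: str) -> list[str]:
    """Return the token type names CPython's tokenizer produces for source."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


# Two nested blocks, and the input stops without a final newline.
SOURCE = "if a:\n    if b:\n        c = 1"

names = token_names(SOURCE)
# The tokenizer has to close both open blocks itself once the input runs out.
print(names[-3:])
```

The last three tokens are two DEDENTs followed by ENDMARKER: none of them corresponds to any character in the input, which is why the scanner needs a hook that runs at end of file.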

>
> In the meantime, I made the final StatementSeparator (<newline> or ";") optional in all the productions.  The grammar is a bit ugly, but the parser is cleaner.

That is the cleanest way to do it: at least that way the workaround is
documented in the grammar, instead of carrying around a grammar plus
hacks in the scanner. (*)

> I also gave up trying to eliminate intermediate parseTree nodes.  Instead, I eliminated intermediate productions from the grammar.  This makes the grammar more ugly (it has several repetitions where I inlined the intermediate productions), but the
> tree construction is a lot more straightforward.

Sorry for having been unable to answer your questions on that :( I'm
happy to learn you've found a way around it.

Thierry

(*) Which is still way better than a hand-written, recursive descent
parser where any line can hide a hack...

> Andrew
>
>
>