OK, I know we've been going on and on about OMeta recently but this one
has a Smalltalk-centric twist: it really just struck me that sure, OMeta can parse strings like nobody's business, but one of its claims to fame is that it is actually a fully object oriented *object* parser. So forgetting about all applications involving strings, what real world uses can you think of where an object parser might come in handy? You have all of the capabilities that a parser brings to bear in terms of pattern matching and being able to speculatively introspect objects in its search, but on any arbitrary collection or stream of objects. Keep in mind that anything coming in over the wire (i.e. network) has already been serialized so reconstituting it as an object graph seems inefficient and of questionable value if the sole purpose in doing so is to perform a task you could have easily done in it's default text/binary form... but maybe I'm looking at it wrong. The only thing that really jumps out at me is locally generated time-series data (be it midi, audio, video, whatever) where you are trying to prototype a solution without an existing framework and/or experiment with with what the framework you are developing should look like? Another possibility (and I'm reaching here) is maybe some sort of funky resource caching/scheduling scheme? Would the value just be at the investigative/prototyping stage or can you think of reasons why it would make sense to stick with a parser-based approach longer term? _______________________________________________ Cuis mailing list [hidden email] http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org |
On 5/22/15, 1:49 AM, "Phil (list)" <[hidden email]> wrote:
> So forgetting about all applications involving strings, what real world
> uses can you think of where an object parser might come in handy?

I am using a serialized ReferenceStream saved as .obj, which I use for forced compatibility between Squeak, Pharo and Cuis. It works for roughly compatible classes, but we could use it to map objects between forks. See the attached for Squeak 4.6.

I use the SqueakLight idea of Object lookForClass: so that a DNU which isn't smart enough looks in a class repository, which could be created in Dropbox or Git or whatever external place, for any fork.

See

http://squeakros.org/4dot6
http://squeakros.org/4dot5
http://squeakros.org/4dot4
http://squeakros.org/3dot10
http://squeakros.org/3dot9
http://squeakros.org/3dot8
http://squeakros.org/Pharo2dot0

Combining this rough idea with OMeta, I think we could maybe make smart export/import between forks.

Edgar

ObjectEnhEd.3.cs (23K)
|
In reply to this post by Phil B
Hi all,
first post here about Cuis, and this is a question I am interested in... I do believe the Viewpoints Research Institute documents have a few answers about that (parsers for network protocols, etc.). But still... I'm in a strange position about OMeta, which is that I don't see the benefits. I have the same position about PetitParser, but with even worse data points, in that I know precisely the performance loss of going the PetitParser way.

I have been writing compiler front-ends for the past 7 years, first with Flex / Bison and C, and then with Smalltalk / SmaCC (I maintain SmaCC for Pharo). I have seen the work done by John Brant and Don Roberts first hand (RB, SmaCC, generalised refactoring in SmaCC), and I know that both OMeta and PetitParser use what is, for me, a very limited form of parsing, with additionally a large performance penalty. Moreover, grammars produced in the PetitParser case are as long as, if not longer than, the equivalent SmaCC grammar.

So what are the benefits of OMeta? Note that SmaCC would very easily do parsing over any kind of objects, not only tokens.

Thierry

2015-05-22 6:49 GMT+02:00 Phil (list) <[hidden email]>: OK, I know we've been going on and on about OMeta recently but this one
|
In reply to this post by Edgar De Cleene
On Fri, 2015-05-22 at 03:57 -0300, Edgar J. De Cleene wrote:
> I using a serialized ReferenceStream saved as .obj which I use for forced
> compatibility between Squeak. Pharo and Cuis.
> [...]
> Combining this rough idea with OMeta think maybe could made smart
> export/import between forks.

Interesting idea. Keep us posted on how it goes!

> Edgar
|
In reply to this post by Thierry Goubier
Hi Thierry,
On Fri, 2015-05-22 at 09:56 +0200, Thierry Goubier wrote:
> first post here about Cuis, and this is a question I am interested
> in... I do believe the viewpoint institutes documents have a few
> answers about that (parsers for network protocols, etc...). But
> still...

I could see network protocols as another time-series application.

> I'm in a strange position about OMeta which is I don't see the
> benefits. I do have the same position about PetitParser, but with even
> worse data points which is I know precisely the performance loss of
> going the petit parser way.

Not strange at all given where it sounds like you're coming from re: performance being a key requirement. I won't try to put any spin on it: everything I've seen indicates that OMeta is among the slowest parsers out there, though pretty quick given its approach. Computing power being what it is today, for many applications the response is 'it's fast enough' or 'who cares?' (see the World Wide Web, client- and server-side, for a perfect example). I would imagine that if you have heavy data processing workloads or very specific response-time requirements, then you do care and OMeta wouldn't work for the application. However, as a language for DSLs, at most you're typically only going to see a small fraction of a second of overhead. Another way to think of it: if speed OF the solution is the priority, don't use OMeta. If speed TO the solution is the priority, that's what OMeta does well. I'll get more specific below...

> I have been writing compiler front-ends for the past 7 years, first
> with Flex / Bison and C, and then with Smalltalk / SmaCC (I maintain
> SmaCC for Pharo). [...] Moreover, grammars produced in the PetitParser
> case are as long, if not longer than the equivalent SmaCC grammar.

I believe this is one of the areas where OMeta is quite strong: its grammars are short... very short... 'where did the grammar go?' short. Consider this example I posted earlier to parse Squeak array constructors. Here is the Smalltalk version (i.e. what OMeta is actually doing behind the scenes):

arrayConstr
	^ self ometaOr: {
		[true ifTrue: [
			self apply: #token withArgs: {'{'}.
			self apply: #expr.
			self many: [true ifTrue: [
				self apply: #token withArgs: {'.'}.
				self apply: #expr]].
			self ometaOr: {
				[self apply: #token withArgs: {'.'}].
				[self apply: #empty]}.
			self apply: #token withArgs: {'}'}]].
		[true ifTrue: [
			self apply: #token withArgs: {'{'}.
			self apply: #token withArgs: {'}'}]]}

and here's the OMeta version:

arrayConstr =
	"{" expr ("." expr)* ("." | empty) "}"
	| "{" "}"

The only things that are missing are any semantic predicates and actions, so the ultimate size and readability will be dictated more by how much Smalltalk code it takes to actually do the work with what OMeta has parsed.

> So what are the benefits of OMeta? Note that SmaCC would very easily
> do parsing over any kind of objects, not only tokens.

I understand that OMeta isn't unique in being an object parser, and I started this thread mainly because I'm wondering how much value people can see in parsing things other than text/binary streams. i.e. is it a genuinely useful feature or a gimmick/freebie that won't see much use?

As to the first part of your question, here goes: the fundamental concept that really grabs me is the OMeta approach of being written in the host language and using source-to-source translation to target the host language, while essentially hijacking the host language and environment to fade into the background of the host environment. Want a new DSL? Subclass OMeta2 and add methods with your rules... done. Want a new dialect of said DSL? Subclass your first DSL and tweak as needed. Want to write a program in your DSL? Create a new class and set up the compiler for that class to use your parser as its 'Language'. For example, I could create a subclass called Lisp and write every method in that class as either pure Lisp or as a hybrid of Lisp/Smalltalk/any other DSLs I had created, provided I set up the parsing correctly. I'm not aware of any other parser that does it quite so elegantly.

Now here are the downsides: Alex, the original author of OMeta, is a parser/languages guy. This work was related to his employment at VPRI and subsequent PhD work. He's since moved on to other things, and there's still a lot missing from OMeta on Smalltalk in terms of tooling to actually realize the vision. The lack of debugging support will drive you nuts until you get used to what it's telling you: have a syntax error in your rules? '<-- parse error around here -->'... have fun! A semantic error in your parser? Get used to looking at your decompiled code (i.e. the actual Smalltalk it generates) when things go wrong to figure it out. Have a logic/runtime error (i.e. your generated code is sending a message to nil)? Ditto re: looking at the decompiled code when it crashes while running. When everything is correct and working, OMeta is pure joy. When it isn't, welcome back to 1980's-style debugging. Also, if you have an ambiguous grammar, look elsewhere... OMeta won't work for you. Finally, as I mentioned at the top, OMeta isn't going to set any new parser speed records.

Hope this helps,
Phil
|
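(A minimal sketch of the subclass-a-DSL workflow Phil describes above, with hypothetical class, category and rule names; the exact class-definition message and rule-method syntax may differ between OMeta2 versions and Squeak/Cuis dialects.)

```smalltalk
"Hypothetical: a tiny DSL obtained by subclassing OMeta2.  Each rule
 is an ordinary method whose body is OMeta syntax, translated to
 Smalltalk by OMeta's own compiler."
OMeta2 subclass: #ColorExprParser
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'OMeta2-Examples'

"An example rule method (OMeta syntax, not plain Smalltalk):"
color = "red" -> [Color red]
      | "blue" -> [Color blue]
```

A dialect would then be `ColorExprParser subclass: #ExtendedColorExprParser` overriding or adding rules, which is where the 'fade into the host environment' quality comes from.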
On 5/22/15, Phil (list) <[hidden email]> wrote:
> Hi Thierry,
>
> On Fri, 2015-05-22 at 09:56 +0200, Thierry Goubier wrote:
>> first post here about Cuis,

Welcome, Thierry!

> [...] Another way to think of it: if speed
> OF the solution is the priority, don't use OMeta. If speed TO the
> solution is the priority, that's what OMeta does well.

Speed is an issue but not the only one. We need an in-depth comparison.

>> I have been writing compiler front-ends for the past 7 years, first
>> with Flex / Bison and C, and then with Smalltalk / SmaCC (I maintain
>> SmaCC for Pharo).

Great. Where do you have the repository?

Currently OMeta 2 is in focus, but later I think we need all of the parsers ported to Cuis for comparison.

> I believe this is one of the areas where OMeta is quite strong: its
> grammars are short... very short... 'where did the grammar go?' short.

Yes, the short grammars are a strong point. Compact and readable.

>> So what are the benefits of OMeta? Note that SmaCC would very easily
>> do parsing over any kind of objects, not only tokens.

Can you elaborate on this?
|
In reply to this post by Phil B
Hi Phil,
thanks for your input. I have a few things to add, which are, in summary, that OMeta's concise examples are just parsing grammar productions as they could appear elsewhere (Bison/SmaCC)... Certainly beats hand-written parsing :)

Another part of the OMeta use is as a kind of code rewriter / generator, and I wonder about the usefulness of yet another layer of rewriting (as in aspects / template-based meta programming)...

Le 22/05/2015 21:49, Phil (list) a écrit :
> I could see network protocols as another time-series application.

I'm not in the know about that, but I would guess from past discussions that people dealing with network and communication protocols would like to see automatons... and parsing is a way to build an automaton.

> and here's the OMeta version:
>
> arrayConstr =
>	"{" expr ("." expr)* ("." | empty) "}"
>	| "{" "}"

But this is just as it would be in the SmaCC grammar for that... This is what I added to the SmaCC St grammar to support arrays:

| "{" StatementList OptionalPeriod "}" { RBArrayNode statements: '2' }
| "{" "}" { RBArrayNode new }

This is my hardest thing with OMeta: grammars are, well, grammars. No difference. PEGs are strictly less interesting than GLR (and probably even than LALR...). Object orientation over grammars is interesting, and may bring benefits (I had to work on an extension of C in the past), but why PEGs?

> The only thing that's missing are any semantic predicates and actions so
> the ultimate size and readability will be more dictated by how much
> Smalltalk code it takes to actually do the work with what OMeta has
> parsed.

From my experience, this part vastly dwarfs any parsing-related complexity :( I'm doing things for R at the moment, a lot slower than I expected, because what is behind the parser is very complex.

> I understand that OMeta isn't unique in being an object parser and I
> started this thread mainly because I'm wondering how much value people
> can see in parsing things other than text/binary streams.

I do think that OMeta has some unique benefits... but which?

> For example, I could create a subclass called Lisp and write every method in
> that class as either pure Lisp or as a hyrid of Lisp/Smalltalk/and any
> other DSLs I had created, provided I set up the parsing correctly. I'm
> not aware of any other parser that does it quite so elegantly.

You're probably right on that, and it could bring interesting uses.

> Now here are the downsides: [...] The lack of debugging support will
> drive you nuts until you get used to what it's telling you

I believe some of this is related to DSL support / modeling inside tools (debugging), and isn't specific to OMeta really (IMHO).

> Hope this helps,

Thanks,

Thierry
|
In reply to this post by Hannes Hirzel
Hi,
Le 23/05/2015 11:30, H. Hirzel a écrit :
> Speed is an issue but not the only one. We need an in-depth comparison.

The investment you have to make in fully functional parsers is high, so it tends to make comparisons harder :(

>>> I have been writing compiler front-ends for the past 7 years, first
>>> with Flex / Bison and C, and then with Smalltalk / SmaCC (I maintain
>>> SmaCC for Pharo).
>
> Great. Where do you have the repository?

http://github.com/ThierryGoubier/SmaCC

It's currently SmaCC 2.0.4 as from refactory.com, missing the master/slave distributed rewriting engine I haven't ported, and with the + / * / ? additions to the grammar syntax. It has a strong dependency on RB, and is significantly different from the past 1.0.X version that was available for Squeak/Pharo.

> Currently OMeta 2 is in focus but later I think we need all of the
> parsers ported to Cuis for comparison.

What do you intend to do with them? Writing working parsers for a programming language is a significant investment.

> Yes, the short grammars are a strong point. Compact and readable.

The problem is, as far as my experience goes with grammars, they are neither that short nor that readable ;) Of course, if you compare to a hand-written parser... Even for Smalltalk use, I wouldn't replace the RB parser. Fast, and extremely powerful (refactoring).

>>> So what are the benefits of OMeta? Note that SmaCC would very easily
>>> do parsing over any kind of objects, not only tokens.
>
> Can you elaborate on this?

A SmaCC parser sees only a stream of instances of SmaCCToken, on which it only uses an "id" notion (or multiple ids when in GLR/ambiguity mode). So you could replace tokens with anything else, as long as you use this id thing (the id is a number).

Regards,

Thierry
|
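(To illustrate Thierry's point about SmaCC only needing an "id": a rough sketch of feeding arbitrary objects to such a parser by wrapping each one in a token that carries a numeric id. The token-construction messages, the id constants, and the `MidiPhraseParser` class here are hypothetical, not SmaCC's actual API.)

```smalltalk
"Hypothetical: turn a collection of domain objects into a token
 stream a SmaCC-style parser could consume.  Only the numeric id
 matters to the parser; the wrapped object rides along as the value."
tokens := events collect: [:each |
	SmaCCToken new
		id: (each isNoteOn ifTrue: [1] ifFalse: [2]);
		value: each;
		yourself].
parser := MidiPhraseParser on: tokens readStream.  "hypothetical parser class"
```

The grammar productions would then refer to ids 1 and 2 exactly as they would refer to keyword or identifier token ids in a textual grammar.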
In reply to this post by Phil B
On 22 May 2015 at 20:49, Phil (list) <[hidden email]> wrote:
> Hi Thierry, > > On Fri, 2015-05-22 at 09:56 +0200, Thierry Goubier wrote: >> Hi all, >> >> >> first post here about Cuis, and this is a question I am interested >> in... I do believe the viewpoint institutes documents have a few >> answers about that (parsers for network protocols, etc...). But >> still... >> > > I could see network protocols as another time-series application. > >> >> I'm in a strange position about OMeta which is I don't see the >> benefits. I do have the same position about PetitParser, but with even >> worse data points which is I know precisely the performance loss of >> going the petit parser way. >> > > Not strange at all given where it sounds like you're coming from re: > performance being a key requirement. I won't try to put any spin on it: > everything I've seen indicates that OMeta is among the slowest parsers > out there, but pretty quick given its approach. Computing power what it > is today, for many applications the response is 'it's fast enough' or > 'who cares?' (see the World Wide Web, client- and server-side, for a > perfect example) I would imagine that if you have heavy data processing > workloads or have very specific response time requirements, then you do > care and OMeta wouldn't work for the application. However, as a > language for DSLs, at most you're typically only going to see a small > fraction of a second of overhead. Another way to think of it: if speed > OF the solution is the priority, don't use OMeta. If speed TO the > solution is the priority, that's what OMeta does well. I'll get more > specific below... > >> >> I have been writing compiler front-ends for the past 7 years, first >> with Flex / Bison and C, and then with Smalltalk / SmaCC (I maintain >> SmaCC for Pharo). 
I see the work done by John Brant and Don Roberts >> first hand (RB, SmaCC, generalised refactoring in SmaCC) and I know >> that both OMeta and petit parser are using for me what is a very >> limited form of parsing, with additionally a large performance >> penalty. Moreover, grammars produced in the PetitParser case are as >> long, if not longer than the equivalent SmaCC grammar. >> > > I believe this is one of the areas where OMeta is quite strong: its > grammars are short... very short... 'where did the grammar go?' short. > Consider this example I posted earlier to parse Squeak array > constructors. Here is the Smalltalk version (i.e. what OMeta is > actually doing behind the scenes): > > arrayConstr > ^ self ometaOr: {[true > ifTrue: [self apply: #token withArgs: {'{'}. > self apply: #expr. > self > many: [true > ifTrue: [self > apply: #token withArgs: {'.'}. > self > apply: #expr]]. > self ometaOr: {[self apply: #token > withArgs: {'.'}]. [self apply: > #empty]}. > self apply: #token withArgs: {'}'}]]. > [true > ifTrue: [self apply: #token withArgs: {'{'}. > self apply: #token withArgs: {'}'}]]} > > and here's the OMeta version: > > arrayConstr = > > "{" expr ("." expr)* ("." | empty) "}" > | "{" "}" > > The only thing that's missing are any semantic predicates and actions so > the ultimate size and readability will be more dictated by how much > Smalltalk code it takes to actually do the work with what OMeta has > parsed. > >> >> So what are the benefits of OMeta? Note that SmaCC would very easily >> do parsing over any kind of objects, not only tokens. >> > > I understand that OMeta isn't unique in being an object parser and I > started this thread mainly because I'm wondering how much value people > can see in parsing things other than text/binary streams. i.e. is it a > genuinely useful feature or a gimmick/freebie that won't see much use? 
I finally found it: I once implemented pattern matching with OMeta:
http://www.lshift.net/blog/2011/05/15/algebraic-data-types-and-ometa2/

Given a standard binary tree implementation with Tree whose subclasses
are Node, Leaf or Empty, I could write:

depth =
	{#Empty} -> [0]
	| {#Leaf anything} -> [1]
	| {#Node depth:l depth:r} -> [(l max: r) + 1]

sum =
	{#Empty} -> [0]
	| {#Leaf anything:v} -> [v]
	| {#Node sum:l sum:r} -> [l + r]

which, I think, is pretty neat.

frank

> As to the first part of your question, here goes: The fundamental
> concept that really grabs me is the OMeta approach of being written in
> the host language and using source-to-source translation to target the
> host language, while essentially hijacking the host language and
> environment so as to fade into the background of the host environment.
> Want a new DSL? Subclass OMeta2 and add methods with your rules...
> done. Want a new dialect of said DSL? Subclass your first DSL and tweak
> as needed. Want to write a program in your DSL? Create a new class and
> set up the compiler for that class to use your parser as its
> 'Language'. For example, I could create a subclass called Lisp and
> write every method in that class as either pure Lisp or as a hybrid of
> Lisp/Smalltalk/any other DSLs I had created, provided I set up the
> parsing correctly. I'm not aware of any other parser that does it quite
> so elegantly.
>
> Now here are the downsides: Alex, the original author of OMeta, is a
> parser / languages guy. This work was related to his employment at VPRI
> and subsequent PhD work. He's since moved on to other things and
> there's still a lot missing from OMeta on Smalltalk in terms of tooling
> to actually realize the vision. The lack of debugging support will
> drive you nuts until you get used to what it's telling you: have a
> syntax error in your rules? '<-- parse error around here -->'... have
> fun! A semantic error in your parser? Get used to looking at your
> decompiled code (i.e. the actual Smalltalk it generates) when things go
> wrong to figure it out. Have a logic/runtime error (i.e. your generated
> code is sending a message to nil)? Ditto re: looking at the decompiled
> code when it crashes while running. When everything is correct and
> working, OMeta is pure joy. When it isn't, welcome back to 1980s-style
> debugging. Also, if you have an ambiguous grammar, look elsewhere...
> OMeta won't work for you. Finally, as I mentioned at the top, OMeta
> isn't going to set any new parser speed records.
>
>> Thierry
>
> Hope this helps,
> Phil
Hi Frank,
On 23/05/2015 18:16, Frank Shearar wrote:
> I finally found it: I once implemented pattern matching with OMeta:
> http://www.lshift.net/blog/2011/05/15/algebraic-data-types-and-ometa2/
>
> Given a standard binary tree implementation with Tree whose subclasses
> are Node, Leaf or Empty, I could write:
>
> depth =
> 	{#Empty} -> [0]
> 	| {#Leaf anything} -> [1]
> 	| {#Node depth:l depth:r} -> [(l max: r) + 1]
>
> sum =
> 	{#Empty} -> [0]
> 	| {#Leaf anything:v} -> [v]
> 	| {#Node sum:l sum:r} -> [l + r]
>
> which, I think, is pretty neat.

Yes, it is.

Thanks,

Thierry
In reply to this post by Frank Shearar-3
On Sat, 2015-05-23 at 17:16 +0100, Frank Shearar wrote:
>> I understand that OMeta isn't unique in being an object parser and I
>> started this thread mainly because I'm wondering how much value people
>> can see in parsing things other than text/binary streams. i.e. is it a
>> genuinely useful feature or a gimmick/freebie that won't see much use?
>
> I finally found it: I once implemented pattern matching with OMeta:
> http://www.lshift.net/blog/2011/05/15/algebraic-data-types-and-ometa2/
>
> Given a standard binary tree implementation with Tree whose subclasses
> are Node, Leaf or Empty, I could write:
>
> depth =
> 	{#Empty} -> [0]
> 	| {#Leaf anything} -> [1]
> 	| {#Node depth:l depth:r} -> [(l max: r) + 1]
>
> sum =
> 	{#Empty} -> [0]
> 	| {#Leaf anything:v} -> [v]
> 	| {#Node sum:l sum:r} -> [l + r]
>
> which, I think, is pretty neat.

That is pretty cool. I'll add that to the examples.

> frank

Thanks,
Phil
In reply to this post by Frank Shearar-3
On 23/05/2015 18:16, Frank Shearar wrote:
> I finally found it: I once implemented pattern matching with OMeta:
> http://www.lshift.net/blog/2011/05/15/algebraic-data-types-and-ometa2/
>
> Given a standard binary tree implementation with Tree whose subclasses
> are Node, Leaf or Empty, I could write:
>
> depth =
> 	{#Empty} -> [0]
> 	| {#Leaf anything} -> [1]
> 	| {#Node depth:l depth:r} -> [(l max: r) + 1]
>
> sum =
> 	{#Empty} -> [0]
> 	| {#Leaf anything:v} -> [v]
> 	| {#Node sum:l sum:r} -> [l + r]
>
> which, I think, is pretty neat.

After reading through, the SmaCC syntax would be:

depth:
	<Empty> { 0 }
	| <Leaf> <Any> { 1 }
	| <Node> depth 'l' depth 'r' { (l max: r) + 1 }
	;

sum:
	<Empty> { 0 }
	| <Leaf> <Number> 'v' { v }
	| <Node> sum 'l' sum 'r' { l + r }
	;

which is a very clever way of using a parser :) And a nice benefit of
this is that it would validate that the input is a binary tree.

Thanks Frank for that example.

Thierry
In reply to this post by Frank Shearar-3
On 23/05/2015 18:16, Frank Shearar wrote:
> On 22 May 2015 at 20:49, Phil (list) <[hidden email]> wrote:
>> [snip]
> I finally found it: I once implemented pattern matching with OMeta:
> http://www.lshift.net/blog/2011/05/15/algebraic-data-types-and-ometa2/
>
> Given a standard binary tree implementation with Tree whose subclasses
> are Node, Leaf or Empty, I could write:
>
> depth =
> 	{#Empty} -> [0]
> 	| {#Leaf anything} -> [1]
> 	| {#Node depth:l depth:r} -> [(l max: r) + 1]
>
> sum =
> 	{#Empty} -> [0]
> 	| {#Leaf anything:v} -> [v]
> 	| {#Node sum:l sum:r} -> [l + r]
>
> which, I think, is pretty neat.

Thanks Frank, I got it to work with SmaCC as well. Very clever use of a
parser.

Thierry

> frank
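Since the thread's core question is what parsing over arbitrary objects (rather than characters) buys you, here is a deliberately tiny ordered-choice matcher sketched in Python. It is written from scratch for illustration and is not OMeta's or SmaCC's API; all names are invented. The point is only that the usual combinators (semantic predicate, sequence, choice, repetition) work unchanged when the "tokens" are arbitrary objects in a collection or stream:

```python
# A minimal matcher over a list of arbitrary objects.
# Each combinator takes (objs, pos) and returns (value, new_pos), or None on failure.

def pred(test):
    """Match one object satisfying a predicate (an OMeta-style semantic predicate)."""
    def parse(objs, pos):
        if pos < len(objs) and test(objs[pos]):
            return objs[pos], pos + 1
        return None
    return parse

def seq(*parsers):
    """Match each parser in order; fail the whole sequence if any part fails."""
    def parse(objs, pos):
        values = []
        for p in parsers:
            result = p(objs, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def alt(*parsers):
    """Ordered choice: the first alternative that matches wins."""
    def parse(objs, pos):
        for p in parsers:
            result = p(objs, pos)
            if result is not None:
                return result
        return None
    return parse

def many(parser):
    """Match zero or more repetitions, collecting the values."""
    def parse(objs, pos):
        values = []
        while True:
            result = parser(objs, pos)
            if result is None:
                return values, pos
            value, pos = result
            values.append(value)
    return parse

# Example: in a heterogeneous stream, match a number followed by one or more strings.
number = pred(lambda o: isinstance(o, (int, float)))
string = pred(lambda o: isinstance(o, str))
record = seq(number, many(string))

stream = [42, "answer", "deep thought", 3.14, "pi"]
print(record(stream, 0))   # ([42, ['answer', 'deep thought']], 3)
```

Because `pred` can run any test at all (isinstance checks, attribute probes, even nested matchers), the same machinery applies to MIDI events, log records, or whatever other object stream the thread has in mind.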