Smalltalk › Frameworks & Tools › Moose

PetitParser performance

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

24 messages Options

Usman Bhatti

PetitParser performance

Hi,

Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.

I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).

Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.

regards.

Usman

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Tudor Girba-2

Re: PetitParser performance

Hi,

Hmm. I thought this was fixed and that you said at the end that the performance penalty is no longer a factor of 2.

Do you have a simple example for checking the simple case of 2x performance loss?

Doru

On Fri, Jan 30, 2015 at 12:28 PM, Usman Bhatti <[hidden email]> wrote:

Hi,

Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.

I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).

Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.

regards.

Usman

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

www.tudorgirba.com

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Usman Bhatti

Re: PetitParser performance

Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).

|rule wrongBranches|

[wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].

rule := PPDelegateParser new.

rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).

rule parse: (String streamContents: [ :s |

30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun

Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.

Data values and graph generated by the script in the attached file.

regards.

usman

On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba <[hidden email]> wrote:

Hi,

Hmm. I thought this was fixed and that you said at the end that the performance penalty is no longer a factor of 2.

Do you have a simple example for checking the simple case of 2x performance loss?

Doru

On Fri, Jan 30, 2015 at 12:28 PM, Usman Bhatti <[hidden email]> wrote:
Hi,

Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.

I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).

Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.

regards.

Usman

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

--
www.tudorgirba.com

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

script-graph.txt (1K) Download Attachment

Sven Van Caekenberghe-2

Re: PetitParser performance

Usman, this is a really cool bug report !

> On 30 Jan 2015, at 16:36, Usman Bhatti <[hidden email]> wrote:
>
> Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).
>
> |rule wrongBranches|
> [wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].
> rule := PPDelegateParser new.
> rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).
> rule parse: (String streamContents: [ :s |
> 30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun
>
>
> Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.
>
> <graph.png>
>
> Data values and graph generated by the script in the attached file.
>
> regards.
>
> usman
>
> On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba <[hidden email]> wrote:
> Hi,
>
> Hmm. I thought this was fixed and that you said at the end that the performance penalty is no longer a factor of 2.
>
> Do you have a simple example for checking the simple case of 2x performance loss?
>
> Doru
>
>
> On Fri, Jan 30, 2015 at 12:28 PM, Usman Bhatti <[hidden email]> wrote:
> Hi,
>
> Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.
>
> I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).
>
> Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.
>
> regards.
>
> Usman
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> --
> www.tudorgirba.com
>
> "Every thing has its own flow"
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
> <script-graph.txt>_______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Usman Bhatti

Re: PetitParser performance

On Fri, Jan 30, 2015 at 9:36 PM, Sven Van Caekenberghe <[hidden email]> wrote:

Usman, this is a really cool bug report !

Thanks :)

> On 30 Jan 2015, at 16:36, Usman Bhatti <[hidden email]> wrote:
>
> Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).
>
> |rule wrongBranches|
> [wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].
> rule := PPDelegateParser new.
> rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).
> rule parse: (String streamContents: [ :s |
> 30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun
>
>
> Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.
>
> <graph.png>

>
> Data values and graph generated by the script in the attached file.
>
> regards.
>
> usman
>
> On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba <[hidden email]> wrote:
> Hi,
>
> Hmm. I thought this was fixed and that you said at the end that the performance penalty is no longer a factor of 2.
>
> Do you have a simple example for checking the simple case of 2x performance loss?
>
> Doru
>
>
> On Fri, Jan 30, 2015 at 12:28 PM, Usman Bhatti <[hidden email]> wrote:
> Hi,
>
> Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.
>
> I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).
>
> Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.
>
> regards.
>
> Usman
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> --
> www.tudorgirba.com
>
> "Every thing has its own flow"
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>

> <script-graph.txt>_______________________________________________

> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Jan Kurš

Re: PetitParser performance

Hey all, thanks for the report, I will have a look at it asap.

Cheers Jan

On 1 Feb 2015 20:55, "Usman Bhatti" <[hidden email]> wrote:

On Fri, Jan 30, 2015 at 9:36 PM, Sven Van Caekenberghe <[hidden email]> wrote:
Usman, this is a really cool bug report !

Thanks :)

> On 30 Jan 2015, at 16:36, Usman Bhatti <[hidden email]> wrote:
>
> Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).
>
> |rule wrongBranches|
> [wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].
> rule := PPDelegateParser new.
> rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).
> rule parse: (String streamContents: [ :s |
> 30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun
>
>
> Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.
>
> <graph.png>

>
> Data values and graph generated by the script in the attached file.
>
> regards.
>
> usman
>
> On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba <[hidden email]> wrote:
> Hi,
>
> Hmm. I thought this was fixed and that you said at the end that the performance penalty is no longer a factor of 2.
>
> Do you have a simple example for checking the simple case of 2x performance loss?
>
> Doru
>
>
> On Fri, Jan 30, 2015 at 12:28 PM, Usman Bhatti <[hidden email]> wrote:
> Hi,
>
> Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.
>
> I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).
>
> Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.
>
> regards.
>
> Usman
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> --
> www.tudorgirba.com
>
> "Every thing has its own flow"
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>

> <script-graph.txt>_______________________________________________

> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Jan Kurš

Re: PetitParser performance

Hey all, Please note that a new version in of PetitParser (JanKurs-271) should improve performance roughly comparable to the original performance. On my computer, I got improvement from 3900ms to 1300ms.

Cheers,

Jan

On 1 February 2015 at 22:55, Jan Kurš <[hidden email]> wrote:

Hey all, thanks for the report, I will have a look at it asap.

Cheers Jan

On 1 Feb 2015 20:55, "Usman Bhatti" <[hidden email]> wrote:

On Fri, Jan 30, 2015 at 9:36 PM, Sven Van Caekenberghe <[hidden email]> wrote:
Usman, this is a really cool bug report !

Thanks :)

> On 30 Jan 2015, at 16:36, Usman Bhatti <[hidden email]> wrote:
>
> Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).
>
> |rule wrongBranches|
> [wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].
> rule := PPDelegateParser new.
> rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).
> rule parse: (String streamContents: [ :s |
> 30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun
>
>
> Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.
>
> <graph.png>

>
> Data values and graph generated by the script in the attached file.
>
> regards.
>
> usman
>
> On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba <[hidden email]> wrote:
> Hi,
>
> Hmm. I thought this was fixed and that you said at the end that the performance penalty is no longer a factor of 2.
>
> Do you have a simple example for checking the simple case of 2x performance loss?
>
> Doru
>
>
> On Fri, Jan 30, 2015 at 12:28 PM, Usman Bhatti <[hidden email]> wrote:
> Hi,
>
> Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.
>
> I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).
>
> Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.
>
> regards.
>
> Usman
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> --
> www.tudorgirba.com
>
> "Every thing has its own flow"
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>

> <script-graph.txt>_______________________________________________

> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Tudor Girba-2

Re: PetitParser performance

Great news.

@Usman: could you check?

Doru

On Fri, Feb 20, 2015 at 9:04 AM, Jan Kurš <[hidden email]> wrote:

Hey all, Please note that a new version in of PetitParser (JanKurs-271) should improve performance roughly comparable to the original performance. On my computer, I got improvement from 3900ms to 1300ms.

Cheers,
Jan

On 1 February 2015 at 22:55, Jan Kurš <[hidden email]> wrote:
Hey all, thanks for the report, I will have a look at it asap.

Cheers Jan

On 1 Feb 2015 20:55, "Usman Bhatti" <[hidden email]> wrote:

On Fri, Jan 30, 2015 at 9:36 PM, Sven Van Caekenberghe <[hidden email]> wrote:
Usman, this is a really cool bug report !

Thanks :)

> On 30 Jan 2015, at 16:36, Usman Bhatti <[hidden email]> wrote:
>
> Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).
>
> |rule wrongBranches|
> [wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].
> rule := PPDelegateParser new.
> rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).
> rule parse: (String streamContents: [ :s |
> 30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun
>
>
> Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.
>
> <graph.png>

>
> Data values and graph generated by the script in the attached file.
>
> regards.
>
> usman
>
> On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba <[hidden email]> wrote:
> Hi,
>
> Hmm. I thought this was fixed and that you said at the end that the performance penalty is no longer a factor of 2.
>
> Do you have a simple example for checking the simple case of 2x performance loss?
>
> Doru
>
>
> On Fri, Jan 30, 2015 at 12:28 PM, Usman Bhatti <[hidden email]> wrote:
> Hi,
>
> Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.
>
> I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).
>
> Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.
>
> regards.
>
> Usman
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> --
> www.tudorgirba.com
>
> "Every thing has its own flow"
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>

> <script-graph.txt>_______________________________________________

> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

www.tudorgirba.com

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Usman Bhatti

Re: PetitParser performance

I executed the script that I shared earlier in this thread and I see a considerable performance improvement in the latest version of PP. The attached chart shows the comparison chart between PP v 1.51 (green) with the newer version (red). For shorter strings (i.e. to the left on x-axis), the execution time is almost comparable to the previous version of PetitParser that dates before changes introduced by Jan.

However, when I take the new version and parse the code source of larger code bases, the execution time still get a hit. For example, parsing 100K lines takes almost twice longer. The increase is exponential with even larger ones. That is consistent with results we see in the graph (the larger the chain to be parsed i.e. going from left to right in the x-axis, the higher the performance penalty). Meaning I still prefer the older version.

What will be good is to prepare some benches with Java Parser in PP, as it is open source, and a set of programs (e.g. ArguUML) that can serve as a standard to test execution time for PP that we can run as a jenkins service.

regards.

On Fri, Feb 20, 2015 at 9:09 AM, Tudor Girba <[hidden email]> wrote:

Great news.

@Usman: could you check?

Doru

On Fri, Feb 20, 2015 at 9:04 AM, Jan Kurš <[hidden email]> wrote:
Hey all, Please note that a new version in of PetitParser (JanKurs-271) should improve performance roughly comparable to the original performance. On my computer, I got improvement from 3900ms to 1300ms.

Cheers,
Jan

On 1 February 2015 at 22:55, Jan Kurš <[hidden email]> wrote:
Hey all, thanks for the report, I will have a look at it asap.

Cheers Jan

On 1 Feb 2015 20:55, "Usman Bhatti" <[hidden email]> wrote:

On Fri, Jan 30, 2015 at 9:36 PM, Sven Van Caekenberghe <[hidden email]> wrote:
Usman, this is a really cool bug report !

Thanks :)

> On 30 Jan 2015, at 16:36, Usman Bhatti <[hidden email]> wrote:
>
> Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).
>
> |rule wrongBranches|
> [wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].
> rule := PPDelegateParser new.
> rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).
> rule parse: (String streamContents: [ :s |
> 30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun
>
>
> Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.
>
> <graph.png>

>
> Data values and graph generated by the script in the attached file.
>
> regards.
>
> usman
>
> On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba <[hidden email]> wrote:
> Hi,
>
> Hmm. I thought this was fixed and that you said at the end that the performance penalty is no longer a factor of 2.
>
> Do you have a simple example for checking the simple case of 2x performance loss?
>
> Doru
>
>
> On Fri, Jan 30, 2015 at 12:28 PM, Usman Bhatti <[hidden email]> wrote:
> Hi,
>
> Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.
>
> I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).
>
> Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.
>
> regards.
>
> Usman
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> --
> www.tudorgirba.com
>
> "Every thing has its own flow"
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>

> <script-graph.txt>_______________________________________________

> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

--
www.tudorgirba.com

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Jan Kurš

Re: PetitParser performance

Hey,

I was not aware about some exponential element in the overall complexity, it is strange and I will investigate it. Thank you Usman, for pointing this out.

In general, it is hard to add features into the PetitParser and preserve performance. Therefore, we work on a tool, that will take PetitParser, analyze it and generates faster parser. The idea is to use PetitParser while developing the grammar and than generate a fast parser for "real" use. We hope we will be present the tool during the next ESUG.

Cheers,

Jan

On 20 February 2015 at 17:47, Usman Bhatti <[hidden email]> wrote:

I executed the script that I shared earlier in this thread and I see a considerable performance improvement in the latest version of PP. The attached chart shows the comparison chart between PP v 1.51 (green) with the newer version (red). For shorter strings (i.e. to the left on x-axis), the execution time is almost comparable to the previous version of PetitParser that dates before changes introduced by Jan.

However, when I take the new version and parse the code source of larger code bases, the execution time still get a hit. For example, parsing 100K lines takes almost twice longer. The increase is exponential with even larger ones. That is consistent with results we see in the graph (the larger the chain to be parsed i.e. going from left to right in the x-axis, the higher the performance penalty). Meaning I still prefer the older version.

What will be good is to prepare some benches with Java Parser in PP, as it is open source, and a set of programs (e.g. ArguUML) that can serve as a standard to test execution time for PP that we can run as a jenkins service.

regards.

On Fri, Feb 20, 2015 at 9:09 AM, Tudor Girba <[hidden email]> wrote:
Great news.

@Usman: could you check?

Doru

On Fri, Feb 20, 2015 at 9:04 AM, Jan Kurš <[hidden email]> wrote:
Hey all, Please note that a new version in of PetitParser (JanKurs-271) should improve performance roughly comparable to the original performance. On my computer, I got improvement from 3900ms to 1300ms.

Cheers,
Jan

On 1 February 2015 at 22:55, Jan Kurš <[hidden email]> wrote:
Hey all, thanks for the report, I will have a look at it asap.

Cheers Jan

On 1 Feb 2015 20:55, "Usman Bhatti" <[hidden email]> wrote:

On Fri, Jan 30, 2015 at 9:36 PM, Sven Van Caekenberghe <[hidden email]> wrote:
Usman, this is a really cool bug report !

Thanks :)

> On 30 Jan 2015, at 16:36, Usman Bhatti <[hidden email]> wrote:
>
> Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).
>
> |rule wrongBranches|
> [wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].
> rule := PPDelegateParser new.
> rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).
> rule parse: (String streamContents: [ :s |
> 30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun
>
>
> Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.
>
> <graph.png>

>
> Data values and graph generated by the script in the attached file.
>
> regards.
>
> usman
>
> On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba <[hidden email]> wrote:
> Hi,
>
> Hmm. I thought this was fixed and that you said at the end that the performance penalty is no longer a factor of 2.
>
> Do you have a simple example for checking the simple case of 2x performance loss?
>
> Doru
>
>
> On Fri, Jan 30, 2015 at 12:28 PM, Usman Bhatti <[hidden email]> wrote:
> Hi,
>
> Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.
>
> I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).
>
> Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.
>
> regards.
>
> Usman
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> --
> www.tudorgirba.com
>
> "Every thing has its own flow"
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>

> <script-graph.txt>_______________________________________________

> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

--
www.tudorgirba.com

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Thierry Goubier

Re: PetitParser performance

Le 21/02/2015 11:14, Jan Kurš a écrit :

> Hey,
>
> I was not aware about some exponential element in the overall
> complexity, it is strange and I will investigate it. Thank you Usman,
> for pointing this out.
>
> In general, it is hard to add features into the PetitParser and preserve
> performance. Therefore, we work on a tool, that will take PetitParser,
> analyze it and generates faster parser. The idea is to use PetitParser
> while developing the grammar and than generate a fast parser for "real"
> use. We hope we will be present the tool during the next ESUG.

I certainly be interested with what you come up with. From profiling
PetitParser, it seems there is at least one low-hanging fruit in term of
performance.

And thanks too Usman, for pointing out long source code files may (or
do) show diverging asymptotic behaviors. And real life use for most
programming languages are very long files :(

Thierry

>
> Cheers,
> Jan
>
> On 20 February 2015 at 17:47, Usman Bhatti <[hidden email]
> <mailto:[hidden email]>> wrote:
>
> I executed the script that I shared earlier in this thread and I see
> a considerable performance improvement in the latest version of
> PP. The attached chart shows the comparison chart between PP v 1.51
> (green) with the newer version (red). For shorter strings (i.e. to
> the left on x-axis), the execution time is almost comparable to the
> previous version of PetitParser that dates before changes introduced
> by Jan.
>
> However, when I take the new version and parse the code source of
> larger code bases, the execution time still get a hit. For example,
> parsing 100K lines takes almost twice longer. The increase is
> exponential with even larger ones. That is consistent with results
> we see in the graph (the larger the chain to be parsed i.e. going
> from left to right in the x-axis, the higher the performance
> penalty). Meaning I still prefer the older version.
>
> What will be good is to prepare some benches with Java Parser in PP,
> as it is open source, and a set of programs (e.g. ArguUML) that can
> serve as a standard to test execution time for PP that we can run as
> a jenkins service.
>
>
> Inline image 1
>
> regards.
>
> On Fri, Feb 20, 2015 at 9:09 AM, Tudor Girba <[hidden email]
> <mailto:[hidden email]>> wrote:
>
> Great news.
>
> @Usman: could you check?
>
> Doru
>
> On Fri, Feb 20, 2015 at 9:04 AM, Jan Kurš <[hidden email]
> <mailto:[hidden email]>> wrote:
>
> Hey all, Please note that a new version in of PetitParser
> (JanKurs-271) should improve performance roughly comparable
> to the original performance. On my computer, I got
> improvement from 3900ms to 1300ms.
>
>
> Cheers,
> Jan
>
> On 1 February 2015 at 22:55, Jan Kurš <[hidden email]
> <mailto:[hidden email]>> wrote:
>
> Hey all, thanks for the report, I will have a look at it
> asap.
>
> Cheers Jan
>
> On 1 Feb 2015 20:55, "Usman Bhatti"
> <[hidden email] <mailto:[hidden email]>>
> wrote:
>
>
>
> On Fri, Jan 30, 2015 at 9:36 PM, Sven Van
> Caekenberghe <[hidden email] <mailto:[hidden email]>> wrote:
>
> Usman, this is a really cool bug report !
>
>
> Thanks :)
>
>
> > On 30 Jan 2015, at 16:36, Usman Bhatti <[hidden email]
> <mailto:[hidden email]>> wrote:
> >
> > Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).
> >
> > |rule wrongBranches|
> > [wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].
> > rule := PPDelegateParser new.
> > rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).
> > rule parse: (String streamContents: [ :s |
> > 30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun
> >
> >
> > Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.
> >
> > <graph.png>
> >
> > Data values and graph generated by the script
> in the attached file.
> >
> > regards.
> >
> > usman
> >
> > On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba
> <[hidden email]
> <mailto:[hidden email]>> wrote:
> > Hi,
> >
> > Hmm. I thought this was fixed and that you
> said at the end that the performance penalty is
> no longer a factor of 2.
> >
> > Do you have a simple example for checking the
> simple case of 2x performance loss?
> >
> > Doru
> >
> >
> > On Fri, Jan 30, 2015 at 12:28 PM, Usman
> Bhatti <[hidden email]
> <mailto:[hidden email]>> wrote:
> > Hi,
> >
> > Since the work on the integration of
> PPContext in PetitParser, there is significant
> performance degradation. I have already
> mentioned that on simple grammar the factor is
> about 2. But on complex grammar (for example,
> our proprietary parser for 4D language ), we
> have seen that degradation is goes well beyond
> this factor. So, for example, for 800K lines
> that we parse in under 10 minutes without
> PPContext work, with the latest version it goes
> beyond 2h.
> >
> > I have not been able to reproduce my case on
> simple grammars. May be we can have some
> benchmarks using open source parsers and large
> code bases (e.g. PP Java Parser on JHotDraw or
> ArgoUML).
> >
> > Currently, I circumvent this issue by using
> PP v1.51 but I can provide relevant feedback and
> run benches on any improvements.
> >
> > regards.
> >
> > Usman
> >
> > _______________________________________________
> > Moose-dev mailing list
> > [hidden email]
> <mailto:[hidden email]>
> >
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
> >
> >
> >
> >
> > --
> > www.tudorgirba.com <http://www.tudorgirba.com>
> >
> > "Every thing has its own flow"
> >
> > _______________________________________________
> > Moose-dev mailing list
> > [hidden email]
> <mailto:[hidden email]>
> >
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
> >
> >
> >
> <script-graph.txt>_______________________________________________
> > Moose-dev mailing list
> > [hidden email]
> <mailto:[hidden email]>
> >
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> <mailto:[hidden email]>
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email] <mailto:[hidden email]>
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email] <mailto:[hidden email]>
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> --
> www.tudorgirba.com <http://www.tudorgirba.com>
>
> "Every thing has its own flow"
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email] <mailto:[hidden email]>
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email] <mailto:[hidden email]>
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Tudor Girba-2

Re: PetitParser performance

Hi,

On Sat, Feb 21, 2015 at 8:07 PM, Thierry Goubier <[hidden email]> wrote:

Le 21/02/2015 11:14, Jan Kurš a écrit :

Hey,

I was not aware about some exponential element in the overall
complexity, it is strange and I will investigate it. Thank you Usman,
for pointing this out.

In general, it is hard to add features into the PetitParser and preserve
performance. Therefore, we work on a tool, that will take PetitParser,
analyze it and generates faster parser. The idea is to use PetitParser
while developing the grammar and than generate a fast parser for "real"
use. We hope we will be present the tool during the next ESUG.

I certainly be interested with what you come up with. From profiling PetitParser, it seems there is at least one low-hanging fruit in term of performance.

What is that log hanging fruit?

Doru

And thanks too Usman, for pointing out long source code files may (or do) show diverging asymptotic behaviors. And real life use for most programming languages are very long files :(

Thierry

Cheers,
Jan

On 20 February 2015 at 17:47, Usman Bhatti <[hidden email]
<mailto:[hidden email]>> wrote:

I executed the script that I shared earlier in this thread and I see
a considerable performance improvement in the latest version of
PP. The attached chart shows the comparison chart between PP v 1.51
(green) with the newer version (red). For shorter strings (i.e. to
the left on x-axis), the execution time is almost comparable to the
previous version of PetitParser that dates before changes introduced
by Jan.

However, when I take the new version and parse the code source of
larger code bases, the execution time still get a hit. For example,
parsing 100K lines takes almost twice longer. The increase is
exponential with even larger ones. That is consistent with results
we see in the graph (the larger the chain to be parsed i.e. going
from left to right in the x-axis, the higher the performance
penalty). Meaning I still prefer the older version.

What will be good is to prepare some benches with Java Parser in PP,
as it is open source, and a set of programs (e.g. ArguUML) that can
serve as a standard to test execution time for PP that we can run as
a jenkins service.

Inline image 1

regards.

On Fri, Feb 20, 2015 at 9:09 AM, Tudor Girba <[hidden email]
<mailto:[hidden email]>> wrote:

Great news.

@Usman: could you check?

Doru

On Fri, Feb 20, 2015 at 9:04 AM, Jan Kurš <[hidden email]
<mailto:[hidden email]>> wrote:

Hey all, Please note that a new version in of PetitParser
(JanKurs-271) should improve performance roughly comparable
to the original performance. On my computer, I got
improvement from 3900ms to 1300ms.

Cheers,
Jan

On 1 February 2015 at 22:55, Jan Kurš <[hidden email]
<mailto:[hidden email]>> wrote:

Hey all, thanks for the report, I will have a look at it
asap.

Cheers Jan

On 1 Feb 2015 20:55, "Usman Bhatti"
<[hidden email] <mailto:[hidden email]>>
wrote:

On Fri, Jan 30, 2015 at 9:36 PM, Sven Van
Caekenberghe <[hidden email] <mailto:[hidden email]>> wrote:

Usman, this is a really cool bug report !

Thanks :)

> On 30 Jan 2015, at 16:36, Usman Bhatti <[hidden email]
<mailto:[hidden email]>> wrote:
>
> Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).
>
> |rule wrongBranches|
> [wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].
> rule := PPDelegateParser new.
> rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).
> rule parse: (String streamContents: [ :s |
> 30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun
>
>
> Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.
>
> <graph.png>
>
> Data values and graph generated by the script
in the attached file.
>
> regards.
>
> usman
>
> On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba
<[hidden email]
<mailto:[hidden email]>> wrote:
> Hi,
>
> Hmm. I thought this was fixed and that you
said at the end that the performance penalty is
no longer a factor of 2.
>
> Do you have a simple example for checking the
simple case of 2x performance loss?
>
> Doru
>
>
> On Fri, Jan 30, 2015 at 12:28 PM, Usman
Bhatti <[hidden email]
<mailto:[hidden email]>> wrote:
> Hi,
>
> Since the work on the integration of
PPContext in PetitParser, there is significant
performance degradation. I have already
mentioned that on simple grammar the factor is
about 2. But on complex grammar (for example,
our proprietary parser for 4D language ), we
have seen that degradation is goes well beyond
this factor. So, for example, for 800K lines
that we parse in under 10 minutes without
PPContext work, with the latest version it goes
beyond 2h.
>
> I have not been able to reproduce my case on
simple grammars. May be we can have some
benchmarks using open source parsers and large
code bases (e.g. PP Java Parser on JHotDraw or
ArgoUML).
>
> Currently, I circumvent this issue by using
PP v1.51 but I can provide relevant feedback and
run benches on any improvements.
>
> regards.
>
> Usman
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
<mailto:[hidden email]>
>
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> --
> www.tudorgirba.com <http://www.tudorgirba.com>
>
> "Every thing has its own flow"
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
<mailto:[hidden email]>
>
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
<script-graph.txt>_______________________________________________
> Moose-dev mailing list
> [hidden email]
<mailto:[hidden email]>
>
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
<mailto:[hidden email]>
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email] <mailto:[hidden email]>
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email] <mailto:[hidden email]>
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

--
www.tudorgirba.com <http://www.tudorgirba.com>

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email] <mailto:[hidden email]>
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email] <mailto:[hidden email]>
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

www.tudorgirba.com

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

stepharo

Re: PetitParser performance

In reply to this post by Jan Kurš

Hey,

I was not aware about some exponential element in the overall complexity, it is strange and I will investigate it. Thank you Usman, for pointing this out.

In general, it is hard to add features into the PetitParser and preserve performance.

I imagine.
May be one idea would be also to have different flavors of PP.

Therefore, we work on a tool, that will take PetitParser, analyze it and generates faster parser. The idea is to use PetitParser while developing the grammar and than generate a fast parser for "real" use. We hope we will be present the tool during the next ESUG.

That could be gorgeous.
If Synectique takes off for real (which I really hope), I'm sure that Synectique will think hard about you as a synectiquer :).

Stef

Cheers,

Jan

On 20 February 2015 at 17:47, Usman Bhatti <[hidden email]> wrote:

I executed the script that I shared earlier in this thread and I see a considerable performance improvement in the latest version of PP. The attached chart shows the comparison chart between PP v 1.51 (green) with the newer version (red). For shorter strings (i.e. to the left on x-axis), the execution time is almost comparable to the previous version of PetitParser that dates before changes introduced by Jan.

However, when I take the new version and parse the code source of larger code bases, the execution time still get a hit. For example, parsing 100K lines takes almost twice longer. The increase is exponential with even larger ones. That is consistent with results we see in the graph (the larger the chain to be parsed i.e. going from left to right in the x-axis, the higher the performance penalty). Meaning I still prefer the older version.

What will be good is to prepare some benches with Java Parser in PP, as it is open source, and a set of programs (e.g. ArguUML) that can serve as a standard to test execution time for PP that we can run as a jenkins service.

regards.

On Fri, Feb 20, 2015 at 9:09 AM, Tudor Girba <[hidden email]> wrote:

Great news.

@Usman: could you check?

Doru

On Fri, Feb 20, 2015 at 9:04 AM, Jan Kurš <[hidden email]> wrote:

Hey all, Please note that a new version in of PetitParser (JanKurs-271) should improve performance roughly comparable to the original performance. On my computer, I got improvement from 3900ms to 1300ms.

Cheers,

Jan

On 1 February 2015 at 22:55, Jan Kurš <[hidden email]> wrote:

Hey all, thanks for the report, I will have a look at it asap.

Cheers Jan

On 1 Feb 2015 20:55, "Usman Bhatti" <[hidden email]> wrote:

On Fri, Jan 30, 2015 at 9:36 PM, Sven Van Caekenberghe <[hidden email]> wrote:

Usman, this is a really cool bug report !

Thanks :)

> On 30 Jan 2015, at 16:36, Usman Bhatti <[hidden email]> wrote:
>
> Here is a script that shows the factor of 2 in time taken for parsing. The idea of the script is to introduce several wrong branches before hitting the correct one (originally proposed by Guillaume).
>
> |rule wrongBranches|
> [wrongBranches := (25 to: 45) inject: 24 asCharacter asParser into: [:acc :each | acc / each asCharacter asParser].
> rule := PPDelegateParser new.
> rule setParser: $a asParser / ((wrongBranches / $. asParser), rule).
> rule parse: (String streamContents: [ :s |
> 30000 timesRepeat: [s nextPut: $.]. s nextPut: $a ])] timeToRun
>
>
> Below are the graphs that were generated when varying the string length provided to the above parser description. Green part shows the time taken by the older version and the red part with the latest PP.
>
> <graph.png>

>
> Data values and graph generated by the script in the attached file.
>
> regards.
>
> usman
>
> On Fri, Jan 30, 2015 at 1:30 PM, Tudor Girba <[hidden email]> wrote:
> Hi,
>
> Hmm. I thought this was fixed and that you said at the end that the performance penalty is no longer a factor of 2.
>
> Do you have a simple example for checking the simple case of 2x performance loss?
>
> Doru
>
>
> On Fri, Jan 30, 2015 at 12:28 PM, Usman Bhatti <[hidden email]> wrote:
> Hi,
>
> Since the work on the integration of PPContext in PetitParser, there is significant performance degradation. I have already mentioned that on simple grammar the factor is about 2. But on complex grammar (for example, our proprietary parser for 4D language ), we have seen that degradation is goes well beyond this factor. So, for example, for 800K lines that we parse in under 10 minutes without PPContext work, with the latest version it goes beyond 2h.
>
> I have not been able to reproduce my case on simple grammars. May be we can have some benchmarks using open source parsers and large code bases (e.g. PP Java Parser on JHotDraw or ArgoUML).
>
> Currently, I circumvent this issue by using PP v1.51 but I can provide relevant feedback and run benches on any improvements.
>
> regards.
>
> Usman
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
>
>
> --
> www.tudorgirba.com
>
> "Every thing has its own flow"
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>

> <script-graph.txt>_______________________________________________

> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

--

www.tudorgirba.com

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

stepharo

Re: PetitParser performance

In reply to this post by Thierry Goubier

>
> And thanks too Usman, for pointing out long source code files may (or
> do) show diverging asymptotic behaviors. And real life use for most
> programming languages are very long files :(
>
Like one method with 15 pages ? :)

Stef
_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Thierry Goubier

Re: PetitParser performance

In reply to this post by Tudor Girba-2

Hi Doru,

the low hanging fruit is the time needed to create a PP parser (time for
new).

I profiled PPSmalltalkParser over a bench used by Lucas Renggli a few
years ago, and PPSmalltalkParser spends half the time in "new" (and it
is a lot slower than it used to be).

Thierry

Le 21/02/2015 21:13, Tudor Girba a écrit :

> Hi,
>
> On Sat, Feb 21, 2015 at 8:07 PM, Thierry Goubier
> <[hidden email] <mailto:[hidden email]>> wrote:
>
> Le 21/02/2015 11:14, Jan Kurš a écrit :
>
> Hey,
>
> I was not aware about some exponential element in the overall
> complexity, it is strange and I will investigate it. Thank you
> Usman,
> for pointing this out.
>
> In general, it is hard to add features into the PetitParser and
> preserve
> performance. Therefore, we work on a tool, that will take
> PetitParser,
> analyze it and generates faster parser. The idea is to use
> PetitParser
> while developing the grammar and than generate a fast parser for
> "real"
> use. We hope we will be present the tool during the next ESUG.
>
>
> I certainly be interested with what you come up with. From profiling
> PetitParser, it seems there is at least one low-hanging fruit in
> term of performance.
>
>
> What is that log hanging fruit?
>
> Doru

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Thierry Goubier

Re: PetitParser performance

In reply to this post by stepharo

Le 21/02/2015 22:04, stepharo a écrit :
>
>>
>> And thanks too Usman, for pointing out long source code files may (or
>> do) show diverging asymptotic behaviors. And real life use for most
>> programming languages are very long files :(
>>
> Like one method with 15 pages ? :)

Only 15 pages? I believe SmaCC can generate longuer methods than that :)

Thierry
_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Tudor Girba-2

Re: PetitParser performance

In reply to this post by Thierry Goubier

Hi,

On Sat, Feb 21, 2015 at 11:03 PM, Thierry Goubier <[hidden email]> wrote:

Hi Doru,

the low hanging fruit is the time needed to create a PP parser (time for new).

I still do not see the low hanging fruit. Do you have a concrete idea of how to make it faster?

I profiled PPSmalltalkParser over a bench used by Lucas Renggli a few years ago, and PPSmalltalkParser spends half the time in "new" (and it is a lot slower than it used to be).

What does "a lot slower" mean?

Doru

Thierry

Le 21/02/2015 21:13, Tudor Girba a écrit :

Hi,

On Sat, Feb 21, 2015 at 8:07 PM, Thierry Goubier
<[hidden email] <mailto:[hidden email]>> wrote:

Le 21/02/2015 11:14, Jan Kurš a écrit :

Hey,

I was not aware about some exponential element in the overall
complexity, it is strange and I will investigate it. Thank you
Usman,
for pointing this out.

In general, it is hard to add features into the PetitParser and
preserve
performance. Therefore, we work on a tool, that will take
PetitParser,
analyze it and generates faster parser. The idea is to use
PetitParser
while developing the grammar and than generate a fast parser for
"real"
use. We hope we will be present the tool during the next ESUG.

I certainly be interested with what you come up with. From profiling
PetitParser, it seems there is at least one low-hanging fruit in
term of performance.

What is that log hanging fruit?

Doru

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

www.tudorgirba.com

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

jfabry

Re: PetitParser performance

In reply to this post by stepharo

Therefore, we work on a tool, that will take PetitParser, analyze it and generates faster parser. The idea is to use PetitParser while developing the grammar and than generate a fast parser for "real" use. We hope we will be present the tool during the next ESUG.

Oh yes that would be extremely cool! I would love to have faster parsing for Live Robot Programming as it parses the program at every keystroke. Right now it’s not a real issue but as robot code will get bigger I imagine that it will become so …

---> Save our in-boxes! http://emailcharter.org <---

Johan Fabry - http://pleiad.cl/~jfabry
PLEIAD lab - Computer Science Department (DCC) - University of Chile

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

jfabry

Re: PetitParser performance

In reply to this post by Thierry Goubier

So the easy solution in ‘production use’ would be to create the parser once and then keep hanging on to it? I would not say that this is really a PetitParser issue, but more how the parser is used.

> On Feb 21, 2015, at 19:03, Thierry Goubier <[hidden email]> wrote:
>
> Hi Doru,
>
> the low hanging fruit is the time needed to create a PP parser (time for new).
>
> I profiled PPSmalltalkParser over a bench used by Lucas Renggli a few years ago, and PPSmalltalkParser spends half the time in "new" (and it is a lot slower than it used to be).
>
> Thierry

---> Save our in-boxes! http://emailcharter.org <---

Johan Fabry - http://pleiad.cl/~jfabry
PLEIAD lab - Computer Science Department (DCC) - University of Chile

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Thierry Goubier

Re: PetitParser performance

Le 21/02/2015 23:24, Johan Fabry a écrit :
>
> So the easy solution in ‘production use’ would be to create the
> parser once and then keep hanging on to it? I would not say that this
> is really a PetitParser issue, but more how the parser is used.

All other Smalltalk parsers are instantiated as many times as the
PetitParser one in this benchmark, and their instanciation time is
around 0 ms.

But, yes, I would think that one of the solution would be a
pre-instantiation of a PetitParser, and a final instantiation to create
an instance.

In a way, one could imagine writing SmaCC parsers in the same way that
PetitParser parsers are written... pre-instantiation being the part
where SmaCC generates and compile its parsers.

In this benchmark, this would cut parsing time by 2.

Thierry
_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev