Dear Andrew,
I didn't find time to answer earlier. Some time ago, I was looking for a (MT)BDD package in ST as well. I didn't found one. So the only two options left are 1) implementing a new BDD lib in ST and 2) doing FFI to some existing lib, e.g. CUDD, Sylvan, Jinc I'd prefer 2) since the existing libraries are feature-rich and highly optimized - which took quite some time. As a bonus, a solution could potentially switch between those backends. The biggest hurdle, in my option, is memory management, since most libs use some sort of reference counting. And you do not want to end up with nightmarish dozens of ref(bddNode) deref(bddNode) in your application code (like the probabilistic model checker PRISM does). This introduces hard to track bugs easily. However, I have an idea in mind how to tackle this but I didn't found the time to put it into code yet. May I ask, which sort of application do you have in mind? Best, Steffen Am .10.2017, 07:54 Uhr, schrieb Prof. Andrew P. Black <[hidden email]>: > Thanks for the responses so far. I see that I need to clarify my > enquiry. > > B-Trees and BDDs are not the same. BDDs are an efficient and compact > representations for Boolean functions, sometimes used in SAT-solvers and > electronics design. The key idea is that since the output must be 0 or > 1, you can represent any Boolean function as a tree whose depth is the > same as the number of bits in the input. > > To make the tree small and efficient, though, you need to eliminate any > node whose two children are the same, and to share subtrees, so that you > really get a DAG, not a tree. The full name for these efficient > compressed trees is “Reduced Order Binary Decision Diagrams”, or > ROBDDs. I was hoping that someone else had implemented the algorithms > necessary to build this representation. > > Because sets can be considered to be Booleans functions (true => > argument is in the set), you can use ROBDDs to efficiently represent > large sets. > > To be clear, despite the word “diagram” in the name, one is not normally > interested in drawing the BDD — except in the documentation for the > package ;-). Normally, BDDs they are used to represent sets, or > functions, where the drawing would be hopelessly large. > > The BuDDy package (http://buddy.sourceforge.net/manual/main.html) is an > example of what I’m looking for, but unfortunately it’s in C++. > > Andrew > > >> On 25 Oct 2017, at 21:39 , Stephane Ducasse <[hidden email]> >> wrote: >> >> Hi andrew >> >> I think that Avi did a package about BDD (but I thought it was special >> binary trees) so this is probably the same. >> Did you check on Squeaksource? >> http://www.squeaksource.com/BTree.html >> If this is what you are looking for I can help porting it to Pharo. >> >> Stef >> >> >> On Wed, Oct 25, 2017 at 9:02 PM, Prof. Andrew P. Black >> <[hidden email]> wrote: >>> Does anyone know of a BDD — that’s Binary Decision Diagram — package >>> written in Smalltalk? >>> >>> Andrew >>> >>> >> > |
It was for test inclusion of UTF-8 characters so we do not want to
rely on external libraries. On Fri, Oct 27, 2017 at 1:54 PM, Steffen Märcker <[hidden email]> wrote: > Dear Andrew, > > I didn't find time to answer earlier. Some time ago, I was looking for a > (MT)BDD package in ST as well. I didn't found one. So the only two options > left are > > 1) implementing a new BDD lib in ST and > 2) doing FFI to some existing lib, e.g. CUDD, Sylvan, Jinc > > I'd prefer 2) since the existing libraries are feature-rich and highly > optimized - which took quite some time. As a bonus, a solution could > potentially switch between those backends. The biggest hurdle, in my option, > is memory management, since most libs use some sort of reference counting. > And you do not want to end up with nightmarish dozens of ref(bddNode) > deref(bddNode) in your application code (like the probabilistic model > checker PRISM does). This introduces hard to track bugs easily. However, I > have an idea in mind how to tackle this but I didn't found the time to put > it into code yet. > > May I ask, which sort of application do you have in mind? > > Best, Steffen > > > > > Am .10.2017, 07:54 Uhr, schrieb Prof. Andrew P. Black <[hidden email]>: > >> Thanks for the responses so far. I see that I need to clarify my enquiry. >> >> B-Trees and BDDs are not the same. BDDs are an efficient and compact >> representations for Boolean functions, sometimes used in SAT-solvers and >> electronics design. The key idea is that since the output must be 0 or 1, >> you can represent any Boolean function as a tree whose depth is the same as >> the number of bits in the input. >> >> To make the tree small and efficient, though, you need to eliminate any >> node whose two children are the same, and to share subtrees, so that you >> really get a DAG, not a tree. The full name for these efficient compressed >> trees is “Reduced Order Binary Decision Diagrams”, or ROBDDs. I was hoping >> that someone else had implemented the algorithms necessary to build this >> representation. >> >> Because sets can be considered to be Booleans functions (true => argument >> is in the set), you can use ROBDDs to efficiently represent large sets. >> >> To be clear, despite the word “diagram” in the name, one is not normally >> interested in drawing the BDD — except in the documentation for the package >> ;-). Normally, BDDs they are used to represent sets, or functions, where >> the drawing would be hopelessly large. >> >> The BuDDy package (http://buddy.sourceforge.net/manual/main.html) is an >> example of what I’m looking for, but unfortunately it’s in C++. >> >> Andrew >> >> >>> On 25 Oct 2017, at 21:39 , Stephane Ducasse <[hidden email]> >>> wrote: >>> >>> Hi andrew >>> >>> I think that Avi did a package about BDD (but I thought it was special >>> binary trees) so this is probably the same. >>> Did you check on Squeaksource? >>> http://www.squeaksource.com/BTree.html >>> If this is what you are looking for I can help porting it to Pharo. >>> >>> Stef >>> >>> >>> On Wed, Oct 25, 2017 at 9:02 PM, Prof. Andrew P. Black <[hidden email]> >>> wrote: >>>> >>>> Does anyone know of a BDD — that’s Binary Decision Diagram — package >>>> written in Smalltalk? >>>> >>>> Andrew >>>> >>>> >>> >> > |
I see. What is the task in detail? Are some of the set fixed or known in advance? What's the argument against a bitset-based solution?
Cheers, Steffen Am 27. Oktober 2017 19:10:35 MESZ schrieb Stephane Ducasse <[hidden email]>:
It was for test inclusion of UTF-8 characters so we do not want to |
I think that andrew would like to improve smacc when parsing inputs
containing utf-8 characters. On Sat, Oct 28, 2017 at 1:46 PM, Steffen Märcker <[hidden email]> wrote: > I see. What is the task in detail? Are some of the set fixed or known in > advance? What's the argument against a bitset-based solution? > > Cheers, Steffen > > > > Am 27. Oktober 2017 19:10:35 MESZ schrieb Stephane Ducasse > <[hidden email]>: >> >> It was for test inclusion of UTF-8 characters so we do not want to >> rely on external libraries. >> >> On Fri, Oct 27, 2017 at 1:54 PM, Steffen Märcker <[hidden email]> wrote: >>> >>> Dear Andrew, >>> >>> I didn't find time to answer earlier. Some time ago, I was looking for a >>> (MT)BDD package in ST as well. I didn't found one. So the only two >>> options >>> left are >>> >>> 1) implementing a new BDD lib in ST and >>> 2) doing FFI to some existing lib, e.g. CUDD, Sylvan, Jinc >>> >>> I'd prefer 2) since the existing libraries are feature-rich and highly >>> optimized - which took quite some time. As a bonus, a solution could >>> potentially switch between those backends. The biggest hurdle, in my >>> option, >>> is memory management, since most libs use some sort of reference >>> counting. >>> And you do not want to end up with nightmarish dozens of ref(bddNode) >>> deref(bddNode) in your application code (like the probabilistic model >>> checker PRISM does). This introduces hard to track bugs easily. However, >>> I >>> have an idea in mind how to tackle this but I didn't found the time to >>> put >>> it into code yet. >>> >>> May I ask, which sort of application do you have in mind? >>> >>> Best, Steffen >>> >>> >>> >>> >>> Am .10.2017, 07:54 Uhr, schrieb Prof. Andrew P. Black >>> <[hidden email]>: >>> >>>> Thanks for the responses so far. I see that I need to clarify my >>>> enquiry. >>>> >>>> B-Trees and BDDs are not the same. BDDs are an efficient and compact >>>> representations for Boolean functions, sometimes used in SAT-solvers >>>> and >>>> electronics design. The key idea is that since the output must be 0 >>>> or 1, >>>> you can represent any Boolean function as a tree whose depth is the >>>> same as >>>> the number of bits in the input. >>>> >>>> To make the tree small and efficient, though, you need to eliminate any >>>> node whose two children are the same, and to share subtrees, so that >>>> you >>>> really get a DAG, not a tree. The full name for these efficient >>>> compressed >>>> trees is “Reduced Order Binary Decision Diagrams”, or ROBDDs. I was >>>> hoping >>>> that someone else had implemented the algorithms necessary to build >>>> this >>>> representation. >>>> >>>> Because sets can be considered to be Booleans functions (true => >>>> argument >>>> is in the set), you can use ROBDDs to efficiently represent large sets. >>>> >>>> To be clear, despite the word “diagram” in the name, one is not >>>> normally >>>> interested in drawing the BDD — except in the documentation for the >>>> package >>>> ;-). Normally, BDDs they are used to represent sets, or functions, >>>> where >>>> the drawing would be hopelessly large. >>>> >>>> The BuDDy package (http://buddy.sourceforge.net/manual/main.html) is an >>>> example of what I’m looking for, but unfortunately it’s in C++. >>>> >>>> Andrew >>>> >>>> >>>>> On 25 Oct 2017, at 21:39 , Stephane Ducasse <[hidden email]> >>>>> wrote: >>>>> >>>>> Hi andrew >>>>> >>>>> I think that Avi did a package about BDD (but I thought it was special >>>>> binary trees) so this is probably the same. >>>>> Did you check on Squeaksource? >>>>> http://www.squeaksource.com/BTree.html >>>>> If this is what you are looking for I can help porting it to Pharo. >>>>> >>>>> Stef >>>>> >>>>> >>>>> On Wed, Oct 25, 2017 at 9:02 PM, Prof. Andrew P. Black >>>>> <[hidden email]> >>>>> wrote: >>>>>> >>>>>> >>>>>> Does anyone know of a BDD — that’s Binary Decision Diagram — package >>>>>> written in Smalltalk? >>>>>> >>>>>> Andrew >>>>> >>>>> >>>>> >>>>> >>>> >>> >> > |
Part of the reasoning is that by writing JVM bytecode, the differences between the various JVM languages become largely irrelevant, though in some ways it becomes slightly less convenient if calling, for example, Scala or Clojure code from Pharo. The other is inherent JVM limitations and discrepancies between Java as a semantic spec and various specific implementations as dynamic systems, particularly since the latter differ both by Java version and by which frameworks might be in use, and dynamically create further discrepancies contingently based not only on the above, but also specifics of the system and its current state. Cheers Andrew Sent from Mail for Windows 10 From: [hidden email] I think that andrew would like to improve smacc when parsing inputs containing utf-8 characters. On Sat, Oct 28, 2017 at 1:46 PM, Steffen Märcker <[hidden email]> wrote: > I see. What is the task in detail? Are some of the set fixed or known in > advance? What's the argument against a bitset-based solution? > > Cheers, Steffen > > > > Am 27. Oktober 2017 19:10:35 MESZ schrieb Stephane Ducasse > <[hidden email]>: >> >> It was for test inclusion of UTF-8 characters so we do not want to >> rely on external libraries. >> >> On Fri, Oct 27, 2017 at 1:54 PM, Steffen Märcker <[hidden email]> wrote: >>> >>> Dear Andrew, >>> >>> I didn't find time to answer earlier. Some time ago, I was looking for a >>> (MT)BDD package in ST as well. I didn't found one. So the only two >>> options >>> left are >>> >>> 1) implementing a new BDD lib in ST and >>> 2) doing FFI to some existing lib, e.g. CUDD, Sylvan, Jinc >>> >>> I'd prefer 2) since the existing libraries are feature-rich and highly >>> optimized - which took quite some time. As a bonus, a solution could >>> potentially switch between those backends. The biggest hurdle, in my >>> option, >>> is memory management, since most libs use some sort of reference >>> counting. >>> And you do not want to end up with nightmarish dozens of ref(bddNode) >>> deref(bddNode) in your application code (like the probabilistic model >>> checker PRISM does). This introduces hard to track bugs easily. However, >>> I >>> have an idea in mind how to tackle this but I didn't found the time to >>> put >>> it into code yet. >>> >>> May I ask, which sort of application do you have in mind? >>> >>> Best, Steffen >>> >>> >>> >>> >>> Am .10.2017, 07:54 Uhr, schrieb Prof. Andrew P. Black >>> <[hidden email]>: >>> >>>> Thanks for the responses so far. I see that I need to clarify my >>>> enquiry. >>>> >>>> B-Trees and BDDs are not the same. BDDs are an efficient and compact >>>> representations for Boolean functions, sometimes used in SAT-solvers >>>> and >>>> electronics design. The key idea is that since the output must be 0 >>>> or 1, >>>> you can represent any Boolean function as a tree whose depth is the >>>> same as >>>> the number of bits in the input. >>>> >>>> To make the tree small and efficient, though, you need to eliminate any >>>> node whose two children are the same, and to share subtrees, so that >>>> you >>>> really get a DAG, not a tree. The full name for these efficient >>>> compressed >>>> trees is “Reduced Order Binary Decision Diagrams”, or ROBDDs. I was >>>> hoping >>>> that someone else had implemented the algorithms necessary to build >>>> this >>>> representation. >>>> >>>> Because sets can be considered to be Booleans functions (true => >>>> argument >>>> is in the set), you can use ROBDDs to efficiently represent large sets. >>>> >>>> To be clear, despite the word “diagram” in the name, one is not >>>> normally >>>> interested in drawing the BDD — except in the documentation for the >>>> package >>>> ;-). Normally, BDDs they are used to represent sets, or functions, >>>> where >>>> the drawing would be hopelessly large. >>>> >>>> The BuDDy package (http://buddy.sourceforge.net/manual/main.html) is an >>>> example of what I’m looking for, but unfortunately it’s in C++. >>>> >>>> Andrew >>>> >>>> >>>>> On 25 Oct 2017, at 21:39 , Stephane Ducasse <[hidden email]> >>>>> wrote: >>>>> >>>>> Hi andrew >>>>> >>>>> I think that Avi did a package about BDD (but I thought it was special >>>>> binary trees) so this is probably the same. >>>>> Did you check on Squeaksource? >>>>> http://www.squeaksource.com/BTree.html >>>>> If this is what you are looking for I can help porting it to Pharo. >>>>> >>>>> Stef >>>>> >>>>> >>>>> On Wed, Oct 25, 2017 at 9:02 PM, Prof. Andrew P. Black >>>>> <[hidden email]> >>>>> wrote: >>>>>> >>>>>> >>>>>> Does anyone know of a BDD — that’s Binary Decision Diagram — package >>>>>> written in Smalltalk? >>>>>> >>>>>> Andrew >>>>> >>>>> >>>>> >>>>> >>>> >>> >> > |
In reply to this post by Stephane Ducasse-3
Does that mean the sets/bdd would be constructed mainly at comile time?
Anyway, Andrew, feel free to contact me, I might help you with this. Best, Steffen Am .10.2017, 16:05 Uhr, schrieb Stephane Ducasse <[hidden email]>: > I think that andrew would like to improve smacc when parsing inputs > containing utf-8 characters. > > > On Sat, Oct 28, 2017 at 1:46 PM, Steffen Märcker <[hidden email]> wrote: >> I see. What is the task in detail? Are some of the set fixed or known in >> advance? What's the argument against a bitset-based solution? >> >> Cheers, Steffen >> >> >> >> Am 27. Oktober 2017 19:10:35 MESZ schrieb Stephane Ducasse >> <[hidden email]>: >> > |
In reply to this post by Pharo Smalltalk Users mailing list
Dear all,
I would like to give ZeroMQ a try in Smalltalk, but am unable to load it on a new image successfully. In Pharo 5, I loaded the package ConfigurationOfZeroMQ from the repository in smalltalkhub and evaluated the following to load the code: ConfigurationOfZeroMQ loadBleedingEdge It seems to load everything well, but when I try to look at the source code of the methods that call the C routines from libzmq using the system browser, the system exhibits the following error message: UndefinedObject(Object)>>doesNotUnderstand: #keywords RBFFICallPragma(RBPragmaNode)>>selectorParts RubSHTextStylerST80(SHRBTextStyler)>>visitPragmaNode: RBFFICallPragma(RBPragmaNode)>>acceptVisitor: RubSHTextStylerST80(SHRBTextStyler)>>visitNode: [ :each | self visitNode: each ] in RubSHTextStylerST80>>visitMethodNode: in Block: [ :each | self visitNode: each ] OrderedCollection>>do: (its a long stack, cutting here) When loading ZeroMQ on Pharo 6, a window pops-up with the title 'Syntax Error: Literal constant expected' showing the source code of a method defining a call to a routine from a C library: apiDeleteDC: aHDC <apicall: Literal constant expected -> bool 'DeleteDC' (Win32HDC) module:'gdi32.dll'> ^self externalCallFailed If I update the syntax, re-writing the code above as apiDeleteDC: aHDC ^self ffiCall: #( bool DeleteDC (Win32HDC aHDC) ) module:'gdi32.dll' the compiler accepts it and the process can go on, but eventually the system will become unstable... Any ideas on how to solve this problems? Is this syntax still supported in Pharo5 & 6? Cheers, Paulo On 10/25/2017 03:47 PM, Sebastian
Heidbrink via Pharo-users wrote:
|
well, I downloaded the .mcz
package, unzipped it and changed the
source code accordingly using a text editor (Smalltalkers can use text editors to edit code too!). Later on I could successfully file-in the file in a new Pharo 6 image. The code need more changes to work, however. Lets see how long it takes to get a few ZeroMQ examples running. I will publish my "fork". Cheers, Paulo On 10/29/2017 03:09 PM, Paulo R.
Dellani wrote:
Dear all, |
In reply to this post by Steffen Märcker
Thanks I think that the sets will be known since the scanner
definition should not changed once defined. Now andrew told me that he could use an existing solution but not build one because he should make progress on his compiler. But Steffen if you know how to build such package it would be a nice addition and we will use it. Stef On Sat, Oct 28, 2017 at 5:37 PM, Steffen Märcker <[hidden email]> wrote: > Does that mean the sets/bdd would be constructed mainly at comile time? > Anyway, Andrew, feel free to contact me, I might help you with this. > > Best, Steffen > > > Am .10.2017, 16:05 Uhr, schrieb Stephane Ducasse <[hidden email]>: > >> I think that andrew would like to improve smacc when parsing inputs >> containing utf-8 characters. >> >> >> On Sat, Oct 28, 2017 at 1:46 PM, Steffen Märcker <[hidden email]> wrote: >>> >>> I see. What is the task in detail? Are some of the set fixed or known in >>> advance? What's the argument against a bitset-based solution? >>> >>> Cheers, Steffen >>> >>> >>> >>> Am 27. Oktober 2017 19:10:35 MESZ schrieb Stephane Ducasse >>> <[hidden email]>: > > <---Schnitt---> >>> >>> >> > > > |
In reply to this post by dellani
To load such code switch to old compiler in settings browser. It supports old FFI syntax. Then you can manually adopt all ffi calls to UFFI and commit new version. 2017-10-29 15:09 GMT+01:00 Paulo R. Dellani <[hidden email]>:
|
In reply to this post by Steffen Märcker
> On 28 Oct 2017, at 17:37 , Steffen Märcker <[hidden email]> wrote: > > Does that mean the sets/bdd would be constructed mainly at comile time? Anyway, Andrew, feel free to contact me, I might help you with this. > Thanks for the offer, Steffen! The problem is that I need to use SmaCC for my current project, and really do not have a month to take off and re-design the way that it builds its scanner. I’ve talked to Thierry Goubier about, and he doesn’t have time either! It would be a fun project, though, and it ought to be fairly separate from other parts of SmaCC. I’ve spent a fair bit of time thinking about how to do it, but don’t think that I will be able to actually focus on it. An alternative approach, which Thierry has suggested, is to make SmaCC work on the UTF-8 representation of the Unicode. Then we could represent character sets as prefix trees. But the core problem would still exist: you can’t run an algorithm that repeatedly executes for all characters in the alphabet do: when there are 2^21 characters in the alphabet! Andrew |
Hi Andrew, Steffen,
2017-11-07 13:10 GMT+01:00 Prof. Andrew P. Black <[hidden email]>:
Yes, this is the essence of the issue. There are a few alternatives about it, but none we have the time to pursue.
The main issue is that `for all characters`... All the literature on scanner building uses 'for all characters do'. Thierry
|
A possible way to accomplish it would be to use an object graph with an incremental query engine, such as EMF/CDO with Viatra or something similar. You could then put different character sets in connected objects and query only as far as you need to. Andrew Glynn Sent from Mail for Windows 10 From: [hidden email] Hi Andrew, Steffen, 2017-11-07 13:10 GMT+01:00 Prof. Andrew P. Black <[hidden email]>:
Yes, this is the essence of the issue. There are a few alternatives about it, but none we have the time to pursue.
The main issue is that `for all characters`... All the literature on scanner building uses 'for all characters do'. Thierry
|
I am not familiar with the literature on scanners. May I ask you about some details on the "for all characters" algorithms you are referring to?
Building a (or connecting to) a BDD library would be fun, indeed. But within that time frame it seems not realistic. Anyway, after finishing my thesis, I'd like to come back to that idea. Ciao, Steffen Am 7. November 2017 16:33:03 MEZ schrieb Andrew Glynn <[hidden email]>:
|
Le 07/11/2017 à 23:00, Steffen Märcker a écrit :
> I am not familiar with the literature on scanners. May I ask you about > some details on the "for all characters" algorithms you are referring to? The two main ones available, from the typical Aho/Ullman textbook, are: - NFA to DFA conversion (i.e. build a NFA with your regular expressions, then convert it to a DFA) - Direct regular expression to DFA construction Both of them have loop of the type: for (each input symbol a) { ... > Building a (or connecting to) a BDD library would be fun, indeed. But > within that time frame it seems not realistic. Anyway, after finishing > my thesis, I'd like to come back to that idea. It would certainly be interesting. Please contact us again when you'll have time :) Regards, Thierry > Ciao, Steffen > > > Am 7. November 2017 16:33:03 MEZ schrieb Andrew Glynn <[hidden email]>: > > A possible way to accomplish it would be to use an object graph with > an incremental query engine, such as EMF/CDO with Viatra or > something similar. You could then put different character sets in > connected objects and query only as far as you need to. > > Andrew Glynn > > Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for > Windows 10 > > *From: *Thierry Goubier <mailto:[hidden email]> > *Sent: *Tuesday, November 7, 2017 7:17 AM > *To: *Any question about pharo is welcome > <mailto:[hidden email]> > *Subject: *Re: [Pharo-users] Binary Decision Diagram Package in > Smalltalk > > Hi Andrew, Steffen, > > 2017-11-07 13:10 GMT+01:00 Prof. Andrew P. Black <[hidden email] > <mailto:[hidden email]>>: > > > > On 28 Oct 2017, at 17:37 , Steffen Märcker <[hidden email] > <mailto:[hidden email]>> wrote: > > > > Does that mean the sets/bdd would be constructed mainly at > comile time? Anyway, Andrew, feel free to contact me, I might > help you with this. > > > > Thanks for the offer, Steffen! The problem is that I need to > use SmaCC for my current project, and really do not have a month > to take off and re-design the way that it builds its scanner. > I’ve talked to Thierry Goubier about, and he doesn’t have time > either! It would be a fun project, though, and it ought to be > fairly separate from other parts of SmaCC. I’ve spent a fair > bit of time thinking about how to do it, but don’t think that I > will be able to actually focus on it. > > Yes, this is the essence of the issue. There are a few alternatives > about it, but none we have the time to pursue. > > > An alternative approach, which Thierry has suggested, is to make > SmaCC work on the UTF-8 representation of the Unicode. Then we > could represent character sets as prefix trees. But the core > problem would still exist: you can’t run an algorithm that > repeatedly executes > > for all characters in the alphabet do: > > when there are 2^21 characters in the alphabet! > > The main issue is that `for all characters`... All the literature on > scanner building uses 'for all characters do'. > > Thierry > > > Andrew > > > |
I see. How about the following (sketched) solution to avoid looping over
all characters? It might be very well the case that you already considered (and dismissed) that path. A) Assumption In order to allow any meaningful matching, the input to the scanner is normalized according to the unicode spec. B) Abstraction Treat each character and character group of an regex as a set of intervals in the unicode code points. Lets call them "character tests" and lift the common set operations union, intersection and difference to them. C) Construct NFA NFA has potentially overlapping character tests at the transitions of each state. D) Construct DFA Given a product state s in the DFA and two transitions t1, t2 from the original NFA, add three new transitions to the DFA: - a transition labeled with the character test of t1 minus the character test of t2 - a transition labeled with the intersection of the character tests of t1 and t2 - a transition labeled with the character test of t2 minus the character test of t1 E) Extension Instead of sets of unicode intervals we could also use test-functions, e.g. blocks. Then, in step D), the set operations are translated to boolean operations: - difference t1 - t2 becomes: t1 && not t2 - intersection of t1 and t2 becomes: t1 && t2 This would allow to use optimized test functions, e.g., bdds, instead of relying on charcter tests only. Cheers, Steffen Am .11.2017, 23:16 Uhr, schrieb Thierry Goubier <[hidden email]>: > Le 07/11/2017 à 23:00, Steffen Märcker a écrit : >> I am not familiar with the literature on scanners. May I ask you about >> some details on the "for all characters" algorithms you are referring >> to? > > The two main ones available, from the typical Aho/Ullman textbook, are: > > - NFA to DFA conversion (i.e. build a NFA with your regular expressions, > then convert it to a DFA) > > - Direct regular expression to DFA construction > > Both of them have loop of the type: > > for (each input symbol a) { > ... > >> Building a (or connecting to) a BDD library would be fun, indeed. But >> within that time frame it seems not realistic. Anyway, after finishing >> my thesis, I'd like to come back to that idea. > > It would certainly be interesting. Please contact us again when you'll > have time :) > > Regards, > > Thierry > >> Ciao, Steffen >> Am 7. November 2017 16:33:03 MEZ schrieb Andrew Glynn >> <[hidden email]>: >> A possible way to accomplish it would be to use an object graph >> with >> an incremental query engine, such as EMF/CDO with Viatra or >> something similar. You could then put different character sets in >> connected objects and query only as far as you need to. >> Andrew Glynn >> Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for >> Windows 10 >> *From: *Thierry Goubier <mailto:[hidden email]> >> *Sent: *Tuesday, November 7, 2017 7:17 AM >> *To: *Any question about pharo is welcome >> <mailto:[hidden email]> >> *Subject: *Re: [Pharo-users] Binary Decision Diagram Package in >> Smalltalk >> Hi Andrew, Steffen, >> 2017-11-07 13:10 GMT+01:00 Prof. Andrew P. Black <[hidden email] >> <mailto:[hidden email]>>: >> > On 28 Oct 2017, at 17:37 , Steffen Märcker <[hidden email] >> <mailto:[hidden email]>> wrote: >> > >> > Does that mean the sets/bdd would be constructed mainly at >> comile time? Anyway, Andrew, feel free to contact me, I might >> help you with this. >> > >> Thanks for the offer, Steffen! The problem is that I need to >> use SmaCC for my current project, and really do not have a month >> to take off and re-design the way that it builds its scanner. >> I’ve talked to Thierry Goubier about, and he doesn’t have time >> either! It would be a fun project, though, and it ought to be >> fairly separate from other parts of SmaCC. I’ve spent a fair >> bit of time thinking about how to do it, but don’t think that I >> will be able to actually focus on it. >> Yes, this is the essence of the issue. There are a few alternatives >> about it, but none we have the time to pursue. >> An alternative approach, which Thierry has suggested, is to >> make >> SmaCC work on the UTF-8 representation of the Unicode. Then we >> could represent character sets as prefix trees. But the core >> problem would still exist: you can’t run an algorithm that >> repeatedly executes >> for all characters in the alphabet do: >> when there are 2^21 characters in the alphabet! >> The main issue is that `for all characters`... All the literature >> on >> scanner building uses 'for all characters do'. >> Thierry >> Andrew >> > |
Hi Steffen,
in fact, I considered B) directly, but not up to the point of building onto C) and D) (with an obvious A, but that one is a given in Pharo implementation of strings). Thanks for the explanation, then. Regards, Thierry 2017-11-08 14:21 GMT+01:00 Steffen Märcker <[hidden email]>: I see. How about the following (sketched) solution to avoid looping over all characters? It might be very well the case that you already considered (and dismissed) that path. |
In reply to this post by kilon.alios
On 26 October 2017 at 18:37, Dimitris Chloupis <[hidden email]> wrote:
And others are finding similar results... "Inter-process communication is a (surprisingly) effective way to couple two programming languages " R into Go... http://user2015.math.aau.dk/presentations/236.pdf cheers -ben |
Free forum by Nabble | Edit this page |