Hi all,
I'm trying to understand how tokens are used in the AST. So far I cannot see any consistent pattern in how they are used. For instance, why doesn't RBValueNode have a token? Is that how it's supposed to be? Cheers, Mark |
2014-10-27 19:36 GMT+01:00 Mark Rizun <[hidden email]>:
RBValueNode is an abstract class.
|
Thanks. I see that; however, RBBlockNode and RBArrayNode don't have tokens either. These classes only have methods in the accessing-token protocol. I think it would be better to have a token object for those classes, because it makes more sense to hold such information in a token object. Mark 2014-10-28 11:59 GMT+02:00 Nicolai Hess <[hidden email]>:
|
> On 28 Oct 2014, at 11:23, Mark Rizun <[hidden email]> wrote:
>
> Thanks. I see that, however RBBlockNode or RBArrayNode doesn't have tokens.
> These classes have only methods in accessing-token protocol.
> I think it would be better if we have token object for those classes, because it makes more sense to hold such information in token object.

No, actually we should get rid of tokens. They are only used for parsing: the scanner produces tokens, and there is no token for a block, because a block consists of many tokens. So conceptually, a block cannot have a token. Tokens expose a very low-level implementation artefact of the parser to the AST model, which is not good.

There is… https://pharo.fogbugz.com/f/cases/11992/Remove-tokens-from-the-AST-Core-Node-classes

We should have a look at that and integrate it. This should simplify lots of things.

Marcus |
On 28/10/2014 11:23, Mark Rizun wrote:
> Thanks. I see that, however RBBlockNode or RBArrayNode doesn't have tokens.
> These classes have only methods in accessing-token protocol.
> I think it would be better if we have token object for those classes,
> because it makes more sense to hold such information in token object.

Well, not really.

Technically, tokens are used to drive a parser from a scanner.

If an AST node knows how to relate itself to its original source code chunk and is able to print itself correctly, then tokens are redundant.

In short, if you work with parsers, you'd better know what tokens are. If you're only working with the AST, tokens are redundant noise (for example, they often have a type, or more than one, which is only understood by the parser).

Example of how it is done: the accessing-tokens protocol of RBPragmaNode gives access to left and right, which are positions, not tokens.

Thierry |
> Well, not really.

I'm working with the AST's sourceInterval, trying to calculate it after the method replaceWith:. You see, my problem was that the nodes of the AST don't all hold their start and stop positions in the same place. So I thought a token was such a place; however, I eventually understood that RBValueNodes don't have tokens :)

> Example of how it is done:

Yes, I know.
|
On 28/10/2014 12:12, Mark Rizun wrote:
> I'm working with ASTs sourceInterval. Trying to calculate it after
> method replaceWith:.
> You see, my problem was that each node of AST doesn't hold its start and
> stop position in same place. So I thought that token is such a place,
> however, eventually I understood that RBValueNodes don't have tokens:)

Do you mean you're trying to do a replace and update the positions of all the nodes?

Thierry |
Yes, because the positions are wrong after a replace. Here is an issue:
https://pharo.fogbugz.com/f/cases/14254/AST-method-replaceWith-does-not-change-source-interval

2014-10-28 13:32 GMT+02:00 Thierry Goubier <[hidden email]>:
On 28/10/2014 12:12, Mark Rizun wrote: |
On 28/10/2014 12:45, Mark Rizun wrote:
> Yes, because they are wrong. Here is an issue:
> https://pharo.fogbugz.com/f/cases/14254/AST-method-replaceWith-does-not-change-source-interval

I would say that they are correct.

When I write source-to-source compilers, I accept that anything I change in the AST (via a replaceWith equivalent) has no valid source interval, since it does not exist in the original source. However, all unmodified nodes should keep their 'non-modified' source interval, since I may need it to fetch the relevant text from the source.

If I want my modifications to the AST to have valid source intervals, then I need to regenerate the source from the modified AST. Only then are they valid.

You may want to update the source interval when you do a replaceWith, but the only thing we will get from that is that, after a replaceWith, no source interval can be trusted, since it may end up past the end of the original source string.

Thierry |
First of all, here is why I'm doing this. I work on the Rewrite Tool, and its main functionality is based on replacing nodes in the AST. It also works with the sourceIntervals of nodes. Until now my solution was: if I replace a node, I reparse the tree to get the intervals updated. However, this solution brought new problems.

The second reason is that I think it makes sense to update the intervals of the whole AST if you replace one node. For example, we have:
obj1 foo + obj2 bar
and we replace obj1 with myObject. The interval of the AST was 1 to: 19, and now it should be 1 to: 23.

Mark

2014-10-28 14:00 GMT+02:00 Thierry Goubier <[hidden email]>:
On 28/10/2014 12:45, Mark Rizun wrote: |
On 28/10/2014 14:33, Mark Rizun wrote:
> In the first place why I'm doing this. I work on Rewrite Tool and its
> main functionality bases on replacing nodes in AST. Plus it works with
> sourceIntervals of nodes. Until now my solution was:
> if I replace node, I reparse tree to get intervals updated

I print the modified tree and parse :)

> However this solution brought new problems.

Which ones?

> Second reason is, that I think it makes sense to update interval of all
> AST if you replace one node.
> For example, we have:
> obj1 foo + obj2 bar
> and we replace obj1 with myObject.
> The interval of ast was 1 to: 19, and now it should be 1 to: 23.

No, it shouldn't. If the source has not been regenerated from the modified AST, then 'source copyFrom: theBarASTNode start to: theBarASTNode stop' ends up past the end of it (20 to 23 with the source ending at 19).

If you regenerate the source, then you can parse it and you'll have correct intervals.

Thierry |
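The point about stale intervals can be illustrated with a small, language-neutral sketch. It is written in Python rather than Pharo, and everything in it (`Leaf`, `parse`, `copy_from_to`) is a hypothetical toy model, not the RB API: one leaf per whitespace-separated word, with 1-based inclusive positions. The exact numbers depend on which node you look at; here the `bar` leaf is used.

```python
import re
from dataclasses import dataclass


@dataclass
class Leaf:
    """Hypothetical leaf node: its text plus a 1-based, inclusive source interval."""
    text: str
    start: int
    stop: int


def parse(source):
    """Toy 'parser': one leaf per whitespace-separated word, with its positions."""
    return [Leaf(m.group(), m.start() + 1, m.end()) for m in re.finditer(r"\S+", source)]


def copy_from_to(source, start, stop):
    """Mimics a copyFrom:to: with 1-based inclusive bounds.

    Python slicing silently returns a shorter (or empty) string when the
    bounds run past the end; a bounds-checked copy would fail instead.
    """
    return source[start - 1:stop]


original = "obj1 foo + obj2 bar"
leaves = parse(original)
bar = leaves[-1]  # interval 17..19 in the original source

# Replace obj1 by myObject and shift every later interval by the length
# difference, which is what "updating intervals on replace" amounts to here.
delta = len("myObject") - len("obj1")
leaves[0] = Leaf("myObject", 1, 8)
for leaf in leaves[1:]:
    leaf.start += delta
    leaf.stop += delta

# The shifted interval (21..23) no longer fits the 19-character original.
stale_text = copy_from_to(original, bar.start, bar.stop)

# Regenerating the source and reparsing yields intervals that are valid again.
regenerated = " ".join(leaf.text for leaf in leaves)
fresh_bar = parse(regenerated)[-1]
fresh_text = copy_from_to(regenerated, fresh_bar.start, fresh_bar.stop)
```

In this toy model the shifted interval only becomes meaningful once it is applied to the regenerated string, which matches the argument that intervals are tied to one specific source text.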
> Which ones?

In my tool each node has a property oldNodes, which holds a collection of (obviously) AST nodes :) When I replace a node, I have to update the source intervals in some way.

1) If I update them by reparsing, I lose all the oldNodes data for each node of my AST. So I have to save the old AST with all its oldNodes, somehow detect which nodes were not changed, and reassign their lost oldNodes. But sometimes it's difficult to detect where and what you have to assign, as the AST may be changed dramatically.

2) But if the source interval is updated automatically, I don't have to worry about losing data for the whole AST. I just have to update oldNodes for the node that was replaced.

That is why I'd like to have automatically updated source intervals.

Marcus, do I have to revert everything, or can you somehow remove that slice from the newest version?

Mark |
2014-10-29 8:32 GMT+01:00 Mark Rizun <[hidden email]>:
OK; yes, I can relate to that. But, knowing who designed the RB AST, I'm sure it has a proper equality property where oldNodeFromAST = sameNodeFromASTreparsed holds true. So you can reparse and rematch each old node to its new node. But I would only regenerate and reparse when I need to update the source intervals (or display the modified source).
Well, not really. If, while regenerating your source, you change the way the code is formatted (more tabs here, removing a return there, etc.), then the regenerated source intervals differ from the modified AST source intervals. So your 'modify source intervals' approach looks very fragile to me. The only ways out of that problem that I can see: inserted or replaced nodes have source intervals in a separate source, while unmodified nodes keep their source intervals. This is how I do it in my source-to-source compilers when I want to inject code (and tag in the generated file where that code comes from, which is useful for debugging). Or match old nodes to new nodes as described above.
I'm still not entirely sure why. Source intervals are only there to help relate the AST to the source, not much else, really. Thierry
|
I use source intervals to detect which node is selected, and then in the right-click menu the user sees only the options that are relevant to the selected node, as is also done in SmartSuggestions. |
P.S. I have a solution, but I don't know if it's appropriate: I remove the updating of the source interval from the replaceWith: method, and my tool does all the interval calculations on its own. 2014-10-29 10:59 GMT+02:00 Mark Rizun <[hidden email]>:
|
2014-10-29 9:59 GMT+01:00 Mark Rizun <[hidden email]>:
I know that use case ;)

OK, then this means you are either (1) regenerating the code, or (2) replacing a node and inserting the new source at the right place?

Thinking a bit about it, I'd try this: reparse, get the node from the selection index, then find the equal old node in the old (modified) AST, or replace the old (modified) AST with the new one.

Thierry |
The second one: I replace nodes all the time.
Can you explain this? Sorry, I didn't get the point. |
2014-10-29 10:09 GMT+01:00 Mark Rizun <[hidden email]>:
This is a possibility: have a transform; when you get selection intervals, check whether they are inside or outside the "replaced areas", and increase or decrease the indexes to compensate for added or removed code. But this is more complex than it looks.

I would try equality over the AST first. It is a lot more robust for me.

Thierry
|
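The index-compensating transform described above could look roughly like the following sketch. It is Python rather than Pharo, and `Replacement` and `to_original` are hypothetical names introduced for illustration. It assumes that replacements do not overlap and are recorded in the coordinates of the original source (1-based, inclusive).

```python
from dataclasses import dataclass


@dataclass
class Replacement:
    """Hypothetical record of one edit: the original interval and the new text length."""
    start: int
    stop: int
    new_length: int


def to_original(position, replacements):
    """Map a 1-based position in the modified text back to the original source.

    Returns None when the position falls inside a replaced area, because
    that text does not exist in the original source.
    """
    shift = 0  # accumulated (new length - old length) of the edits passed so far
    for r in sorted(replacements, key=lambda r: r.start):
        new_start = r.start + shift
        new_stop = new_start + r.new_length - 1
        if position < new_start:
            break  # this edit and all later ones lie after the position
        if position <= new_stop:
            return None  # inside a replaced area
        shift += r.new_length - (r.stop - r.start + 1)
    return position - shift


# "obj1 foo + obj2 bar" with obj1 (1..4) replaced by myObject (8 characters).
edits = [Replacement(start=1, stop=4, new_length=8)]
inside_replacement = to_original(3, edits)
bar_in_original = to_original(21, edits)

# A second edit: obj2 (12..15) replaced by a 1-character name.
two_edits = edits + [Replacement(start=12, stop=15, new_length=1)]
bar_after_two_edits = to_original(18, two_edits)
```

Even this small version has to decide what a selection inside a replaced area means and how to treat ranges that straddle an edit, which hints at why the approach is more complex than it looks.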
2014-10-29 10:22 GMT+01:00 Mark Rizun <[hidden email]>:
You insert new code inside the text view? For me, if you replace nodes and display the result, then you are replacing nodes "slowly". Anything that has a display to a user in the loop is "slow".
Use either =, equalTo:withMapping:, or match:inContext: to find the node in the new AST that is equal to your old node. Thierry |
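The reparse-and-rematch idea can be sketched as follows. This is a Python toy model, not the RB classes: `Node`, `all_nodes`, and `rematch` are hypothetical, and the sketch assumes, as suggested above, that node equality compares structure and ignores source positions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical AST node; equality ignores positions, like a structural '='."""
    kind: str
    name: str
    children: tuple = ()
    start: int = field(default=0, compare=False)
    stop: int = field(default=0, compare=False)


def all_nodes(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from all_nodes(child)


def rematch(old_node, new_tree):
    """Return the first node of new_tree that is structurally equal to old_node, or None."""
    return next((n for n in all_nodes(new_tree) if n == old_node), None)


# Node taken from the old (modified) AST; its interval still refers to the old source.
old_bar = Node("message", "bar",
               (Node("variable", "obj2", start=12, stop=15),),
               start=12, stop=19)

# Tree obtained by regenerating "myObject foo + obj2 bar" and reparsing it.
new_tree = Node("message", "+", (
    Node("message", "foo",
         (Node("variable", "myObject", start=1, stop=8),), start=1, stop=12),
    Node("message", "bar",
         (Node("variable", "obj2", start=16, stop=19),), start=16, stop=23),
), start=1, stop=23)

match = rematch(old_bar, new_tree)
fresh_interval = (match.start, match.stop)
```

Once a match is found, per-node data kept by a tool could be copied from the old node to the new one. One limitation of this sketch: if the same expression occurs twice in the method, it always returns the first occurrence, so a real tool would need an extra tie-breaker.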
2014-10-29 11:40 GMT+02:00 Thierry Goubier <[hidden email]>:
Yes
I'm not doing this in a loop. I have an AST and a text view of it. Then I do one replace and update the text view. When I wrote "all the time", I meant that replacing nodes is very important in my tool, as it is the main functionality.
Good, thanks for the advice. First, I will check your suggestion with equality. If it fails for me, I'll try my suggestion of calculating the intervals inside the tool. Thanks again, Mark |