Hi marcus et al
why RBComment is not called RBCommentNode and why it is not a subclass of RBNode? Stef |
2016-11-03 13:41 GMT+01:00 stepharo <[hidden email]>: Hi marcus et al RBComment wasn't part of the original RefactoringBrowser, it only had a "comments" instance variable on its nodes. This was refactored out to its own RBComment class (I think). But no "real" AST-Node class. +1 for converting this into a real AST node. |
Why convert it to a node? It has no affect on the running code.
John Brant > On Nov 3, 2016, at 8:23 AM, Nicolai Hess <[hidden email]> wrote: > > > > 2016-11-03 13:41 GMT+01:00 stepharo <[hidden email]>: > Hi marcus et al > > why RBComment is not called RBCommentNode and why it is not a subclass of RBNode? > > Stef > > > > RBComment wasn't part of the original RefactoringBrowser, it only had a "comments" instance variable on its nodes. > This was refactored out to its own RBComment class (I think). But no "real" AST-Node class. > > +1 for converting this into a real AST node. > |
Because like that
we have a correct and nice hierarchy and all the nodes inherits from it and we can have a nice visitor and make assumption about the root. Regularity is nice and it helps controlling complexity. Why a comment could not have a parent? Stef Le 3/11/16 à 14:53, John Brant a écrit : > Why convert it to a node? It has no affect on the running code. > > > John Brant > >> On Nov 3, 2016, at 8:23 AM, Nicolai Hess <[hidden email]> wrote: >> >> >> >> 2016-11-03 13:41 GMT+01:00 stepharo <[hidden email]>: >> Hi marcus et al >> >> why RBComment is not called RBCommentNode and why it is not a subclass of RBNode? >> >> Stef >> >> >> >> RBComment wasn't part of the original RefactoringBrowser, it only had a "comments" instance variable on its nodes. >> This was refactored out to its own RBComment class (I think). But no "real" AST-Node class. >> >> +1 for converting this into a real AST node. >> > > |
> On Nov 3, 2016, at 9:00 AM, stepharo <[hidden email]> wrote:
> > Because like that > > we have a correct and nice hierarchy and > > all the nodes inherits from it and > > we can have a nice visitor and make assumption about the root. > > Regularity is nice and it helps controlling complexity. > > Why a comment could not have a parent? A comment isn’t code. The nodes are for code. A comment is more like a token — RBComment basically has the same two instance variables as RBToken. Should tokens also be nodes? For example, should a left paren token be represented as a node? If a comment is a node, how is it visited? For example, if you have (“comment1” self “comment2” foo: “comment3” a bar: “comment4” b “comment5”), what is the traversal order for the nodes? Furthermore, if comments are nodes, how do they affect all of the code rewriting and validation that is part of the RB and its rewrite tools? John Brant |
Am 03.11.2016 17:15 schrieb "John Brant" <[hidden email]>: OK, I think I am conceived. |
In reply to this post by John Brant-2
> A comment isn’t code. The nodes are for code. A comment is more like a token — RBComment basically has the same two instance variables as RBToken. Should tokens also be nodes? For example, should a left paren token be represented as a node? > > If a comment is a node, how is it visited? > For example, if you have (“comment1” self “comment2” foo: “comment3” a bar: “comment4” b “comment5”), what is the traversal order for the nodes? To reproduce the code as it was typed why the representation a message is not a sequence. because you need to encode that you have comment1 and self and comment2 We do not have a structure that is responsible for that order. We could have two entries in AST node; one discrete where you can cherrypick what you want receiver arguments selector and one that represent the node placement. Now when we emit bytecode comment would do nothing. How having them as token helps? You will visit the receiver and check if it has a token and place it. I found that strange that comments are not part of the code tree as first class citizen. > Furthermore, if comments are nodes, how do they affect all of the code rewriting and validation that is part of the RB and its rewrite tools With comments cannot act as nullobjects for such operation? Stef |
https://pharo.fogbugz.com/f/cases/19281/We-should-improve-the-class-comment-of-RBComment
If we would have better comments.... a kind of a dream. Stef |
In reply to this post by stepharo
> On Nov 3, 2016, at 11:46 AM, stepharo <[hidden email]> wrote: > > >> A comment isn’t code. The nodes are for code. A comment is more like a token — RBComment basically has the same two instance variables as RBToken. Should tokens also be nodes? For example, should a left paren token be represented as a node? >> >> If a comment is a node, how is it visited? >> For example, if you have (“comment1” self “comment2” foo: “comment3” a bar: “comment4” b “comment5”), what is the traversal order for the nodes? > To reproduce the code as it was typed why the representation a message is not a sequence. because you need to encode that you > have comment1 and self and comment2 But how does this fit with a tree structure. For example, should “comment5” be associated with the “b” variable node, or the #foo:bar: message, or the sequence node that contains the #foo:bar: message, or the block or method node that contains the sequence node. You can choose, but only the writer of the comment can tell if you are right. > We do not have a structure that is responsible for that order. > We could have two entries in AST node; > one discrete where you can cherrypick what you want receiver arguments selector > and one that represent the node placement. > > Now when we emit bytecode comment would do nothing. > > How having them as token helps? You will visit the receiver and check if it has a token and place it. > I found that strange that comments are not part of the code tree as first class citizen. Tokens are essentially what they are now — although they aren’t in the RBToken hierarchy. So far, the only advantages that I’ve heard you mention for nodes are visiting nodes and having a parent. The example above shows why neither of these are valid — you can’t determine what is the correct parent for the comment. Anyway, I think comments should not be in code, but rather on code. When you add a comment, you would select a node or nodes that it applied to. When viewing a method, the comment could be displayed inline (like we have now), off to the side like reviewer comments in word processors, or a highlight that when you move the mouse over would show in a popup. Furthermore, there could be different comment types. For example, one could have design comments, todo/review comments, example code comments, bug fix comments, etc. The code view could be customized to show each one differently. Comments could have pictures and other non-string items. Having comments as objects outside the method’s source string would allow them to be easily searched. For example, someone could search for a certain bug fix. Currently, if a bug fix id is in a comment, it requires a full text search of all method text. This is slow. Therefore, many companies start adding these bug fix ids to the code as symbols so that a senders search will work. I’ve seen companies that put symbols like #Fix12345 as statements inside of methods. However, this doesn’t work for Smalltalk’s that remove unused literals, so in those cases they do something like adding a statement like #Fix12345 == #Fix12345 which generally isn’t optimized away. In addition to comments, I believe that annotations (not primitives) should be on code instead of in code like they are now. With a good implementation one could unify the concepts of comments and annotations. >> Furthermore, if comments are nodes, how do they affect all of the code rewriting and validation that is part of the RB and its rewrite tools > With comments cannot act as nullobjects for such operation? Yes, someone could spend time making all of this work with the existing code. However, why not fix the real problem in that comments and methods should not be a single string, but rather objects. John Brant |
> But how does this fit with a tree structure. For example, should “comment5” be associated with the “b” variable node, or the #foo:bar: message, or the sequence node that contains the #foo:bar: message, or the block or method node that contains the sequence node. You can choose, but only the writer of the comment can tell if you are right.
How tokens solves this problem? I think that we do not really care one objective is to be able to recreate the code from the tree. I worked on a project in C where the guys want to know if the code is well commented and we built heuristics to associate the comments to the closest node. > >> We do not have a structure that is responsible for that order. >> We could have two entries in AST node; >> one discrete where you can cherrypick what you want receiver arguments selector >> and one that represent the node placement. >> >> Now when we emit bytecode comment would do nothing. >> >> How having them as token helps? You will visit the receiver and check if it has a token and place it. >> I found that strange that comments are not part of the code tree as first class citizen. > Tokens are essentially what they are now — although they aren’t in the RBToken hierarchy. So far, the only advantages that I’ve heard you mention for nodes are visiting nodes and having a parent. The example above shows why neither of these are valid — you can’t determine what is the correct parent for the comment. > > Anyway, I think comments should not be in code, but rather on code. When you add a comment, you would select a node or nodes that it applied to. When viewing a method, the comment could be displayed inline (like we have now), off to the side like reviewer comments in word processors, or a highlight that when you move the mouse over would show in a popup. > Furthermore, there could be different comment types. For example, one could have design comments, todo/review comments, example code comments, bug fix comments, etc. The code view could be customized to show each one differently. Comments could have pictures and other non-string items. It would be really nice to have that. I would put optional type declaration (and they should be optional). Now for the main comment of a method I would not separate it because this is too important. > Having comments as objects outside the method’s source string would allow them to be easily searched. For example, someone could search for a certain bug fix. Currently, if a bug fix id is in a comment, it requires a full text search of all method text. This is slow. Therefore, many companies start adding these bug fix ids to the code as symbols so that a senders search will work. I’ve seen companies that put symbols like #Fix12345 as statements inside of methods. However, this doesn’t work for Smalltalk’s that remove unused literals, so in those cases they do something like adding a statement like #Fix12345 == #Fix12345 which generally isn’t optimized away. :) > In addition to comments, I believe that annotations (not primitives) should be on code instead of in code like they are now. With a good implementation one could unify the concepts of comments and annotations. Yes this is interesting. Now the problem with annotations is that you have two kinds; - the ones that decorate - the ones that change or define the semantics (for those ones I would not separate) them >>> Furthermore, if comments are nodes, how do they affect all of the code rewriting and validation that is part of the RB and its rewrite tools >> With comments cannot act as nullobjects for such operation? > Yes, someone could spend time making all of this work with the existing code. However, why not fix the real problem in that comments and methods should not be a single string, but rather objects. Marcus did that during his phd and he blew up memory. So after we got a guy that worked on tree compression but he never finished. Because this is the part that makes all these ideas flying or not. Anyway thanks for the discussion. It lets me think and I learn something. I improved the class comment. Because if it would have been good in the first place I would not even ask the question. > > > John Brant > |
> On Nov 3, 2016, at 4:03 PM, stepharo <[hidden email]> wrote: > >> But how does this fit with a tree structure. For example, should “comment5” be associated with the “b” variable node, or the #foo:bar: message, or the sequence node that contains the #foo:bar: message, or the block or method node that contains the sequence node. You can choose, but only the writer of the comment can tell if you are right. > How tokens solves this problem? > I think that we do not really care one objective is to be able to recreate the code from the tree. I worked on a project in C > where the guys want to know if the code is well commented and we built heuristics to associate the comments to the closest node. The nodes have comments and the comments have their source and position so you can recreate the code from the tree already. It may not be as easy as you want, but all the information is there. >>>> Furthermore, if comments are nodes, how do they affect all of the code rewriting and validation that is part of the RB and its rewrite tools >>> With comments cannot act as nullobjects for such operation? >> Yes, someone could spend time making all of this work with the existing code. However, why not fix the real problem in that comments and methods should not be a single string, but rather objects. > > Marcus did that during his phd and he blew up memory. So after we got a guy that worked on tree compression but he never finished. > Because this is the part that makes all these ideas flying or not. From what I remember, ASTs are roughly 10x the size of the code string. I’m not suggesting holding on to them in the image when they can be loaded/built from a file like the method sources currently are. Instead, we can associate a comment/annotation to a particular node in the AST without having to keep the AST around. For example, we could say that some comment is associated to the 4th node visited from the standard AST traversal. We don’t need to keep the AST around since we can recreate it and the 4th node visited will be the same. John Brant |
2016-11-04 1:54 GMT+01:00 John Brant <[hidden email]>:
But this vision is a bit static. Code is living. Shall these annotations be lost on next refactoring? Or are we going to diff the two AST and salvage as many annotations as we can? Or are we going to keep an history of these comments in the versioning system? |
+1 Norbert |
On 11/04/2016 03:50 AM, Norbert Hartl wrote:
> >> Am 04.11.2016 um 07:53 schrieb Nicolas Cellier >> <[hidden email] >> <mailto:[hidden email]>>: >> >> >> >> 2016-11-04 1:54 GMT+01:00 John Brant <[hidden email] >> <mailto:[hidden email]>>: >> >> >> > On Nov 3, 2016, at 4:03 PM, stepharo <[hidden email] <mailto:[hidden email]>> wrote: >> >>>> Furthermore, if comments are nodes, how do they affect all of the code rewriting and validation that is part of the RB and its rewrite tools >> >>> With comments cannot act as nullobjects for such operation? >> >> Yes, someone could spend time making all of this work with the existing code. However, why not fix the real problem in that comments and methods should not be a single string, but rather objects. >> > >> > Marcus did that during his phd and he blew up memory. So after we got a guy that worked on tree compression but he never finished. >> > Because this is the part that makes all these ideas flying or not. >> >> From what I remember, ASTs are roughly 10x the size of the code >> string. I’m not suggesting holding on to them in the image when >> they can be loaded/built from a file like the method sources >> currently are. Instead, we can associate a comment/annotation to a >> particular node in the AST without having to keep the AST around. >> For example, we could say that some comment is associated to the >> 4th node visited from the standard AST traversal. We don’t need to >> keep the AST around since we can recreate it and the 4th node >> visited will be the same. >> >> >> John Brant >> >> >> But this vision is a bit static. Code is living. Shall these >> annotations be lost on next refactoring? Or are we going to diff the >> two AST and salvage as many annotations as we can? Or are we going to >> keep an history of these comments in the versioning system? > > +1 It's only static if you do a poor implementation. The refactorings already operate on the AST level so we don't need any AST diffing for them. We would need to update the rewriter to update this information for nodes that get replaced. This updating would likely need the help of the annotation objects since some annotations might not want to be copied to the replaced node, and others (e.g., comments) would need to be copied. The editor will need to be changed to display these annotations/comments so we can also change it to keep track of the rearrangement of the nodes. That leaves custom tools that use the compile:* methods. These will need to be updated if they want to generate code with proper annotations/comments. Of course, all of these will need to be stored where they can be versioned. Currently, we store some of the method properties outside of a method body: !SomeClass methodsFor: 'some category' stamp: 'SomeAuthor 1/1/1111 11:11:11'! The additional properties could be store similarly. The ANSI standard already defines the SIF format which contains annotations that can be added to filed-out objects. John Brant |
Free forum by Nabble | Edit this page |