I have just had a very frustrating time after rearranging the structure of some of my data. I have now worked out a way round it, so it's really just a question of my understanding what is going on; I would be grateful if anyone could explain.

I wanted to iterate over the leaf nodes of a TreeModel performing the same operation on each. The TreeModel was introduced to provide a meaningful structure to what had been just an OrderedCollection of the leaf nodes, so my previous code used OrderedCollection>>do: for the iteration; the quick and easy way to do it was to retain the same code operating on the TreeModel, but include a test to do nothing if the node operated on was not a leaf.

The #do: method kept falling over, saying that an object I had inserted in the tree was not found. Tracking it through in the debugger, the problem seemed to be in consulting the objectNodeMap, which is an IdentityDictionary, in TreeModel>>getNodeFor: anObject. It appeared that, when looking for an object, the system generated a different hash from that generated when inserting the object; in my case it generated an index of 2, when the object was at index 3, and the actual entry at 2 was nil.

Playing with the objects in the debugger, I found that the TreeNodes responded correctly to #children. I therefore tried replacing #do: with something which explicitly iterated through the levels of the tree, performing 'aTreeNode children do:' at each level; all the leaf nodes were at the same level, so it was easy to determine at which level to break out and carry out the necessary operation on the leaves.

To my surprise, this worked correctly, so I have now made the change permanent. But I still cannot see why the easy method using TreeModel>>do: fails. I am surprised that it is using #getNodeFor: to identify the nodes; the method I have used could easily be adapted to provide a much clearer and shorter implementation of #do:.

However, the real mystery is about the failure of looking up in the objectNodeMap; the construction and the reading of the tree are only a second or two apart, so it is difficult to see how the system can get a different hash, if that is what has happened.

Every time I think I am getting the hang of how Smalltalk works, something comes along to disabuse me; life is very frustrating! Can anybody explain?

Peter
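
The explicit walk over children described above might look roughly like the following workspace sketch. It is only an illustration: nested Arrays are a hypothetical stand-in for the tree (an Array plays the part of a branch, anything else is a leaf), whereas the real code would send #children to each TreeNode. An explicit worklist is used instead of recursion so the sketch does not depend on how blocks are implemented.

```smalltalk
"Hypothetical stand-in tree: an Array is a branch, any other object is a leaf."
| tree pending leaves node |
tree := Array
    with: (Array with: 'jan' with: 'feb')
    with: (Array with: 'mar').

pending := OrderedCollection with: tree.
leaves := OrderedCollection new.
[pending isEmpty] whileFalse: [
    node := pending removeFirst.
    (node isKindOf: Array)
        ifTrue: [
            "Push the children on the front, last child first,
             so they come off again in their original order."
            node reverseDo: [:child | pending addFirst: child]]
        ifFalse: [leaves add: node]].

leaves    "holds 'jan', 'feb', 'mar' in that order"
```

Nothing here consults a hashed lookup table, which is consistent with this kind of walk succeeding where a lookup keyed on the inserted objects did not.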
"Peter Kenny" <[hidden email]> wrote in message
news:[hidden email]...

> I have just had a very frustrating time after rearranging the structure of
> some of my data. I have now worked out a way round it, so it's really just a
> question of my understanding what is going on; I would be grateful if anyone
> could explain.
>
> I wanted to iterate over the leaf nodes of a TreeModel performing the same
> operation on each. ...
> ... The #do: method kept falling over, saying that an object I had
> inserted in the tree was not found. Tracking it through in the debugger, the
> problem seemed to be in consulting the objectNodeMap, which is an
> IdentityDictionary, in TreeModel>>getNodeFor: anObject. It appeared that,
> when looking for an object, the system generated a different hash from that
> generated when inserting the object; ...

An important quality of objects inserted in a hashed collection is that they must have temporally invariant implementations of #hash. You don't say what you have inserted in your tree, but if it is instances of your own classes then make sure that the hash remains constant. If it isn't instances of your own class, or your implementation of #hash relies on combining the hash values of contained objects, then it might be because you are directly or indirectly using the hash of a sequenceable collection which has changing membership. In general it is best to assume that the hash of a Collection will change when its membership is changed.

> Playing with the objects in the debugger, I found that the TreeNodes
> responded correctly to #children. I therefore tried replacing #do: with
> something which explicitly iterated through the levels of the tree,
> performing 'aTreeNode children do:' at each level; ... I am surprised
> that [the system implementation of #do: ends up] using #getNodeFor: to
> identify the nodes; the method I have used could easily be adapted to
> provide a much clearer and shorter implementation of #do:. ....

It's because the implementation is in the abstract class, TreeModelAbstract, which does not hold nodes. It has to implement in terms of the value objects in the tree rather than nodes, because that is the only common protocol available at that level. Its subclass TreeModel does hold nodes, and thus probably has a way to implement the enumeration more efficiently as you suggest, but I suspect this has not been done because it has not arisen as a performance or other issue over the years. Anytime one adds an optimisation that exploits implementation detail in a class hierarchy it makes subclassing potentially more difficult - the Collection classes are testament to that - so it makes sense to wait until there is a good reason.

In this case I suspect, for I have not tried it, that implementing #do: as you suggest (a perfectly good suggestion though) would break ExpandingTreeModel, and would therefore require further changes to that class. This is because ExpandingTreeModel only holds the nodes of subtrees which have been visited, so an enumeration based on TreeNode>>children would stop as soon as one reached that extent, rather than enumerating the entire tree. At a glance it looks as if an implementation in terms of TreeModel>>childrenOfNode: would work, since this is overridden in ExpandingTreeModel to expand unexpanded nodes.

Anyway that's probably a lot more detail than you wanted. The important thing is to make sure your hash is temporally invariant, or that you are storing objects with temporally invariant hash values.

The other possibility that springs to mind is a multi-threading issue, i.e. that one thread is updating the model at the same time as another thread is enumerating it. I only mention this for completeness. It should not be happening unless you have needed to introduce multiple threads.

Regards

Blair
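
To make the point about temporally invariant hashes concrete, here is a small workspace sketch. It assumes, as the advice above suggests one should, that the #hash of an OrderedCollection depends on its contents; the variable names are invented for the example.

```smalltalk
"A key whose content-based #hash can change after it has been inserted."
| key set |
key := OrderedCollection new.
key add: 1.

set := Set new.
set add: key.
set includes: key.    "true: found in the slot chosen from its hash at insertion"

key add: 2.           "membership changes, so a content-based hash may change too"
set includes: key.    "may now answer false, although the object is still in the set"
```

Whether the second lookup actually fails depends on the hash implementation and on where the new hash happens to land, which is exactly why such failures can look intermittent.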
"Blair McGlashan" <[hidden email]> wrote in message
news:[hidden email]...

> An important quality of objects inserted in a hashed collection is that they
> must have temporally invariant implementations of #hash. You don't say what
> you have inserted in your tree, but if it is instances of your own classes
> then make sure that the hash remains constant.

I /am/ inserting instances of my own classes in the tree. I have not defined a hash value for my objects, so it should be whatever the system assigns by default. The objects do have instance variables of their own which are in some cases collections, but I hoped that changes there would not affect the hash. I did try to read this up in the comments, and I found something (in Object I think) saying that the hash of an arbitrary object is a pseudo-random number assigned when the object is created and temporally invariant. I hoped this would mean I could use my objects in this way safely, but evidently not. I shall have to stick to my own enumeration.

> Anyway that's probably a lot more detail than you wanted. The important
> thing is to make sure your hash is temporally invariant, or that you are
> storing objects with temporally invariant hash values.

It was a lot of detail, and showed me that as usual I had only a partial picture. But anyway it all contributes to my education, so thanks.

Peter
"Peter Kenny" <[hidden email]> wrote in message
news:[hidden email]...

> "Blair McGlashan" <[hidden email]> wrote in message
> news:[hidden email]...
> >
> > An important quality of objects inserted in a hashed collection is that
> > they must have temporally invariant implementations of #hash. You don't
> > say what you have inserted in your tree, but if it is instances of your
> > own classes then make sure that the hash remains constant.
>
> I /am/ inserting instances of my own classes in the tree. I have not defined
> a hash value for my objects, so it should be whatever the system assigns by
> default. ...

Well that would depend on what you subclassed. If Object, then it would be the identity hash.

> ....The objects do have instance variables of their own which are in
> some cases collections, but I hoped that changes there would not affect the
> hash....

Well it won't unless you are inheriting a #hash implementation that does do that for some reason. This seems unlikely from what you say.

> ... I did try to read this up in the comments, and I found something (in
> Object I think) saying that the hash of an arbitrary object is a
> pseudo-random number assigned when the object is created and temporally
> invariant. I hoped this would mean I could use my objects in this way
> safely, but evidently not. ...

I don't think you should draw that conclusion. The system's identity hash is without doubt temporally invariant. It is saved with the image and preserved throughout an object's lifetime. Some things to be aware of are:

1) When you reconstitute an object through STB, it is effectively a deep copy, and will not have the same identity hash value since it has a separate identity.

2) If you are using #become: then it is the object's value that is swapped, not its identity. Hence the identity hash of an object referenced before a #become: will be the same after the #become: even though a different object value is referenced. However, since custom #hash implementations are typically (and rightly) based on object contents, #become: will change the apparent hash of a referenced object. In short, #become: will not invalidate IdentitySets but it might invalidate Sets. #oneWayBecome:, on the other hand, effectively discards one object's identity, and so it may invalidate any hashed collection referencing the object which has been replaced.

If your objects are using the identity hash, as apparently they are, and the above do not apply then something else must be going wrong, since the identity hash will not change.

Regards

Blair
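
The invariance of the identity hash can be checked in a workspace along these lines. This is a minimal sketch using only Object>>identityHash and IdentitySet; the OrderedCollection is just a convenient mutable object, not anything from the original code.

```smalltalk
"The identity hash does not depend on an object's contents."
| obj before identitySet |
obj := OrderedCollection new.
before := obj identityHash.

identitySet := IdentitySet new.
identitySet add: obj.

obj add: 1; add: 2.             "mutate the object after inserting it"

obj identityHash = before.      "true: contents do not affect the identity hash"
identitySet includes: obj.      "true: the identity-based lookup still succeeds"
```

So mutating an object after insertion is harmless for identity-based collections; only a change of identity, such as looking up a copy instead of the original, makes the lookup miss.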
In reply to this post by Peter Kenny-2
Peter Kenny wrote:
> The #do: method kept falling over, saying that an object I had
> inserted in the tree was not found. Tracking it through in the debugger,
> the problem seemed to be in consulting the objectNodeMap, which is an
> IdentityDictionary, in TreeModel>>getNodeFor: anObject. It appeared that,
> when looking for an object, the system generated a different hash from
> that generated when inserting the object; in my case it generated an
> index of 2, when the object was at index 3, and the actual entry at 2 was
> nil.

This doesn't sound possible to me, if the situation is as you describe.

Just to check, is the 'hash' you mention the #hash method inherited from Object, or is it the #identityHash method (also inherited from Object)? The two should have the same result if you (as you said later) haven't overridden (or inherited an overridden version of) #hash, but the #hash method should not be called at all in this context, and the way you phrase this makes it sound as if it is being called.

Alternatively, if #identityHash is being called, and it is changing between when the object is inserted and your later #do: loop, then I'd say you have a MUCH more serious problem than just finding a replacement for #do:

 -- chris
In reply to this post by Blair McGlashan-3
"Blair McGlashan" <[hidden email]> wrote in message
news:[hidden email]...

> "Peter Kenny" <[hidden email]> wrote in message
> news:[hidden email]...
> >...
> > I /am/ inserting instances of my own classes in the tree. I have not
> > defined a hash value for my objects, so it should be whatever the system
> > assigns by default. ...
>
> Well that would depend on what you subclassed. If Object, then it would be
> the identity hash.
> ...

... and furthermore if you are using the default search policy for a TreeModel, which is identity, then the identity hash will be used regardless. As I see from your original posting, you have an IdentityDictionary in your tree model holding the object/node map, so this must be the case. Sorry for not reading it more carefully in the first place.

As Chris says, if the identity hashes are apparently changing, then there is some other problem, or a misunderstanding of what you are seeing in the debugger.

Regards

Blair
In reply to this post by Chris Uppal-3
"Chris Uppal" <[hidden email]> wrote in message
news:[hidden email]...

> Just to check, is the 'hash' you mention the #hash method inherited from
> Object, or is it the #identityHash method (also inherited from Object)? The
> two should have the same result if you (as you said later) haven't
> overridden (or inherited an overridden version of) #hash, but the #hash
> method should not be called at all in this context, and the way you phrase
> this makes it sound as if it is being called.
>
> Alternatively, if #identityHash is being called, and it is changing between
> when the object is inserted and your later #do: loop, then I'd say you have
> a MUCH more serious problem than just finding a replacement for #do:

Chris, Blair

Thanks for your comments. I think I have now cracked it. As usual, it is my own stupidity - I dug a hole several weeks ago, while struggling to get my graphing program to work, and have now fallen into it.

To answer your query, I am using identity hash - all my objects are subclassed from Model, and I used the inherited methods button in the Class Browser to verify that they inherit #hash from Object. But in this context #hash and #identityHash are the same - both are primitive 75.

My problem arose because, some weeks ago, I found that the graphing program, which was meant just to take the data for the time series being plotted and turn it into pixel coordinates, was somehow altering the original objects in the process. Rather than doing it properly and finding out where the changes were taking place, I took a deep copy of the whole data structure before starting and operated on that. Move on a few weeks, I introduce the TreeModel as part of the data structure, and now I am trying to iterate over a deep copy of the tree I have just constructed. This presumably contains deep copies of the model objects I inserted, which are created at different times and have different hashes from the originals, so the LookupTable search fails. But the copy has the same parent-child relationships as the original, so my search works.

I know it all sounds too daft to be true, and of course I should have seen the deep copy operation before I fired off the query, but I find it easy to make changes like this and then forget exactly why they are there. Interactive development is great, but it does (when used by me, at least) lead to poorly documented (or undocumented) code. For myself I can put this down as part of the learning process, but I'm sorry to have wasted your time on it.

I know I am having a series of problems like this. I do make a serious effort to solve them myself before firing off a query, but I feel sometimes I am abusing the newsgroup system by getting my education this way. Am I being unreasonable in my use of your kindness? (Beware - another problem coming up!)

Peter
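
The effect of the deep copy can be reproduced in a workspace with a few lines. This is only a sketch of the mechanism, not the actual graphing code: the Array of strings is a hypothetical stand-in for a model object, and a plain IdentityDictionary plays the part of the objectNodeMap.

```smalltalk
"Why an identity-keyed lookup misses a deep copy of the inserted object."
| original map duplicate |
original := Array with: 'jan' with: 'feb'.    "stand-in for a model object"
map := IdentityDictionary new.
map at: original put: #nodeForOriginal.

duplicate := original deepCopy.

duplicate = original.            "true: the contents are equal"
duplicate == original.           "false: the copy has a separate identity"
map includesKey: original.       "true"
map includesKey: duplicate.      "false: the lookup is by identity, so the copy is not found"
```

The copy is structurally the same as the original, which is why walking its parent-child links works, but no identity-keyed table built from the originals will find any of the copied objects.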
Peter Kenny wrote:
> I know it all sounds too daft to be true, and of course I should have seen
> the deep copy operation before I fired off the query, but I find it easy
> to make changes like this and then forget exactly why they are there.
> Interactive development is great, but it does (when used by me, at least)
> lead to poorly documented (or undocumented) code. For myself I can put
> this down as part of the learning process

A few people around here prefer a style of working where you re-build your image from scratch every day. The advantage is that you automatically maintain the discipline of keeping everything clearly packaged (and ensuring that any non-code changes to the image state, globals, etc, are defined in packages too -- in post load scripts, and so on). Also you always are developing from a fairly clean position, so you can be reasonably sure that old bugs, etc, won't be affecting you.

Personally, I don't work that way. And it's not something I'd recommend to beginners because (if overdone) it could distract your attention from the image as a repository of /state/ - which is, I think, central to Smalltalk -- but it may suit you. (To read earlier discussions on this topic, search the archives for "bonkers" ;-)

BTW, although I work in a "dirty" image normally, I /never/ deploy from anything but a clean image -- deploying from a dirty image, I find, causes /far/ more problems than is justified by the very few seconds it saves.

> I know I am having a series of problems like this. I do make a serious
> effort to solve them myself before firing off a query, but I feel
> sometimes I am abusing the newsgroup system by getting my education this
> way. Am I being unreasonable in my use of your kindness? (Beware -
> another problem coming up!)

/I/ don't think you are being unreasonable at all, obviously I don't speak for anyone else. In any case, nobody is /forced/ to answer questions (or even read them -- that's what killfiles are for ;-) so I wouldn't worry about it if I were you...

 -- chris