Problems using TreeModel>>do:


Peter Kenny-2
I have just had a very frustrating time after rearranging the structure of
some of my data. I have now worked out a way round it, so it's really just a
question of my understanding what is going on; I would be grateful if anyone
could explain.

I wanted to iterate over the leaf nodes of a TreeModel performing the same
operation on each. The TreeModel was introduced to provide a meaningful
structure to what had been just an OrderedCollection of the leaf nodes, so
my previous code used OrderedCollection>>do: for the iteration; the quick
and easy way to do it was to retain the same code operating on the
TreeModel, but include a test to do nothing if the node operated on was not
a leaf. The #do: method kept falling over, saying that an object I had
inserted in the tree was not found. Tracking it through in the debugger, the
problem seemed to be in consulting the objectNodeMap, which is an
IdentityDictionary, in TreeModel>>getNodeFor: anObject. It appeared that,
when looking for an object, the system generated a different hash from that
generated when inserting the object; in my case it generated an index of 2,
when the object was at index 3, and the actual entry at 2 was nil.

Playing with the objects in the debugger, I found that the TreeNodes
responded correctly to #children. I therefore tried replacing #do: with
something which explicitly iterated through the levels of the tree,
performing 'aTreeNode children do:' at each level; all the leaf nodes were
at the same level, so it was easy to determine at which level to break out
and carry out the necessary operation on the leaves. To my surprise, this
worked correctly, so I have now made the change permanent. But I still
cannot see why the easy method using TreeModel>>do: fails. I am surprised
that it is using #getNodeFor: to identify the nodes; the method I have used
could easily be adapted to provide a much clearer and shorter implementation
of #do:. However, the real mystery is about the failure of looking up in the
objectNodeMap; the construction and the reading of the tree are only a
second or two apart, so it is difficult to see how the system can get a
different hash, if that is what has happened.

Every time I think I am getting the hang of how Smalltalk works, something
comes along to disabuse me; life is very frustrating! Can anybody explain?

Peter


Re: Problems using TreeModel>>do:

Blair McGlashan-3
"Peter Kenny" <[hidden email]> wrote in message
news:[hidden email]...
> I have just had a very frustrating time after rearranging the structure of
> some of my data. I have now worked out a way round it, so it's really just a
> question of my understanding what is going on; I would be grateful if anyone
> could explain.
>
> I wanted to iterate over the leaf nodes of a TreeModel performing the same
> operation on each. ...
> ... The #do: method kept falling over, saying that an object I had
> inserted in the tree was not found. Tracking it through in the debugger, the
> problem seemed to be in consulting the objectNodeMap, which is an
> IdentityDictionary, in TreeModel>>getNodeFor: anObject. It appeared that,
> when looking for an object, the system generated a different hash from that
> generated when inserting the object; ...

An important quality of objects inserted in a hashed collection is that they
must have temporally invariant implementations of #hash. You don't say what
you have inserted in your tree, but if it is instances of your own classes
then make sure that the hash remains constant. If it isn't instances of your
own class, or your implementation of #hash relies on combining the hash
values of contained objects, then it might be because you are directly or
indirectly using the hash of a sequenceable collection which has changing
membership. In general it is best to assume that the hash of a Collection
will change when its membership is changed.
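
To illustrate the invariance requirement, here is a minimal sketch in Python
rather than Smalltalk, offered purely as an analogue. The Basket class and its
rule of hashing on the size of a mutable list are hypothetical, chosen only to
make the failure easy to see; nothing here is taken from Dolphin.

```python
class Basket:
    """Hypothetical class whose hash is derived from its mutable contents."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        return isinstance(other, Basket) and self.items == other.items

    def __hash__(self):
        # Assumption for illustration: the hash depends on the current size,
        # so it changes whenever the membership changes.
        return hash(len(self.items))


basket = Basket(["apple"])
seen = {basket}                 # filed under the hash for one item

found_before = basket in seen   # looked up with the same hash it was stored with

basket.items.append("pear")     # membership changes, so the hash changes
found_after = basket in seen    # looked up with a different hash

print(found_before, found_after)
```

The object is still physically in the set after the mutation; it is simply
filed under the hash it had when it was added, so a lookup with the new hash
probes the wrong place and reports it missing.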

>
> Playing with the objects in the debugger, I found that the TreeNodes
> responded correctly to #children. I therefore tried replacing #do: with
> something which explicitly iterated through the levels of the tree,
> performing 'aTreeNode children do:' at each level; ... I am surprised
> that [the system implementation of #do: ends up] using #getNodeFor: to
> identify the nodes; the method I have used could easily be adapted to
> provide a much clearer and shorter implementation of #do:. ...

It's because the implementation is in the abstract class, TreeModelAbstract,
which does not hold nodes. It has to implement in terms of the value objects
in the tree rather than nodes, because that is the only common protocol
available at that level. Its subclass TreeModel does hold nodes, and thus
probably has a way to implement the enumeration more efficiently as you
suggest, but I suspect this has not been done because it has not arisen as a
performance or other issue over the years. Anytime one adds an optimisation
that exploits implementation detail in a class hierarchy it makes
subclassing potentially more difficult - the Collection classes are
testament to that - so it makes sense to wait until there is a good reason.
In this case I suspect, for I have not tried it, that implementing #do: as
you suggest (a perfectly good suggestion though) would break
ExpandingTreeModel, and would therefore require further changes to that
class. This is because ExpandingTreeModel only holds the nodes of subtrees
which have been visited, so an enumeration based on TreeNode>>children would
stop as soon as one reached that extent, rather than enumerating the entire
tree. At a glance it looks as if an implementation in terms of
TreeModel>>childrenOfNode: would work, since this is overridden in
ExpandingTreeModel to expand unexpanded nodes.
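
The difference between the two enumeration strategies can be sketched as
follows. This is a Python analogue, not Dolphin code: LazyTree, its
stored_children and children_of methods, and the sample data are all
hypothetical stand-ins for a model that only materialises the subtrees that
have been visited.

```python
class LazyTree:
    """Hypothetical model that materialises child lists on demand."""

    def __init__(self, root, child_source):
        self.root = root
        self.child_source = child_source  # function: value -> child values
        self.expanded = {}                # value -> children materialised so far

    def stored_children(self, value):
        """Answer only what has already been visited (like reading a node)."""
        return self.expanded.get(value, [])

    def children_of(self, value):
        """Expand an unvisited value first, then answer its children."""
        if value not in self.expanded:
            self.expanded[value] = list(self.child_source(value))
        return self.expanded[value]


def enumerate_tree(start, children):
    """Pre-order walk, parameterised on how children are obtained."""
    result, stack = [], [start]
    while stack:
        value = stack.pop()
        result.append(value)
        stack.extend(reversed(children(value)))
    return result


source = {"root": ["a", "b"], "a": ["a1"], "b": [], "a1": []}
tree = LazyTree("root", lambda value: source[value])

# Walking the stored children stops at the extent visited so far.
partial = enumerate_tree(tree.root, tree.stored_children)

# Walking through the expanding accessor reaches the whole tree.
full = enumerate_tree(tree.root, tree.children_of)

print(partial)
print(full)
```

The walk itself is identical in both cases; only the accessor differs, which
is why routing the enumeration through the model's overridable child accessor
keeps a lazily expanding subclass working.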

Anyway that's probably a lot more detail than you wanted. The important
thing is to make sure your hash is temporally invariant, or that you are
storing objects with temporally invariant hash values.

The other possibility that springs to mind is a multi-threading issue, i.e.
that one thread is updating the model at the same time as another thread is
enumerating it. I only mention this for completeness. It should not be
happening unless you have needed to introduce multiple threads.

Regards

Blair


Re: Problems using TreeModel>>do:

Peter Kenny-2
"Blair McGlashan" <[hidden email]> wrote in message
news:[hidden email]...
>
> An important quality of objects inserted in hashed collection is that they
> must have temporally invariant implementations of #hash. You don't say what
> you have inserted in your tree, but if it is instances of your own classes
> then make sure that the hash remains constant.

I /am/ inserting instances of my own classes in the tree. I have not defined
a hash value for my objects, so it should be whatever the system assigns by
default. The objects do have instance variables of their own which are in
some cases collections, but I hoped that changes there would not affect the
hash. I did try to read this up in the comments, and I found something (in
Object I think) saying that the hash of an arbitrary object is a
pseudo-random number assigned when the object is created and temporally
invariant. I hoped this would mean I could use my objects in this way
safely, but evidently not. I shall have to stick to my own enumeration.
>
> Anyway that's probably a lot more detail than you wanted. The important
> thing is to make sure your hash is temporally invariant, or that you are
> storing objects with temporally invariant hash values.
>
>
It was a lot of detail, and showed me that as usual I had only a partial
picture. But anyway it all contributes to my education, so thanks.

Peter


Re: Problems using TreeModel>>do:

Blair McGlashan-3
"Peter Kenny" <[hidden email]> wrote in message
news:[hidden email]...
>
> "Blair McGlashan" <[hidden email]> wrote in message
> news:[hidden email]...
> >
> > An important quality of objects inserted in hashed collection is that they
> > must have temporally invariant implementations of #hash. You don't say what
> > you have inserted in your tree, but if it is instances of your own classes
> > then make sure that the hash remains constant.
>
> I /am/ inserting instances of my own classes in the tree. I have not defined
> a hash value for my objects, so it should be whatever the system assigns by
> default. ...

Well that would depend on what you subclassed. If Object, then it would be
the identity hash.

> ... The objects do have instance variables of their own which are in
> some cases collections, but I hoped that changes there would not affect the
> hash. ...

Well it won't unless you are inheriting a #hash implementation that does do
that for some reason. This seems unlikely from what you say.

>... I did try to read this up in the comments, and I found something (in
> Object I think) saying that the hash of an arbitrary object is a
> pseudo-random number assigned when the object is created and temporally
> invariant. I hoped this would mean I could use my objects in this way
> safely, but evidently not. ...

I don't think you should draw that conclusion. The system's identity hash is
without doubt temporally invariant. It is saved with the image and preserved
throughout an object's lifetime. Some things to be aware of are:
1) When you reconstitute an object through STB, it is effectively a deep
copy, and will not have the same identity hash value since it has a separate
identity.
2) If you are using #become: then it is the object's value that is swapped,
not its identity. Hence the identity hash of an object referenced before a
#become: will be the same after the #become: even though a different object
value is referenced. However since custom #hash implementations are
typically (and rightly) based on object contents, #become: will change the
apparent hash of a referenced object. In short #become: will not invalidate
IdentitySets but it might invalidate Sets. #oneWayBecome:, on the other
hand, effectively discards one object's identity, and so it may invalidate
any hashed collection referencing the object which has been replaced.

If your objects are using the identity hash, as apparently they are, and the
above do not apply then something else must be going wrong, since the
identity hash will not change.
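
A small Python analogue of these two points follows; it is a sketch, not
Dolphin code. The Series class is hypothetical, and Python's default
identity-based hash and copy.deepcopy stand in for the identity hash and for
an STB-style reconstitution respectively.

```python
import copy


class Series:
    """Hypothetical model class that defines neither __eq__ nor __hash__."""

    def __init__(self):
        self.points = []


s = Series()
index = {s: "node for s"}        # keyed on the default, identity-based hash

hash_before = hash(s)
s.points.extend([1, 2, 3])       # change a contained collection
hash_after = hash(s)
still_found = s in index         # identity has not changed, so lookup works

twin = copy.deepcopy(s)          # same contents, separate identity
twin_found = twin in index       # not the object that was inserted

print(hash_before == hash_after, still_found, twin_found)
```

Mutating the contents leaves the default hash alone, whereas a deep copy is a
different object as far as any identity-based collection is concerned, however
similar its contents.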

Regards

Blair


Re: Problems using TreeModel>>do:

Chris Uppal-3
In reply to this post by Peter Kenny-2
Peter Kenny wrote:

>The #do: method kept falling over, saying that an object I had
> inserted in the tree was not found. Tracking it through in the debugger,
> the problem seemed to be in consulting the objectNodeMap, which is an
> IdentityDictionary, in TreeModel>>getNodeFor: anObject. It appeared that,
> when looking for an object, the system generated a different hash from
> that generated when inserting the object; in my case it generated an
> index of 2, when the object was at index 3, and the actual entry at 2 was
> nil.

This doesn't sound possible to me, if the situation is as you describe.

Just to check, is the 'hash' you mention the #hash method inherited from
Object, or is it the #identityHash method (also inherited from Object)? The
two should have the same result if you (as you said later) haven't overridden
(or inherited an overridden version of) #hash, but the #hash method should not
be called at all in this context, and the way you phrase this makes it sound as
if it is being called.

Alternatively, if #identityHash is being called, and it is changing between
when the object is inserted and your later #do: loop, then I'd say you have a
MUCH more serious problem than just finding a replacement for #do:

    -- chris


Re: Problems using TreeModel>>do:

Blair McGlashan-3
In reply to this post by Blair McGlashan-3
"Blair McGlashan" <[hidden email]> wrote in message
news:[hidden email]...

>
> "Peter Kenny" <[hidden email]> wrote in message
> news:[hidden email]...
> >...
> > I /am/ inserting instances of my own classes in the tree. I have not defined
> > a hash value for my objects, so it should be whatever the system assigns by
> > default. ...
>
> Well that would depend on what you subclassed. If Object, then it would be
> the identity hash.
> ...

... and furthermore if you are using the default search policy for a
TreeModel, which is identity, then the identity hash will be used
regardless. As I see from your original posting, you have an
IdentityDictionary in your tree model holding the object/node map, so this
must be the case. Sorry for not reading it more carefully in the first
place.

As Chris says, if the identity hashes are apparently changing, then there is
some other problem, or a misunderstanding of what you are seeing in the
debugger.

Regards

Blair


Re: Problems using TreeModel>>do:

Peter Kenny-2
In reply to this post by Chris Uppal-3
"Chris Uppal" <[hidden email]> wrote in message
news:[hidden email]...
>
> Just to check, is the 'hash' you mention the #hash method inherited from
> Object, or is it the #identityHash method (also inherited from Object)? The
> two should have the same result if you (as you said later) haven't overridden
> (or inherited an overridden version of) #hash, but the #hash method should not
> be called at all in this context, and the way you phrase this makes it sound as
> if it is being called.
>
> Alternatively, if #identityHash is being called, and it is changing between
> when the object is inserted and your later #do: loop, then I'd say you have a
> MUCH more serious problem than just finding a replacement for #do:
>

Chris, Blair
Thanks for your comments. I think I have now cracked it. As usual, it is my
own stupidity - I dug a hole several weeks ago, while struggling to get my
graphing program to work, and have now fallen into it. To answer your query,
I am using identity hash - all my objects are subclassed from Model, and I
used the inherited methods button in the Class Browser to verify that they
inherit #hash from Object. But in this context #hash and #identityHash are
the same - both are primitive 75.

My problem arose because, some weeks ago, I found that the graphing program,
which was meant just to take the data for the time series being plotted and
turn it into pixel coordinates, was somehow altering the original objects in
the process. Rather than doing it properly and finding out where the changes
were taking place, I took a deep copy of the whole data structure before
starting and operated on that. Move on a few weeks, I introduce the
TreeModel as part of the data structure, and now I am trying to iterate over
a deep copy of the tree I have just constructed. This presumably contains
deep copies of the model objects I inserted, which are created at different
times and have different hashes from the originals, so the LookupTable
search fails. But the copy has the same parent-child relationships as the
original, so my search works.
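
The situation can be reproduced in miniature. The sketch below is a Python
analogue, not the Dolphin classes: TinyTreeModel, Node and Item are
hypothetical, and a dictionary keyed on id() stands in for the identity-keyed
object/node map. It assumes the deep copy carries the map across with its old
identity keys intact, which matches the symptom described but is not confirmed
anywhere in this thread.

```python
import copy


class Item:
    """Stand-in for a model object that relies on default identity hashing."""

    def __init__(self, name):
        self.name = name


class Node:
    def __init__(self, value):
        self.value = value
        self.children = []


class TinyTreeModel:
    """Hypothetical miniature tree model with an identity-keyed node map."""

    def __init__(self):
        self.roots = []
        self.node_by_identity = {}    # id(value) -> Node

    def add(self, value, parent=None):
        node = Node(value)
        self.node_by_identity[id(value)] = node
        if parent is None:
            self.roots.append(node)
        else:
            self.node_for(parent).children.append(node)
        return value

    def node_for(self, value):
        return self.node_by_identity[id(value)]   # KeyError if unknown

    def leaves(self):
        """Walk the nodes directly; no identity lookup is involved."""
        stack = list(reversed(self.roots))
        while stack:
            node = stack.pop()
            if node.children:
                stack.extend(reversed(node.children))
            else:
                yield node.value


model = TinyTreeModel()
root = model.add(Item("series"))
jan = model.add(Item("jan"), parent=root)
feb = model.add(Item("feb"), parent=root)

clone = copy.deepcopy(model)          # copied values get new identities
cloned_leaves = list(clone.leaves())  # parent-child links survive the copy
names = [leaf.name for leaf in cloned_leaves]

try:
    clone.node_for(cloned_leaves[0])  # map is still keyed on the old identities
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(names, lookup_failed)
```

In the copy, walking the children still reaches every leaf, while any lookup
that goes through the identity map fails because the map was filed under the
identities of the originals.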

I know it all sounds too daft to be true, and of course I should have seen
the deep copy operation before I fired off the query, but I find it easy to
make changes like this and then forget exactly why they are there.
Interactive development is great, but it does (when used by me, at least)
lead to poorly documented (or undocumented) code. For myself I can put this
down as part of the learning process, but I'm sorry to have wasted your time
on it.

I know I am having a series of problems like this. I do make a serious
effort to solve them myself before firing off a query, but I feel sometimes
I am abusing the newsgroup system by getting my education this way. Am I
being unreasonable in my use of your kindness? (Beware - another problem
coming up!)

Peter


Re: Problems using TreeModel>>do:

Chris Uppal-3
Peter Kenny wrote:

> I know it all sounds too daft to be true, and of course I should have seen
> the deep copy operation before I fired off the query, but I find it easy
> to make changes like this and then forget exactly why they are there.
> Interactive development is great, but it does (when used by me, at least)
> lead to poorly documented (or undocumented) code. For myself I can put
> this down as part of the learning process

A few people around here prefer a style of working where you re-build your
image from scratch every day.

The advantage is that you automatically maintain the discipline of keeping
everything clearly packaged (and ensuring that any non-code changes to the
image state, globals, etc, are defined in packages too -- in post load scripts,
and so on).  Also you always are developing from a fairly clean position, so
you can be reasonably sure that old bugs, etc, won't be affecting you.

Personally, I don't work that way.  And it's not something I'd recommend to
beginners because (if overdone) it could distract your attention from the image
as a repository of /state/ - which is, I think, central to Smalltalk -- but it
may suit you.

(To read earlier discussions on this topic, search the archives for "bonkers"
;-)

BTW, although I work in a "dirty" image normally, I /never/ deploy from
anything but a clean image -- deploying from a dirty image, I find, causes
/far/ more problems than is justified by the very few seconds it saves.


> I know I am having a series of problems like this. I do make a serious
> effort to solve them myself before firing off a query, but I feel
> sometimes I am abusing the newsgroup system by getting my education this
> way. Am I being unreasonable in my use of your kindness? (Beware -
> another problem coming up!)

/I/ don't think you are being unreasonable at all, obviously I don't speak for
anyone else.  In any case, nobody is /forced/ to answer questions (or even read
them -- that's what killfiles are for ;-) so I wouldn't worry about it if I
were you...

    -- chris