Hi,
this is probably one of the most academic questions I posted on this list so far, but I guess there's a first time for everything :) I've been fiddling around with Ternary search trees, imported from a Squeak package by Avi Bryant. There's a class TSTree and a class TSTree node. To the outside world only an instance of TSTree is visible and all client requests are passed to that instance. However all operations like #at:put: #at: and so on are implemented in the nodes themselves. My question is where should this code reside from your point of view, in the nodes themselves or rather in the tree? Günther |
Günther,
> There's a class TSTree and a class TSTree node. > To the outside world only an instance of TSTree is visible and all > client requests are passed to that instance. > > However all operations like #at:put: #at: and so on are implemented in > the nodes themselves. > > My question is where should this code reside from your point of view, in > the nodes themselves or rather in the tree? Note sure what you are asking here. If you are asking whether I, as a /user/ of the tree, should be expected to manipulate the nodes directly, then I think the answer is no -- the TSTree is primarily a Collection object, and it's ternary tree /implementation/ is correctly hidden from me. (Although a different kind of object which did present itself as a tree, would make sense too, but I'm not convinced that it would fit into the Collections framework). If you are asking whether, in the implementation, TSTree>>at: (say) should delegate to the nodes themselves or manipulate the nodes directly, then it becomes a matter of design aesthetics, and design pragmatics. Which is easier to understand, and which is easer to code (correctly). The general approach in OO is to push responsibility onto the object which is best placed to provide that function. So, since (part of) a node's job is to know where it is in a tree (what its children are and so on), it makes sense to push functions that relate to location in the tree off onto the nodes too. As a rule of thumb, code which constant asks other objects for information, looks at the information and then tells the other objects to do something based on that, is not making best use of the OO paradigm. E.g, a hypothetical method in some sort of Tree inAndBelow: aNode do: a1Block a1Block value: aNode. children := aNode chldren. children do: [:each | self inAndBelow: each do: a1Block]. is not letting the nodes get on with what they are best at -- knowing what their children are -- but is second-guessing them by asking for a list of their children (which might be expensive if the node has to build a temporary collection) and then iterating over that collection. It may not seem to matter much here (and indeed it doesn't in this particular case), but that general idea -- giving responsibility to the object best placed to implement it -- is pretty much at the heart of OO. That's what people mean when they advise you: "Tell don't ask". As an example where this matters more, imagine an optimised implementation that used several different kinds of node. The nodes store their children in different ways, but are otherwise identical. The idea is that we use whichever kind of Node is most efficient in space/time for a given number of children. Not all kinds of Node can efficiently provide an Array of its children, but they all can iterate over their children efficiently. So the tree object /has/ to delegate to the nodes, or a large chunk of the optimisation is wasted. -- chris |
Chris,
thanks for answering. I was talking, as you presumed correctly about the distribution of responsibility between the TSTree collection object and and its elements, the nodes. I agree with what you say, the reason why I asked is this: I just started reading "Algorithms in C" by Sedgewick, and in quite a few examples he has implemented the traversing using loops and stacks, in quite an elegant way, instead of recursion. Now surely that reflects the non-OO nature of C and is thus not really 101 applicable in Smalltalk, but, and this is quite important, allows for having to pass less parameters when traversing. This way it's rather easy to determine "how deep" something was found in a tree, but you'd have to pass additional parameters in the OO way. At this point, as I certainly don't want to go C, I'm not even sure whether or not to continue reading the "Algorithms in C" book, and rather switch to the Java, or C++ edition. Günther Chris Uppal schrieb: > Günther, > >> There's a class TSTree and a class TSTree node. >> To the outside world only an instance of TSTree is visible and all >> client requests are passed to that instance. >> >> However all operations like #at:put: #at: and so on are implemented in >> the nodes themselves. >> >> My question is where should this code reside from your point of view, in >> the nodes themselves or rather in the tree? > > Note sure what you are asking here. > > If you are asking whether I, as a /user/ of the tree, should be expected to > manipulate the nodes directly, then I think the answer is no -- the TSTree is > primarily a Collection object, and it's ternary tree /implementation/ is > correctly hidden from me. (Although a different kind of object which did > present itself as a tree, would make sense too, but I'm not convinced that it > would fit into the Collections framework). > > If you are asking whether, in the implementation, TSTree>>at: (say) should > delegate to the nodes themselves or manipulate the nodes directly, then it > becomes a matter of design aesthetics, and design pragmatics. Which is easier > to understand, and which is easer to code (correctly). The general approach in > OO is to push responsibility onto the object which is best placed to provide > that function. So, since (part of) a node's job is to know where it is in a > tree (what its children are and so on), it makes sense to push functions that > relate to location in the tree off onto the nodes too. > > As a rule of thumb, code which constant asks other objects for information, > looks at the information and then tells the other objects to do something based > on that, is not making best use of the OO paradigm. E.g, a hypothetical method > in some sort of Tree > > inAndBelow: aNode do: a1Block > a1Block value: aNode. > children := aNode chldren. > children do: [:each | self inAndBelow: each do: a1Block]. > > is not letting the nodes get on with what they are best at -- knowing what > their children are -- but is second-guessing them by asking for a list of their > children (which might be expensive if the node has to build a temporary > collection) and then iterating over that collection. > > It may not seem to matter much here (and indeed it doesn't in this particular > case), but that general idea -- giving responsibility to the object best placed > to implement it -- is pretty much at the heart of OO. > > That's what people mean when they advise you: "Tell don't ask". > > As an example where this matters more, imagine an optimised implementation that > used several different kinds of node. The nodes store their children in > different ways, but are otherwise identical. The idea is that we use whichever > kind of Node is most efficient in space/time for a given number of children. > Not all kinds of Node can efficiently provide an Array of its children, but > they all can iterate over their children efficiently. So the tree object /has/ > to delegate to the nodes, or a large chunk of the optimisation is wasted. > > -- chris > > |
Günther,
If I may, I would suggest the following reference for the pure O-O design of data structures: --Reusable Software, Bertrand Meyer Bertrand Meyer is the creator of Eiffel, and Eiffel and Smalltalk have at least one thing in common: they are both pure O-O languages. Everything is an object. Although I am leaving Eiffel now for Smalltalk (originally I had planned to embrace the two), Eiffel offers much instruction in the way of solid data structure design. I have a commercial license for EiffelStudio, but an open source license became available in April that allows one to download and use this $5,000 product for free (no limitation of features), so long as one uses it for non-commercial purposes only. EiffelEnvision is likewise free, and allows Eiffel to run under the CLR in VisualStudio. You can download both at http://www.eiffel.com. The data structures cluster is remarkable, and will offer you a better education than most any textbook. The source for all of the Eiffel clusters is available within EiffelStudio, much as in Smalltalk. I, myself, am working on an ActiveTreeModel for Dolphin based loosely upon Eiffel's CURSOR_TREE class. Several other references that would be of immense assistance are --Fundamental Algorithms, Vol 1, Donald Knuth --Object-Oriented Software Construction, Bertrand Meyer --Data Structures and Software Development in an Object Oriented Domain, Eiffel Edition, Jean-Paul Tremblay and Grant A. Cheston Cheers, Eric > -----Original Message----- > From: Günther Schmidt [mailto:[hidden email]] > Posted At: Saturday, June 24, 2006 12:19 PM > Posted To: comp.lang.smalltalk.dolphin > Conversation: OO Recursion > Subject: Re: OO Recursion > > Chris, > > thanks for answering. > > I was talking, as you presumed correctly about the distribution of > responsibility between the TSTree collection object and and its > elements, the nodes. > > I agree with what you say, the reason why I asked is this: > > I just started reading "Algorithms in C" by Sedgewick, and in quite a > few examples he has implemented the traversing using loops and stacks, > in quite an elegant way, instead of recursion. Now surely that > the non-OO nature of C and is thus not really 101 applicable in > Smalltalk, but, and this is quite important, allows for having to pass > less parameters when traversing. > > This way it's rather easy to determine "how deep" something was found in > a tree, but you'd have to pass additional parameters in the OO way. > > At this point, as I certainly don't want to go C, I'm not even sure > whether or not to continue reading the "Algorithms in C" book, and > rather switch to the Java, or C++ edition. > > > > Günther > > > Chris Uppal schrieb: > > Günther, > > > >> There's a class TSTree and a class TSTree node. > >> To the outside world only an instance of TSTree is visible and all > >> client requests are passed to that instance. > >> > >> However all operations like #at:put: #at: and so on are implemented > >> the nodes themselves. > >> > >> My question is where should this code reside from your point of view, > in > >> the nodes themselves or rather in the tree? > > > > Note sure what you are asking here. > > > > If you are asking whether I, as a /user/ of the tree, should be expected > to > > manipulate the nodes directly, then I think the answer is no -- the > TSTree is > > primarily a Collection object, and it's ternary tree /implementation/ is > > correctly hidden from me. (Although a different kind of object which > did > > present itself as a tree, would make sense too, but I'm not convinced > that it > > would fit into the Collections framework). > > > > If you are asking whether, in the implementation, TSTree>>at: (say) > should > > delegate to the nodes themselves or manipulate the nodes directly, then > it > > becomes a matter of design aesthetics, and design pragmatics. Which is > easier > > to understand, and which is easer to code (correctly). The general > approach in > > OO is to push responsibility onto the object which is best placed to > provide > > that function. So, since (part of) a node's job is to know where it is > in a > > tree (what its children are and so on), it makes sense to push functions > that > > relate to location in the tree off onto the nodes too. > > > > As a rule of thumb, code which constant asks other objects for > information, > > looks at the information and then tells the other objects to do > something based > > on that, is not making best use of the OO paradigm. E.g, a hypothetical > method > > in some sort of Tree > > > > inAndBelow: aNode do: a1Block > > a1Block value: aNode. > > children := aNode chldren. > > children do: [:each | self inAndBelow: each do: a1Block]. > > > > is not letting the nodes get on with what they are best at -- knowing > what > > their children are -- but is second-guessing them by asking for a list > of their > > children (which might be expensive if the node has to build a temporary > > collection) and then iterating over that collection. > > > > It may not seem to matter much here (and indeed it doesn't in this > particular > > case), but that general idea -- giving responsibility to the object best > placed > > to implement it -- is pretty much at the heart of OO. > > > > That's what people mean when they advise you: "Tell don't ask". > > > > As an example where this matters more, imagine an optimised > implementation that > > used several different kinds of node. The nodes store their children in > > different ways, but are otherwise identical. The idea is that we use > whichever > > kind of Node is most efficient in space/time for a given number of > children. > > Not all kinds of Node can efficiently provide an Array of its children, > but > > they all can iterate over their children efficiently. So the tree > object /has/ > > to delegate to the nodes, or a large chunk of the optimisation is > wasted. > > > > -- chris > > > > |
Eric,
thanks. One thing I know for sure now, is that there's no point for me in continuing with Sedgewicks C book. Eventhough I understand his algorithms to be quite fundamental I am unable to *translate* them into the OO way. So I just hope that his C++ book will cover just about the same set of problems as the C book, but in a *proper* OO way this time. Thanks for pointing out Eiffel to me, but I don't think I'll follow that up, it'd just be to confusing right now. Günther |
In reply to this post by Günther Schmidt
Günther,
> I just started reading "Algorithms in C" by Sedgewick, and in quite a > few examples he has implemented the traversing using loops and stacks, > in quite an elegant way, instead of recursion. Ah I see. Yes that makes sense. > Now surely that reflects > the non-OO nature of C and is thus not really 101 applicable in > Smalltalk, but, and this is quite important, allows for having to pass > less parameters when traversing. Hmm.. not really. Good code is good code, it's just that Sedgewick's implementation is more sophisticated than either of the options we have discussed so far. More below. > At this point, as I certainly don't want to go C, I'm not even sure > whether or not to continue reading the "Algorithms in C" book, and > rather switch to the Java, or C++ edition. Whatever you do, /don't/ give up on Sedgewick ! I haven't looked at his Java or C++ books, but I'd be willing to bet that there's not too much difference between them and his C books (which I have read). It would be (I think) too much work for the author, and too confusing for the reader, to convert algorithms into the most OO form -- simply because OO tends to split the code up, which is a good idea in real systems, but not good in a textbook about algorithms. In this case, what he is doing is showing that several tree-traversal algorithms have structures which are more closely related than they appear at first sight. That's a useful insight for understanding the algorithms, and is rather elegant in its own right. So far, so good. But what happens when we want to write real code (in C or Smalltalk) ? We may not need the flexibility provided by sharing code between different traversals. We may only want to traverse in one order, or we may not care about the order at all. In such a case, we would be ill-advised to try to abstract out the traversal order -- it just means we have to write more code. (I'm pretty sure this is the case for Avi's TSTree.) But if we /do/ wamt that flexibility, or if the traversal we want to implement is one that is more easily coded using Sedgewick's insights, then what ? Well, in C we don't have a lot of choice. In Smalltalk we now have /three/ options -- we could code it in the Tree like Sedgewick, we could code it in the Node like Avi, or -- and this is what's novel -- we could say: OK we have a new responsibility: managing the traversal of a Tree in a flexible and efficient manner. So we need a new object to join in the fun. We could go the whole hog and create a system of TreeIterators, or we might keep it simple and just add a Queue object to the tree traversal code (and use a FIFO-, LIFO-, Priority-, or whatever-Queue as appropriate). I don't know if that helps any. I repeat: don't give up on Sedgewick. If necessary just code in a not-very-OO style for a while -- you can always recast any algorithm in OO style later once you understand it, and how you want to use it, fully. -- chris |
Dear Chris,
> I don't know if that helps any. I repeat: don't give up on Sedgewick. If > necessary just code in a not-very-OO style for a while -- you can always recast > any algorithm in OO style later once you understand it, and how you want to use > it, fully. > It does help actually, and no, I won't give up Sedgewick. What also helped was reading again about the Iterator pattern in the original Design Patterns book. I had dismissed it to easily as *we Smalltalkers don't need that* because we really don't need it when dealing with flat collections, as the authors of the "Design Patterns Smalltalk Companion" stated correctly. However this is true for flat collections, but when it comes to composites, it's a whole different ball game. Günther |
Free forum by Nabble | Edit this page |