Dear all,
I've written about implementing binary search trees: https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html Feedback and opinions is always welcome :) Best regards, Smiljana Knezev |
It makes is difficult to read (I do not know if this is wordpress effect or not). Once you have your tree implementation and tests I would really like to see if you can get another implementation and measure the impact of introducing a NullNode to remove all the isNil checks.
|
depth:aNode
"Returns depth of a tree starting from the given node"
| leftDepth rightDepth |
leftDepth := -1.
aNode leftChild isNotNil
ifTrue: [ leftDepth := self depth: aNode leftChild ].
rightDepth := -1.
aNode rightChild isNotNil
ifTrue: [ rightDepth := self depth: aNode rightChild ].
( leftDepth > rightDepth )
ifTrue: [ ^ (1 + leftDepth) ]
ifFalse: [^ (1 + rightDepth ) ]. => leftDepth > rightDepth
ifTrue: [ ^ 1 + leftDepth ]
ifFalse: [^ 1 + rightDepth ]. And even better move out of block returns when possible ^ leftDepth > rightDepth
ifTrue: [ 1 + leftDepth ]
ifFalse: [ 1 + rightDepth ]. With a nullNode the code should look like depth:aNode
"Returns depth of a tree starting from the given node”
| leftDepth rightDepth |
leftDepth := self depth: aNode leftChild.
rightDepth := self depth: aNode rightChild.
^ leftDepth > rightDepth
ifTrue: [ 1 + leftDepth ]
ifFalse: [1 + rightDepth ]
|
In reply to this post by Smiljana Knezev
More feedback :) Check the last lecture of the mooc Did you read Pharo with Style? I do not know if LinkedList is the one of the system used by Process but in that case use the alternate implementation. bfs: info
"Breadth fisrt search for an information. If there is no key containing given infomration returns nil. If the info is found returns a node containing it"
| queue |
self isEmpty
ifTrue: [ ^nil ].
queue := LinkedList new.
queue addLast: self root.
[ queue isNotEmpty ] whileTrue: [
| node |
node := queue removeFirst.
node key == info
ifTrue: [ ^node ].
node leftChild isNotNil
ifTrue: [ queue addLast: node leftChild ].
node rightChild isNotNil
ifTrue: [ queue addLast: node rightChild ].
].
^nil. => bfs: info
"Breadth fisrt search for an information. If there is no key containing given infomration returns nil. If the info is found returns a node containing it"
| queue |
self isEmpty ifTrue: [ ^ self ].
queue := LinkedList new.
queue addLast: self root.
[ queue isNotEmpty ] whileTrue: [
| node |
node := queue removeFirst.
node key == info
ifTrue: [ ^node ].
node leftChild isNotNil
ifTrue: [ queue addLast: node leftChild ].
node rightChild isNotNil
ifTrue: [ queue addLast: node rightChild ].
] |
But - bfs: is a bad name. - Also why not having a block to specify what we are looking for?
|
bfs: aValue
=> findBreadthFirst: aValue dfs: info node: aNode => findDeepFirst: aValue startingFrom: aNode
|
In reply to this post by Smiljana Knezev
Smiljana Knezev wrote
> Dear all, > > I've written about implementing binary search trees: > https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html > > Feedback and opinions is always welcome :) > > Best regards, > Smiljana Knezev Your implementation is unnecesarily complicated. Remember, tree is a recursive data structure. You may have have reasons to split tree and node into different classes, but at least delegate operations to node. For example, #depth and #size can be defined as following methods of Node class: depth self isLeaf ifTrue: [ ^ 0 ]. ^ leftChild depth max: rightChild depth size self isLeaf ifTrue: [ ^ 1 ]. ^ leftChild size + rightChlid size Most other operations can also be defined in similar fashion. -- Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html |
In reply to this post by ducasse
I agree #bfs: is not explicit, but #breadthFirstSearch: is a well
know term, why not just that? Or even #breathFirstSearchFor: In comparison #findBreadthFirst: and #findDeepFirst: feel a bit awkward to me. (I think its the repetition rhythm of "f" words) cheers -ben On Mon, 8 Jul 2019 at 12:55, ducasse <[hidden email]> wrote: > > bfs: aValue > > => > > findBreadthFirst: aValue > > > dfs: info node: aNode > > => > > findDeepFirst: aValue startingFrom: aNode |
Thank you all very much. This is quite useful for me and I can learn a lot. best regards, Smiljana Knezev сре, 10. јул 2019. у 12:29 Ben Coman <[hidden email]> је написао/ла: I agree #bfs: is not explicit, but #breadthFirstSearch: is a well |
In reply to this post by Smiljana Knezev
Hello Smiljana,
Thanks for having written down this document. I am not expert in algorithm, so I would consider myself a simple user. I have developed complex software for some times and I have never seen the need of having a binary search tree. I guess this is probably partly because of my lack of expertise in binary search tree and partly because experts in binary search trees assume that people know what it is about and in what it is useful. My question is, when should a programmer ever need to use binary search tree? Can you add some examples on what these trees are good for, and how an average programmer should look into it. I think this will be a valuable and easy way to expand your blog. Cheers, Alexandre
|
Binary search is extremely useful to quickly find something in a sorted collection, like finding if a string or number is included in a large collection.
BTW, we already have an implementation, see SequenceableCollection>>#findBinary: and friends. > On 10 Jul 2019, at 18:53, Alexandre Bergel via Pharo-users <[hidden email]> wrote: > > > From: Alexandre Bergel <[hidden email]> > Subject: Re: [Pharo-users] [GSoC blog post] Binary Search Trees > Date: 10 July 2019 at 18:53:01 GMT+2 > To: Any question about pharo is welcome <[hidden email]> > > > Hello Smiljana, > > Thanks for having written down this document. I am not expert in algorithm, so I would consider myself a simple user. I have developed complex software for some times and I have never seen the need of having a binary search tree. I guess this is probably partly because of my lack of expertise in binary search tree and partly because experts in binary search trees assume that people know what it is about and in what it is useful. > > My question is, when should a programmer ever need to use binary search tree? Can you add some examples on what these trees are good for, and how an average programmer should look into it. I think this will be a valuable and easy way to expand your blog. > > Cheers, > Alexandre > >> On Jul 7, 2019, at 5:12 PM, Smiljana Knezev <[hidden email]> wrote: >> >> Dear all, >> >> I've written about implementing binary search trees: https://pharokeepers.github.io/pharo/2019/07/07/SmiljanaBinarySearchTreeinPharo.html >> >> Feedback and opinions is always welcome :) >> >> Best regards, >> Smiljana Knezev > > > |
Administrator
|
In reply to this post by Pharo Smalltalk Users mailing list
On Wed, Jul 10, 2019 at 9:53 AM Alexandre Bergel via Pharo-users <[hidden email]> wrote:
Hi Alexandre, I can tell you that GemStone/S makes extensive use of b-trees and variants in its internal data structures. Two examples are the object table itself and the index data structures. I agree that the average programmer may never have a need for them, but I don't like to close doors on (other) developers.
Smiljana, Others have suggested things like the Null Object pattern for the left and right links, but an alternative or supplement to that would be a space optimizing leaf node which doesn't have instance variables for left and right sub-trees. But, don't go down the optimizing paths until you have good solutions for everything else. i.e. make right, then make it fast. I noted your code looks "C-like", which I suspect arises from being inspired by implementation in other languages. For example, I think only an empty tree should answer 0 for depth. A tree with one node should answer 1 while a tree with 2 or 3 nodes should answer 2, etc. If you drew a tree out on paper, how many levels would you see? That is the depth, in my opinion. Another example arises from one of the most powerful idioms in Smalltalk. Call it the #at:ifAbsent: pattern. (See also #detect:ifNone:) Someone else mentioned the passing around of nil and suggested you not do that. The #at:ifAbsent: pattern is the key to avoiding that: tell the receiver what *you* want to *do* if the thing can't be found by passing in your own block. Additionally, you will want your nodes to support various payloads. Right now the payload is the object held in the key. That works for identity structures, but it won't work for more advanced needs, such as one would find in a database index structure. Eventually, you will want the ability to have nodes that support additional attribute(s). I've found P. J Plauger's "0, 1, infinity" rule to be a superb design guideline. In short, once you go beyond having one of something (e.g. the key itself), you need to design things so that you can have an arbitrarily large number. The step from 1 to >1 is huge and messy if retrofitted, but easy when planned in. Eventually, you will need queries against the tree, allowing range selection, identity versus equality, etc. The tree should provide a basic infrastructure for searching, but a separate modelling of queries will be necessary. That may be out of scope for GSOC, of course. Good work and good luck with the rest of the project!
|
In reply to this post by Pharo Smalltalk Users mailing list
On Thu, 11 Jul 2019 at 00:53, Alexandre Bergel via Pharo-users
<[hidden email]> wrote: > > Hello Smiljana, > > Thanks for having written down this document. I am not expert in algorithm, so I would consider myself a simple user. I have developed complex software for some times and I have never seen the need of having a binary search tree. I guess this is probably partly because of my lack of expertise in binary search tree and partly because experts in binary search trees assume that people know what it is about and in what it is useful. > > My question is, when should a programmer ever need to use binary search tree? Can you add some examples on what these trees are good for, and how an average programmer should look into it. I think this will be a valuable and easy way to expand your blog. My "computer science" is a bit rusty but Binary Search Trees and their generalisation to B-Trees is Databases 101. Almost guaranteed you are "using" them with any database you use, you're just not directly exposed to them. If you skim "Redesigning String Data Structures to Exploit Cache" (https://people.eng.unimelb.edu.au/jzobel/fulltext/acmjea10.pdf) looking at the performance graphs, for "unordered searching" the best bet for many applications is probably our hash-based Dictionary if that covers all your needs. BSTs are more memory efficient than Dictionarys, but less cache-friendly. A few applications where BSTs can be useful: * doing range searches efficiently [1] * traversing elements in order, which is more difficult with hash tables [1] [2] * good for "order" statistics [2] like: - inexact search - finding closest lower and greater elements. - determining maximum & minimum elements - determining successor & predecessor elements - range queries - easy to do with BSTs and not a natural operation with Hash Tables. * Self-Balancing BSTs guarantee operations work in O(Logn) time. Hashing has Θ(1) is average time but some particular operations are costly, especially when table resizing happen [2] but also they are slower so if the data is sufficiently random simple BSTs may do. [1] https://stackoverflow.com/questions/371136/binary-trees-vs-linked-lists-vs-hash-tables [2] https://www.geeksforgeeks.org/advantages-of-bst-over-hash-table/ This side-related article... "Number crunching: Why you should never, ever, EVER use linked-list in your code again" (https://kjellkod.wordpress.com/2012/02/25/why-you-should-never-ever-ever-use-linked-list-in-your-code-again) indicates that for many In-Memory operations you are better using sorted-Arrays (e.g. SequenceableCollection>>#findBinary:) than cache-invalidating pointer-base structures like Lists & Trees - thus the strong association of BSTs & BTrees with slower access disk-based databases. Also notice that additional BST features closely match common database operations. cheers -ben |
In reply to this post by Pharo Smalltalk Users mailing list
Search trees (which come in many varieties) are good for collections in which adding, deleting, and lookup are interleaved, where there is an order on the elements (if used as a set) or the keys (if used as a dictionary). So are hash tables. But (balanced) search trees have guaranteed O(lg n) worst case time, which hash tables do not. And search trees let you enumerate their contents in ascending or descending order, at any time. And search trees can support queries like - tell me all the elements {<, <=, >, >=} x - tell me all the elements between L and U - tell me the next element (before,after) x My Smalltalk library includes Sorted{Set,Bag,Dictionary} implemented as splay trees. Some time I must try an alternative. in time O(lg n + number of answers). On Thu, 11 Jul 2019 at 04:53, Alexandre Bergel via Pharo-users <[hidden email]> wrote:
|
Thanks for your detailed answers
Cheers, Alexandre
|
Free forum by Nabble | Edit this page |