Hello everyone,

I am a final-year undergraduate student of Information Technologies in Novi Sad, Serbia. I first heard about Pharo about a year ago, when Professor Stéphane Ducasse gave a guest lecture at our faculty. I thought I was familiar with object-oriented programming, having been learning Java, but it was the first time I heard that branching in OO is a sign of bad design. I was drawn to Pharo because of its simplicity and elegance.

This Tuesday I found out I was accepted into the GSoC program for the New Collections for Pharo project, and I'm pretty excited about it. You can read the proposal here: https://summerofcode.withgoogle.com/projects/#4973054243897344

It will focus on documenting, refactoring, and testing the existing collections and migrating them to the Containers library. Some new collections, such as SimpleTrees, B-Trees, AVL Trees, QuadTrees, OcTrees, ..., will also be developed. The final part would be about research and development of specialized/advanced data structures such as immutable queues, immutable lists, immutable bidirectional maps, threaded binary trees, ...

I would appreciate your feedback on the proposal. Should I change something? Maybe add or focus on some additional data structures?

The other candidate from Serbia, Nina Medic, and I have discussed writing a blog together for our weekly posts. It can be found here: https://smiljanaknezev.github.io/blog/ It will be further customized and ready by Sunday. I have also set up a Twitter account, @KnezevSmiljana, to promote our blog posts and keep in touch with the community.

If you have any additional tips/tricks for making this GSoC project successful, I would like to hear them.

Looking forward to hearing from you.

Best regards,
Smiljana
Branching as such is not a sign of bad design in OOP. The actual dogma is that *testing the class of an argument instead of dispatching* is a sign of bad design. Just consider the case of binary search: that is not the kind of branching that lends itself to dynamic dispatch. Nor is the kind of branching you will have to write if you implement some sort of balanced tree, or quad-trees or oct-trees or k-d trees or R-trees or ...

I invite you to consider the following two methods from java.lang.String:

    byte[] getBytes(String charsetName);
    byte[] getBytes(Charset charset);

In Smalltalk, you have two options. The first is double dispatch:

    getBytes: encoding
        ^ encoding getBytesFor: self

    String >> getBytesFor:
    CharSet >> getBytesFor:

In this case, that requires you to modify two classes instead of one; in general it would require N+1, where N is the number of overloaded methods. It also leads to *coupling*, where too many classes depend on each other. Worst of all, in a project you might not be *allowed* to modify CharSet.

The other is to use an "if":

    getBytesFor: encoding
        | charset |
        charset := (encoding isKindOf: CharSet)
            ifTrue: [ encoding ]
            ifFalse: [ CharSet named: encoding ].
        ...

This approach
- involves just one method, not N+1;
- does not increase coupling;
- does not require the ability to change system classes;
- does not introduce additional non-private method names;
- BUT does not cope well with additional ad hoc overloads.

There is no silver bullet. Perhaps you may now look at Java with different eyes: ad hoc overloading is similar to using type-based branching.

When it comes to data structures, I remember when we had a visitor to the CS department I was working in. His goal while he was with us was to carefully implement and benchmark about a dozen "advanced" priority queue data structures from the literature on the subject. I told him "make sure you include the classic array-based heap". Guess which algorithm won! That's right, the classic array-based heap.
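For what it's worth, the classic array-based heap the anecdote recommends really is only a few short methods, which is part of why it benchmarks so well (few memory references, good locality). Here is a minimal min-heap sketch in Pharo; the class name, package, and selectors are illustrative, not an existing library API, and elements are assumed to respond to `<`:

```smalltalk
"Illustrative min-heap backed by an OrderedCollection (1-based indexing).
 Not an existing Pharo class; names are made up for this sketch."
Object subclass: #CTSimpleHeap
    instanceVariableNames: 'items'
    classVariableNames: ''
    package: 'Containers-Sketch'

CTSimpleHeap >> initialize
    items := OrderedCollection new

CTSimpleHeap >> add: anObject
    "Append at the end, then restore the heap property upward."
    items addLast: anObject.
    self siftUp: items size

CTSimpleHeap >> siftUp: anIndex
    | child parent |
    child := anIndex.
    [ child > 1 and: [ (items at: child) < (items at: child // 2) ] ]
        whileTrue: [
            parent := child // 2.
            items swap: child with: parent.
            child := parent ]

CTSimpleHeap >> removeFirst
    "Pop the minimum: move the last element to the root, then sift down."
    | top |
    top := items first.
    items at: 1 put: items last.
    items removeLast.
    items isEmpty ifFalse: [ self siftDown: 1 ].
    ^ top

CTSimpleHeap >> siftDown: anIndex
    | parent child |
    parent := anIndex.
    [ (child := 2 * parent) <= items size ] whileTrue: [
        (child < items size and: [ (items at: child + 1) < (items at: child) ])
            ifTrue: [ child := child + 1 ].
        (items at: child) < (items at: parent) ifFalse: [ ^ self ].
        items swap: child with: parent.
        parent := child ]
```

Both `add:` and `removeFirst` are O(log n), and the whole structure is a single contiguous sequence, which is exactly the "keep the number of memory references down" rule of thumb in action.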
The point of the story is that a simple algorithm implemented well is likely to do surprisingly well. The rule of thumb I use is "keep the number of memory references down".

As for which data structures you should look at, perhaps it is better to start by considering what *tasks* you want to support. If you want to support computational geometry, that will lead you to certain choices. If you want to support constraint satisfaction and/or combinatorial optimisation, that will lead you to others. If you want to make concurrent programming easier, that will lead you to concurrent stacks, queues, sets, bags, dictionaries &c. Don't think of "AVL tree", think of "efficient sorted sets, bags, and dictionaries" and work back from that.

Here are two (related) data structures I've wanted.

A triple store
- is a set of (Att, Obj, Val) triples;
- supports *all* of the access patterns, where + means that value is known (given) and - means that value is unknown (iterated over):

      (+,+,+) (+,+,-) (+,-,+) (+,-,-)
      (-,+,+) (-,+,-) (-,-,+) (-,-,-)

  where each has good asymptotic efficiency;
- has low space overhead.

This has obvious application to RDF and rule-based engines.

A good graph data structure, as in nodes and edges. Well, graph AND digraph, obviously. You might want to look at Knuth's book "The Stanford GraphBase" for ideas.

On Fri, 10 May 2019 at 22:00, Smiljana Knezev <[hidden email]> wrote:
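As a starting point for the triple store described above, the naive approach is one index per component, which covers all eight access patterns by picking an index and filtering. This Pharo sketch is purely illustrative (class and selector names are invented here); note that it pays three index entries per triple, which is precisely the space overhead a serious design would try to reduce by sharing structure between the indexes:

```smalltalk
"Sketch: a triple store with one Dictionary index per component.
 Illustrative only; not an existing Pharo/Containers class."
Object subclass: #CTTripleStore
    instanceVariableNames: 'byAtt byObj byVal'
    classVariableNames: ''
    package: 'Containers-Sketch'

CTTripleStore >> initialize
    byAtt := Dictionary new.
    byObj := Dictionary new.
    byVal := Dictionary new

CTTripleStore >> addAtt: a obj: o val: v
    "Store the triple once, referenced from all three indexes."
    | triple |
    triple := { a. o. v }.
    (byAtt at: a ifAbsentPut: [ OrderedCollection new ]) add: triple.
    (byObj at: o ifAbsentPut: [ OrderedCollection new ]) add: triple.
    (byVal at: v ifAbsentPut: [ OrderedCollection new ]) add: triple

CTTripleStore >> triplesWithAtt: a
    "The (+,-,-) pattern: one index lookup."
    ^ byAtt at: a ifAbsent: [ #() ]

CTTripleStore >> triplesWithAtt: a obj: o
    "The (+,+,-) pattern: look up one index, filter on the other
     known component. A serious design would use a composite index."
    ^ (self triplesWithAtt: a) select: [ :t | (t at: 2) = o ]
```

The interesting research question is exactly the one hinted at above: keeping all eight patterns fast while getting the per-triple overhead well below three full index entries.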