Hello everyone,

I am a final-year undergraduate student of Information Technologies in Novi Sad, Serbia. I first heard about Pharo about a year ago, when Professor Stéphane Ducasse gave a guest lecture at our faculty. I thought I was familiar with object-oriented programming, having been learning Java, but it was the first time I heard that branching in OO is a sign of bad design. I was drawn to Pharo because of its simplicity and elegance.

This Tuesday I found out I was accepted into the GSoC program for the New Collections for Pharo project, and I'm pretty excited about it. You can read the proposal here: https://summerofcode.withgoogle.com/projects/#4973054243897344

It will focus on documenting, refactoring, and testing the existing collections and migrating them to the Containers library. Some new collections, such as SimpleTrees, B-Trees, AVL Trees, QuadTrees, OcTrees, ..., will also be developed. The final part would be about research and development of specialized/advanced data structures such as immutable queues, immutable lists, immutable bidirectional maps, threaded binary trees, ...

I would appreciate your feedback on the proposal. Should I change something? Maybe add or focus on some additional data structures?

The other candidate from Serbia, Nina Medic, and I have discussed writing a blog together for our weekly posts. It can be found here: https://smiljanaknezev.github.io/blog/ It will be further customized and ready by Sunday. I have also set up a Twitter account, @KnezevSmiljana, to promote our blog posts and keep in touch with the community.

If you have any additional tips/tricks for making this GSoC project successful, I would like to hear them.

Looking forward to hearing from you.

Best regards,
Smiljana
Branching as such is not a sign of bad design in OOP. The actual dogma is that *testing the class of an argument instead of dispatching* is a sign of bad design. Just consider the case of binary search: that is not the kind of branching that lends itself to dynamic dispatch. Nor is the kind of branching you will have to write if you implement some sort of balanced tree, or quad-trees or oct-trees or k-d trees or R-trees or ...

I invite you to consider the following two methods from java.lang.String:

    byte[] getBytes(String charsetName);
    byte[] getBytes(Charset charset);

In Smalltalk, you have two options. The first is double dispatch:

    getBytes: encoding
        ^ encoding getBytesFor: self

    String >> getBytesFor:
    CharSet >> getBytesFor:

In this case, that requires you to modify two classes instead of one; in general it would require N+1, where N is the number of overloaded methods. It also leads to *coupling*, where too many classes depend on each other. Worst of all, in a project you might not be *allowed* to modify CharSet.

The other is to use an "if":

    getBytesFor: encoding
        | charset |
        charset := (encoding isKindOf: CharSet)
            ifTrue: [ encoding ]
            ifFalse: [ CharSet named: encoding ].
        ...

This approach
- involves just one method, not N+1;
- does not increase coupling;
- does not require the ability to change system classes;
- does not introduce additional non-private method names;
- BUT does not cope well with additional ad hoc overloads.

There is no silver bullet. Perhaps you may now look at Java with different eyes: ad hoc overloading is similar to using type-based branching.

When it comes to data structures, I remember when we had a visitor to the CS department I was working in. His goal while he was with us was to carefully implement and benchmark about a dozen "advanced" priority queue data structures from the literature on the subject. I told him "make sure you include the classic array-based heap". Guess which algorithm won! That's right, the classic array-based heap.
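For what it's worth, the classic array-based heap the anecdote recommends really is only a few short methods, which is part of why it benchmarks so well (few memory references, good locality). Here is a minimal min-heap sketch in Pharo; the class name, package, and selectors are illustrative, not an existing library API, and elements are assumed to respond to `<`:

```smalltalk
"Illustrative min-heap backed by an OrderedCollection (1-based indexing).
 Not an existing Pharo class; names are made up for this sketch."
Object subclass: #CTSimpleHeap
    instanceVariableNames: 'items'
    classVariableNames: ''
    package: 'Containers-Sketch'

CTSimpleHeap >> initialize
    items := OrderedCollection new

CTSimpleHeap >> add: anObject
    "Append at the end, then restore the heap property upward."
    items addLast: anObject.
    self siftUp: items size

CTSimpleHeap >> siftUp: anIndex
    | child parent |
    child := anIndex.
    [ child > 1 and: [ (items at: child) < (items at: child // 2) ] ]
        whileTrue: [
            parent := child // 2.
            items swap: child with: parent.
            child := parent ]

CTSimpleHeap >> removeFirst
    "Pop the minimum: move the last element to the root, then sift down."
    | top |
    top := items first.
    items at: 1 put: items last.
    items removeLast.
    items isEmpty ifFalse: [ self siftDown: 1 ].
    ^ top

CTSimpleHeap >> siftDown: anIndex
    | parent child |
    parent := anIndex.
    [ (child := 2 * parent) <= items size ] whileTrue: [
        (child < items size and: [ (items at: child + 1) < (items at: child) ])
            ifTrue: [ child := child + 1 ].
        (items at: child) < (items at: parent) ifFalse: [ ^ self ].
        items swap: child with: parent.
        parent := child ]
```

Both `add:` and `removeFirst` are O(log n), and the whole structure is a single contiguous sequence, which is exactly the "keep the number of memory references down" rule of thumb in action.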
The point of the story is that a simple algorithm implemented well is likely to do surprisingly well. The rule of thumb I use is "keep the number of memory references down".

As for which data structures you should look at, perhaps it is better to start by considering what *tasks* you want to support. If you want to support computational geometry, that will lead you to certain choices. If you want to support constraint satisfaction and/or combinatorial optimisation, that will lead you to others. If you want to make concurrent programming easier, that will lead you to concurrent stacks, queues, sets, bags, dictionaries &c. Don't think of "AVL tree", think of "efficient sorted sets, bags, and dictionaries" and work back from that.

Here are two (related) data structures I've wanted.

A triple store
- is a set of (Att, Obj, Val) triples;
- supports *all* of the access patterns, where + means that value is known (given) and - means that value is unknown (iterated over):

      (+,+,+) (+,+,-) (+,-,+) (+,-,-)
      (-,+,+) (-,+,-) (-,-,+) (-,-,-)

  where each has good asymptotic efficiency;
- has low space overhead.

This has obvious application to RDF and rule-based engines.

A good graph data structure, as in nodes and edges. Well, graph AND digraph, obviously. You might want to look at Knuth's book "The Stanford GraphBase" for ideas.

On Fri, 10 May 2019 at 22:00, Smiljana Knezev <[hidden email]> wrote:
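As a starting point for the triple store described above, the naive approach is one index per component, which covers all eight access patterns by picking an index and filtering. This Pharo sketch is purely illustrative (class and selector names are invented here); note that it pays three index entries per triple, which is precisely the space overhead a serious design would try to reduce by sharing structure between the indexes:

```smalltalk
"Sketch: a triple store with one Dictionary index per component.
 Illustrative only; not an existing Pharo/Containers class."
Object subclass: #CTTripleStore
    instanceVariableNames: 'byAtt byObj byVal'
    classVariableNames: ''
    package: 'Containers-Sketch'

CTTripleStore >> initialize
    byAtt := Dictionary new.
    byObj := Dictionary new.
    byVal := Dictionary new

CTTripleStore >> addAtt: a obj: o val: v
    "Store the triple once, referenced from all three indexes."
    | triple |
    triple := { a. o. v }.
    (byAtt at: a ifAbsentPut: [ OrderedCollection new ]) add: triple.
    (byObj at: o ifAbsentPut: [ OrderedCollection new ]) add: triple.
    (byVal at: v ifAbsentPut: [ OrderedCollection new ]) add: triple

CTTripleStore >> triplesWithAtt: a
    "The (+,-,-) pattern: one index lookup."
    ^ byAtt at: a ifAbsent: [ #() ]

CTTripleStore >> triplesWithAtt: a obj: o
    "The (+,+,-) pattern: look up one index, filter on the other
     known component. A serious design would use a composite index."
    ^ (self triplesWithAtt: a) select: [ :t | (t at: 2) = o ]
```

The interesting research question is exactly the one hinted at above: keeping all eight patterns fast while getting the per-triple overhead well below three full index entries.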