DoubleLinkedList vs LinkedList vs Collection

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

DoubleLinkedList vs LinkedList vs Collection

Peter Uhnak
I was surprised to learn that DoubleLinkedList is descendant of Object, while LinkedList is descendant of SequencableCollection. Is there a particular reason behind this? Are they really so conceptually different that DLL is not even considered a collection?

Thanks,
Peter
Reply | Threaded
Open this post in threaded view
|

Re: DoubleLinkedList vs LinkedList vs Collection

Sven Van Caekenberghe-2
Peter,

> On 14 Apr 2015, at 13:52, Peter Uhnák <[hidden email]> wrote:
>
> I was surprised to learn that DoubleLinkedList is descendant of Object, while LinkedList is descendant of SequencableCollection. Is there a particular reason behind this? Are they really so conceptually different that DLL is not even considered a collection?
>
> Thanks,
> Peter

DoubleLinkedList was added to help the implementation of [LRU|TTL]Cache. It was kept small and independent.

Inheriting from [Sequenceable]Collection is a larger responsibility, entails more requirements.

I would not be against this, although I am not 100% sure it is easy (some methods return the link nodes, not the elements, a distinction unknown to collections in general - LinkedList is a bit ugly in this respect too). In any case, it would have to be supported by enough tests. It could be a nice project for Pharo 5.

Sven


Reply | Threaded
Open this post in threaded view
|

Re: DoubleLinkedList vs LinkedList vs Collection

Marcus Denker-4

> On 14 Apr 2015, at 14:00, Sven Van Caekenberghe <[hidden email]> wrote:
>
> Peter,
>
>> On 14 Apr 2015, at 13:52, Peter Uhnák <[hidden email]> wrote:
>>
>> I was surprised to learn that DoubleLinkedList is descendant of Object, while LinkedList is descendant of SequencableCollection. Is there a particular reason behind this? Are they really so conceptually different that DLL is not even considered a collection?
>>
>> Thanks,
>> Peter
>
> DoubleLinkedList was added to help the implementation of [LRU|TTL]Cache. It was kept small and independent.
>
> Inheriting from [Sequenceable]Collection is a larger responsibility, entails more requirements.
>
> I would not be against this, although I am not 100% sure it is easy (some methods return the link nodes, not the elements, a distinction unknown to collections in general - LinkedList is a bit ugly in this respect too). In any case, it would have to be supported by enough tests. It could be a nice project for Pharo 5.

One problem with LinkedList is that it is used by the scheduler and carefully written to be intererrupt-check free in some (undocumented) cases… in the past
this has already lead to very bad side-effects when people wanted to improve it or change it.

        Marcus
Reply | Threaded
Open this post in threaded view
|

Re: DoubleLinkedList vs LinkedList vs Collection

Sean P. DeNigris
Administrator
In reply to this post by Peter Uhnak
Peter Uhnák wrote
Are they really so conceptually different
that DLL is not even considered a collection?
I just reread GOF and reinforced an important idea. Although we often conflate the two, types and classes are not the same. Inheritance is an implementation detail about avoiding duplication by sharing code.

Unfortunately, Smalltalk may encourage the conflation because:
1. interfaces are not well-defined
2. and, the tools, being very class-centric, and maybe in reaction to #1, make it tempting to subclass just to make the relationships clearer whether there is code savings or not

So the question is whether they are implementation-ally similar enough to justify subclassing. Their conceptual similarity only means they should share the same interface i.e. respond to the same set of messages.
Cheers,
Sean
Reply | Threaded
Open this post in threaded view
|

Re: DoubleLinkedList vs LinkedList vs Collection

Peter Uhnak
DoubleLinkedList was added to help the implementation of [LRU|TTL]Cache. It was kept small and independent.
Does that mean that I probably shouldn't touch it? Because when I wanted to use DDL I ran into a problem that once I add something to the list, I can no longer access the Links. LinkedList has "firstLink/lastLink", but this is missing in DoubleLinkedList -- is this design decision (to keep it small), or nobody needed it until now?

Peter
Reply | Threaded
Open this post in threaded view
|

Re: DoubleLinkedList vs LinkedList vs Collection

Peter Uhnak
Also linksDo: method is missing.

On Tue, Apr 14, 2015 at 4:46 PM, Peter Uhnák <[hidden email]> wrote:
DoubleLinkedList was added to help the implementation of [LRU|TTL]Cache. It was kept small and independent.
Does that mean that I probably shouldn't touch it? Because when I wanted to use DDL I ran into a problem that once I add something to the list, I can no longer access the Links. LinkedList has "firstLink/lastLink", but this is missing in DoubleLinkedList -- is this design decision (to keep it small), or nobody needed it until now?

Peter

Reply | Threaded
Open this post in threaded view
|

Re: DoubleLinkedList vs LinkedList vs Collection

Sven Van Caekenberghe-2
In reply to this post by Peter Uhnak

> On 14 Apr 2015, at 16:46, Peter Uhnák <[hidden email]> wrote:
>
> DoubleLinkedList was added to help the implementation of [LRU|TTL]Cache. It was kept small and independent.
> Does that mean that I probably shouldn't touch it? Because when I wanted to use DDL I ran into a problem that once I add something to the list, I can no longer access the Links. LinkedList has "firstLink/lastLink", but this is missing in DoubleLinkedList -- is this design decision (to keep it small), or nobody needed it until now?

Both ;-)

> Peter

Please make an issue with the API that you think should be added, and then we can have a look for Pharo 5.

Sven


Reply | Threaded
Open this post in threaded view
|

Re: DoubleLinkedList vs LinkedList vs Collection

Henrik Sperre Johansen
In reply to this post by Marcus Denker-4

> On 14 Apr 2015, at 2:09 , Marcus Denker <[hidden email]> wrote:
>
>
>> On 14 Apr 2015, at 14:00, Sven Van Caekenberghe <[hidden email]> wrote:
>>
>> Peter,
>>
>>> On 14 Apr 2015, at 13:52, Peter Uhnák <[hidden email]> wrote:
>>>
>>> I was surprised to learn that DoubleLinkedList is descendant of Object, while LinkedList is descendant of SequencableCollection. Is there a particular reason behind this? Are they really so conceptually different that DLL is not even considered a collection?
>>>
>>> Thanks,
>>> Peter
>>
>> DoubleLinkedList was added to help the implementation of [LRU|TTL]Cache. It was kept small and independent.
>>
>> Inheriting from [Sequenceable]Collection is a larger responsibility, entails more requirements.
>>
>> I would not be against this, although I am not 100% sure it is easy (some methods return the link nodes, not the elements, a distinction unknown to collections in general - LinkedList is a bit ugly in this respect too). In any case, it would have to be supported by enough tests. It could be a nice project for Pharo 5.
>
> One problem with LinkedList is that it is used by the scheduler and carefully written to be intererrupt-check free in some (undocumented) cases… in the past
> this has already lead to very bad side-effects when people wanted to improve it or change it.
>
> Marcus

IIRC, #removeLink:ifAbsent: is the only method (... that we've noticed)  that needs to be atomic for the scheduler to work.(in other words, it can end up trying to remove the same process from different threads at the same time)
The change was made during a sprint to allow adding arbitrary objects and create links on the fly (inspired by Ruby, or so I was told), in the process the old remove:ifAbsent: turned into removeLink:ifAbsent, and a suspension point was introduced in the process, which meant, once in a blue moon, the scheduler would get stuck trying to remove a process that had already been removed.

In other words; it's not a pleasant class to change.

WRT being a subclass of SequenceableCollection; while technically true, that API is much wider than what you'd expect from a classic link list, and the inherited implementations mostly assume O(1) #at: performance. Very little is reimplemented, so most of it is rather slow if you try to actually use any of it:
ll := LinkedList withAll:#(a b c d e f g h i j k l m n o p q r s t u v x y z).
[ll allButFirst: 15] bench '150,365 per second'

aa := Array withAll: #(a b c d e f g h i j k l m n o p q r s t u v x y z).
[aa allButFirst: 15] bench 10,055,029 per second

Aside from pure bugs, there's also other oddities like the species being Array, but iteration methods reimplemented using self class.

IOW: If you are looking for a LinkedList actually worth using, look elsewhere.
DoubleLinkedList may not be a Collection, but at least the API is small enough to grasp, and the parts that are there act as you expect them to.

Cheers,
Henry
Reply | Threaded
Open this post in threaded view
|

Re: DoubleLinkedList vs LinkedList vs Collection

Peter Uhnak
In reply to this post by Sven Van Caekenberghe-2
I guess the "System-Caching" package which I noticed just now is a bit telling. :)
I'll make an issue.

Peter

On Tue, Apr 14, 2015 at 5:02 PM, Sven Van Caekenberghe <[hidden email]> wrote:

> On 14 Apr 2015, at 16:46, Peter Uhnák <[hidden email]> wrote:
>
> DoubleLinkedList was added to help the implementation of [LRU|TTL]Cache. It was kept small and independent.
> Does that mean that I probably shouldn't touch it? Because when I wanted to use DDL I ran into a problem that once I add something to the list, I can no longer access the Links. LinkedList has "firstLink/lastLink", but this is missing in DoubleLinkedList -- is this design decision (to keep it small), or nobody needed it until now?

Both ;-)

> Peter

Please make an issue with the API that you think should be added, and then we can have a look for Pharo 5.

Sven



Reply | Threaded
Open this post in threaded view
|

Re: DoubleLinkedList vs LinkedList vs Collection

stepharo
In reply to this post by Henrik Sperre Johansen
I think that LinkedList should be a private class nobody should use :).
Better use DoubleLinkedList which should be packaged with extended
collection.

Le 14/4/15 17:07, Henrik Johansen a écrit :

>> On 14 Apr 2015, at 2:09 , Marcus Denker <[hidden email]> wrote:
>>
>>
>>> On 14 Apr 2015, at 14:00, Sven Van Caekenberghe <[hidden email]> wrote:
>>>
>>> Peter,
>>>
>>>> On 14 Apr 2015, at 13:52, Peter Uhnák <[hidden email]> wrote:
>>>>
>>>> I was surprised to learn that DoubleLinkedList is descendant of Object, while LinkedList is descendant of SequencableCollection. Is there a particular reason behind this? Are they really so conceptually different that DLL is not even considered a collection?
>>>>
>>>> Thanks,
>>>> Peter
>>> DoubleLinkedList was added to help the implementation of [LRU|TTL]Cache. It was kept small and independent.
>>>
>>> Inheriting from [Sequenceable]Collection is a larger responsibility, entails more requirements.
>>>
>>> I would not be against this, although I am not 100% sure it is easy (some methods return the link nodes, not the elements, a distinction unknown to collections in general - LinkedList is a bit ugly in this respect too). In any case, it would have to be supported by enough tests. It could be a nice project for Pharo 5.
>> One problem with LinkedList is that it is used by the scheduler and carefully written to be intererrupt-check free in some (undocumented) cases… in the past
>> this has already lead to very bad side-effects when people wanted to improve it or change it.
>>
>> Marcus
> IIRC, #removeLink:ifAbsent: is the only method (... that we've noticed)  that needs to be atomic for the scheduler to work.(in other words, it can end up trying to remove the same process from different threads at the same time)
> The change was made during a sprint to allow adding arbitrary objects and create links on the fly (inspired by Ruby, or so I was told), in the process the old remove:ifAbsent: turned into removeLink:ifAbsent, and a suspension point was introduced in the process, which meant, once in a blue moon, the scheduler would get stuck trying to remove a process that had already been removed.
>
> In other words; it's not a pleasant class to change.
>
> WRT being a subclass of SequenceableCollection; while technically true, that API is much wider than what you'd expect from a classic link list, and the inherited implementations mostly assume O(1) #at: performance. Very little is reimplemented, so most of it is rather slow if you try to actually use any of it:
> ll := LinkedList withAll:#(a b c d e f g h i j k l m n o p q r s t u v x y z).
> [ll allButFirst: 15] bench '150,365 per second'
>
> aa := Array withAll: #(a b c d e f g h i j k l m n o p q r s t u v x y z).
> [aa allButFirst: 15] bench 10,055,029 per second
>
> Aside from pure bugs, there's also other oddities like the species being Array, but iteration methods reimplemented using self class.
>
> IOW: If you are looking for a LinkedList actually worth using, look elsewhere.
> DoubleLinkedList may not be a Collection, but at least the API is small enough to grasp, and the parts that are there act as you expect them to.
>
> Cheers,
> Henry
>


Reply | Threaded
Open this post in threaded view
|

Re: DoubleLinkedList vs LinkedList vs Collection

stepharo
In reply to this post by Sven Van Caekenberghe-2
+1

Stef

Le 14/4/15 17:02, Sven Van Caekenberghe a écrit :

>> On 14 Apr 2015, at 16:46, Peter Uhnák <[hidden email]> wrote:
>>
>> DoubleLinkedList was added to help the implementation of [LRU|TTL]Cache. It was kept small and independent.
>> Does that mean that I probably shouldn't touch it? Because when I wanted to use DDL I ran into a problem that once I add something to the list, I can no longer access the Links. LinkedList has "firstLink/lastLink", but this is missing in DoubleLinkedList -- is this design decision (to keep it small), or nobody needed it until now?
> Both ;-)
>
>> Peter
> Please make an issue with the API that you think should be added, and then we can have a look for Pharo 5.
>
> Sven
>
>
>