Re: Collections

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Collections

Stephan Eggermont-3
Stef wrote:
 > - are there some collections that would be cool to improve?
 > - are there some collections that would be cool to have and that we  
do not have yet?
All collections with an Array as backing store do not perform
well in a large memory setting. That's not only a problem
in a 64-bit image.

Stephan

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: Collections

Levente Uzonyi-2
On Sat, 13 Mar 2010, Stephan Eggermont wrote:

> Stef wrote:
>> - are there some collections that would be cool to improve?
>> - are there some collections that would be cool to have and that we do not
>> have yet?
> All collections with an Array as backing store do not perform
> well in a large memory setting. That's not only a problem
> in a 64-bit image.

How bad do they perform? Do you have benchmarks?
How can
- an OrderedCollection
- a Set
- a Dictioanry
- a Heap
be implemented in a better way?


Levente

>
> Stephan
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: collections

Stephan Eggermont-3
In reply to this post by Stephan Eggermont-3
Levente wrote:
 >How bad do they perform? Do you have benchmarks?
With a
coll := OrderedCollection new.
1 to: 10000000 do: [:i | coll add: i].

a profile of:
1 to: 100 do: [:i | coll add: i beforeIndex: 10]
here takes 1438 ms. Moving memory gets to be slow.
All operations that start taking O(#elements) time are
no longer funny.

The usual thing to do is to start using multiple blocks
of memory (like a BTree). Doubling size when growing
is also not an acceptable strategy when getting close to
total ram capacity.

Stephan

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: collections

Igor Stasenko
On 13 March 2010 12:20, Stephan Eggermont <[hidden email]> wrote:

> Levente wrote:
>>How bad do they perform? Do you have benchmarks?
> With a
> coll := OrderedCollection new.
> 1 to: 10000000 do: [:i | coll add: i].
>
> a profile of:
> 1 to: 100 do: [:i | coll add: i beforeIndex: 10]
> here takes 1438 ms. Moving memory gets to be slow.
> All operations that start taking O(#elements) time are
> no longer funny.
>
> The usual thing to do is to start using multiple blocks
> of memory (like a BTree). Doubling size when growing
> is also not an acceptable strategy when getting close to
> total ram capacity.
>
Yes. Use right tool for to do job.
An OrderedCollection can't satisfy every possible combination of
tasks, which developer facing.

From this point, i barely see how we can achieve a substantial
improvements in current implementation.
And i think if we talk about collections, we should be focused more on
a refactoring (use traits) for better
library design, which will make implemetation more compact and cleaner.

> Stephan
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>



--
Best regards,
Igor Stasenko AKA sig.

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: collections

Levente Uzonyi-2
In reply to this post by Stephan Eggermont-3
On Sat, 13 Mar 2010, Stephan Eggermont wrote:

> Levente wrote:
>> How bad do they perform? Do you have benchmarks?
> With a
> coll := OrderedCollection new.
> 1 to: 10000000 do: [:i | coll add: i].
>
> a profile of:
> 1 to: 100 do: [:i | coll add: i beforeIndex: 10]
> here takes 1438 ms. Moving memory gets to be slow.
> All operations that start taking O(#elements) time are
> no longer funny.

I don't know why do you expect insertion to be fast into the middle of an
OrderedCollection? It just shouldn't be, OrderedCollection doesn't
support this. I think it's fair to assume that developers know what the
collections can and can't be used to.

>
> The usual thing to do is to start using multiple blocks
> of memory (like a BTree). Doubling size when growing

If it would depend on the size of the memory chunk, these values wouldn't
be linear:

Array streamContents: [ :stream |
  | size |
  size := 10.
  [ size <= 10000000 ] whileTrue: [
  | coll |
  coll := OrderedCollection new.
  Smalltalk garbageCollect.
  stream nextPut: size -> [ 1 to: size do: [ :i | coll add: i ] ] timeToRun.
  size := size * 10 ] ]
"==> {10->0 . 100->0 . 1000->0 . 10000->2 . 100000->22 . 1000000->282 . 10000000->2490}"

> is also not an acceptable strategy when getting close to
> total ram capacity.

If the growth is not exponentional, the O(1) and O(n) runtime will
quickly become O(n) and O(n^2).
If a single data structure's size is getting close to the total ram
capacity you have three options:
- buy more ram (this is really cheap nowadays)
- use another data struture/another way to represent your data
- use another language which is closer to your hardware
But this is really an edge case. In the last five years I only had two
such cases. I took the hard way and rewrote the programs in C++. It was
pretty easy because I just had to translate the Smalltalk code to C++.


Levente

>
> Stephan
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project