Large, Sorted Collections

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Large, Sorted Collections

Runar Jordahl
Does there exist any sorted collection or sorted dictionary
implementations that are capable of holding “many” objects? By many I
mean hundreds of thousands. I guess the internal representation has to
use a B-Tree or a similar structure, not an array.

I am looking for a regular Smalltalk in-memory collection, not a
collection for an object-oriented database.

Kind regards
Runar

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: Large, Sorted Collections

Jon Paynter-2
What problems are you seeing with the default implementations?

Ive used sorted collections with up to 5 million instances before.  While it wasnt blazingly fast - the delay in creating the collection was only about 10-15 seconds.

On Mon, Jan 16, 2012 at 1:35 PM, Runar Jordahl <[hidden email]> wrote:
Does there exist any sorted collection or sorted dictionary
implementations that are capable of holding “many” objects? By many I
mean hundreds of thousands. I guess the internal representation has to
use a B-Tree or a similar structure, not an array.

I am looking for a regular Smalltalk in-memory collection, not a
collection for an object-oriented database.

Kind regards
Runar

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: Large, Sorted Collections

Alan Knight-2
The reason I can think of you might want a different representation is if you wanted to do a lot of inserting arbitrary elements, where putting them into the middle would get expensive. If you're constructing it all at once, or in reasonably large chunks, I would think the default implementation would probably do all right.



[hidden email]
16 January, 2012 4:41 PM


What problems are you seeing with the default implementations?

Ive used sorted collections with up to 5 million instances before.  While it wasnt blazingly fast - the delay in creating the collection was only about 10-15 seconds.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc


[hidden email]
16 January, 2012 4:35 PM


Does there exist any sorted collection or sorted dictionary
implementations that are capable of holding “many” objects? By many I
mean hundreds of thousands. I guess the internal representation has to
use a B-Tree or a similar structure, not an array.

I am looking for a regular Smalltalk in-memory collection, not a
collection for an object-oriented database.

Kind regards
Runar

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: Large, Sorted Collections

Runar Jordahl
In reply to this post by Jon Paynter-2
Thanks for your replies!

I initially tried adding a lot of elements to a sorted collection:

|collection|
collection := SortedCollection new.
1000000 timesRepeat: [collection add: FastRandom new next].

This runs painfully slow.

But if I instead add elements to an ordered collection and then sort,
it runs fast and is usable:

|collection|
collection := OrderedCollection new.
1000000 timesRepeat: [collection add: FastRandom new next].
collection asSortedCollection.

Other tests I have done show that adding one additional element to the
resulting sorted collection takes 5-10 milliseconds, which is not bad.

(Note that my (incorrect) use of FastRandom generates many equal
values. This is done by intention, to simulate collection where many
elements are equal.)

Kind regards
Runar
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: Large, Sorted Collections

Paul Baumann
VW already has efficient sorting. The slowness is in how you are using it. Pre-size your collections. Avoid creating many instances of FastRandom.


Time millisecondsToRun: [
        | collection |
        collection := OrderedCollection new.
        1000000 timesRepeat: [collection add: FastRandom new next].
        collection asSortedCollection.
].
 14910


Time millisecondsToRun: [
        (FastRandom new next: 1000000) asSortedCollection.
].
 1827
 1819
 1833


Sorted collections make for inefficient domain objects because they don't like incremental growth. Add all at once or defer sorting to finish with #sort:using: to re-sort items within a sequenceable collection.

If you have multiple sort criteria then you could collate by the first criteria into groups sorted by the remaining criteria. Many people create extremely inefficient multi-criteria sorts by creating many temporary objects during the sort. An efficient way to sort with multiple criteria is to use methods like this:

<= other
        | i a b |
        i := 1.
        [
                (a := self sortableFieldAt: i) isNil ifTrue: [^true].
                (b := other sortableFieldAt: i) isNil ifTrue: [^false].
                a ~~ b ifTrue: [
                        a < b ifTrue: [^true].
                        b < a ifTrue: [^false].
                ].
                (i := i + 1) < 1000
        ] whileTrue: [].
        self error: '#sortableFieldAt: returned an unlikely number of matches before returning nil'.

Paul Baumann


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Runar Jordahl
Sent: Tuesday, January 17, 2012 06:05
To: Jon Paynter
Cc: [hidden email]
Subject: Re: [vwnc] Large, Sorted Collections

Thanks for your replies!

I initially tried adding a lot of elements to a sorted collection:

|collection|
collection := SortedCollection new.
1000000 timesRepeat: [collection add: FastRandom new next].

This runs painfully slow.

But if I instead add elements to an ordered collection and then sort,
it runs fast and is usable:

|collection|
collection := OrderedCollection new.
1000000 timesRepeat: [collection add: FastRandom new next].
collection asSortedCollection.

Other tests I have done show that adding one additional element to the
resulting sorted collection takes 5-10 milliseconds, which is not bad.

(Note that my (incorrect) use of FastRandom generates many equal
values. This is done by intention, to simulate collection where many
elements are equal.)

Kind regards
Runar
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc


This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: Large, Sorted Collections

Alan Knight-2
In current versions, you can also write a multiple criteria sort more conveniently, and also pretty efficiently, as
   [:aPoint | aPoint x] ascending, [:aPoint | aPoint y]
or, more concisely, but probably ever so slightly less efficiently as
   #x ascending, #y



[hidden email]
17 January, 2012 10:38 AM


VW already has efficient sorting. The slowness is in how you are using it. Pre-size your collections. Avoid creating many instances of FastRandom.


Time millisecondsToRun: [
| collection |
collection := OrderedCollection new.
1000000 timesRepeat: [collection add: FastRandom new next].
collection asSortedCollection.
].
14910


Time millisecondsToRun: [
(FastRandom new next: 1000000) asSortedCollection.
].
1827
1819
1833


Sorted collections make for inefficient domain objects because they don't like incremental growth. Add all at once or defer sorting to finish with #sort:using: to re-sort items within a sequenceable collection.

If you have multiple sort criteria then you could collate by the first criteria into groups sorted by the remaining criteria. Many people create extremely inefficient multi-criteria sorts by creating many temporary objects during the sort. An efficient way to sort with multiple criteria is to use methods like this:

<= other
| i a b |
i := 1.
[
(a := self sortableFieldAt: i) isNil ifTrue: [^true].
(b := other sortableFieldAt: i) isNil ifTrue: [^false].
a ~~ b ifTrue: [
a < b ifTrue: [^true].
b < a ifTrue: [^false].
].
(i := i + 1) < 1000
] whileTrue: [].
self error: '#sortableFieldAt: returned an unlikely number of matches before returning nil'.

Paul Baumann


-----Original Message-----
From: [hidden email] [[hidden email]] On Behalf Of Runar Jordahl
Sent: Tuesday, January 17, 2012 06:05
To: Jon Paynter
Cc: [hidden email]
Subject: Re: [vwnc] Large, Sorted Collections

Thanks for your replies!

I initially tried adding a lot of elements to a sorted collection:

|collection|
collection := SortedCollection new.
1000000 timesRepeat: [collection add: FastRandom new next].

This runs painfully slow.

But if I instead add elements to an ordered collection and then sort,
it runs fast and is usable:

|collection|
collection := OrderedCollection new.
1000000 timesRepeat: [collection add: FastRandom new next].
collection asSortedCollection.

Other tests I have done show that adding one additional element to the
resulting sorted collection takes 5-10 milliseconds, which is not bad.

(Note that my (incorrect) use of FastRandom generates many equal
values. This is done by intention, to simulate collection where many
elements are equal.)

Kind regards
Runar
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc


This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc


[hidden email]
17 January, 2012 6:05 AM


Thanks for your replies!

I initially tried adding a lot of elements to a sorted collection:

|collection|
collection := SortedCollection new.
1000000 timesRepeat: [collection add: FastRandom new next].

This runs painfully slow.

But if I instead add elements to an ordered collection and then sort,
it runs fast and is usable:

|collection|
collection := OrderedCollection new.
1000000 timesRepeat: [collection add: FastRandom new next].
collection asSortedCollection.

Other tests I have done show that adding one additional element to the
resulting sorted collection takes 5-10 milliseconds, which is not bad.

(Note that my (incorrect) use of FastRandom generates many equal
values. This is done by intention, to simulate collection where many
elements are equal.)

Kind regards
Runar
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc


[hidden email]
16 January, 2012 4:41 PM


What problems are you seeing with the default implementations?

Ive used sorted collections with up to 5 million instances before.  While it wasnt blazingly fast - the delay in creating the collection was only about 10-15 seconds.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc


[hidden email]
16 January, 2012 4:35 PM


Does there exist any sorted collection or sorted dictionary
implementations that are capable of holding “many” objects? By many I
mean hundreds of thousands. I guess the internal representation has to
use a B-Tree or a similar structure, not an array.

I am looking for a regular Smalltalk in-memory collection, not a
collection for an object-oriented database.

Kind regards
Runar

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Image capture scratch on OSX

Pierre-34
In reply to this post by Alan Knight-2
Hi all,
The image capture (UiMaskEditor, accessible from main vw Menu: Painter -> Image Editor) causes a complete scratch of VW on Mac OSX.
Any idea to solve that ??
Thank's for your help,
    Pierre

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc