Hello everyone,
While using #flatCollect: on a collection, I realized that, for example, these two code snippets do not behave the same way:

#(1 (1)) flatCollect: #yourself. "Raises an error because the array does not contain only collections"
(#(1 (1)) collect: #yourself) flattened. "Returns #(1 1)"

Shouldn't these two code snippets behave the same way?

Thanks in advance for your answers.

Regards,
Julien
>Shouldn't these two code snippets behave the same way?
#flatCollect: expects aBlock to return a collection for each element (see the method comment) and flattens only one level, while #flattened expands every sub-collection it finds:

----------------------------------------------------------------
#( #(1 #(2) ) ) flatCollect: [ :x | x ]. "#(1 #(2))"
#( #(1 #(2) ) ) flattened. "#(1 2)"
----------------------------------------------------------------

PS: Using a symbol instead of a block reduces performance.

[ 1 to: 1e9 do: [ :each | each ] ] timeToRun. "0:00:00:02.463"
[ 1 to: 1e9 do: #yourself ] timeToRun. "0:00:00:11.468"

Best regards,
Henrik

-----Original Message-----
From: Pharo-dev [mailto:[hidden email]] On Behalf Of Julien Delplanque
Sent: 12 January 2017 11:27
To: Pharo Development List <[hidden email]>
Subject: [Pharo-dev] collection flatCollect: #something VS (collection collect: #something) flattened
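For readers who want #(1 (1)) to go through #flatCollect: without the error, one possible workaround is to have the block wrap every non-collection element in a one-element array. This is only a sketch and not something proposed in the thread: the variable names are made up for illustration, and it assumes #isCollection answers false for integers and true for arrays, as in a stock Pharo image.

```smalltalk
| mixed oneLevel |
mixed := #(1 (1)).
"Wrap each non-collection element so the block always answers a collection."
oneLevel := mixed flatCollect: [ :each |
	each isCollection
		ifTrue: [ each ]
		ifFalse: [ { each } ] ].
"oneLevel should now be #(1 1), matching (mixed collect: #yourself) flattened for this input."
```

Strings and symbols are collections too, so with this block they would be spliced in character by character; a stricter check would be needed if the receiver may contain them.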
On 12/01/17 12:32, Henrik Nergaard wrote:
>> Shouldn't these two code snippets behave the same way?
> #flatCollect: expects that aBlock returns a collection for each element (see method comment) and only flattens one level, while #flattened expands all sub collections it finds:
> ----------------------------------------------------------------
> #( #(1 #(2) ) ) flatCollect: [ :x | x ]. "#(1 #(2))"
> #( #(1 #(2) ) ) flattened. "#(1 2)"
> ----------------------------------------------------------------

Oh, OK, so it's a feature :)

> Ps. Using a symbol instead of a block reduces performance.
> [ 1 to: 1e9 do: [ :each | each ] ] timeToRun. "0:00:00:02.463"
> [ 1 to: 1e9 do: #yourself ] timeToRun. "0:00:00:11.468"

Wow, I used symbols to make the example clear, but I didn't know that. That's sad; I think it is sexier to use a symbol for this kind of thing. :(

Regards,
Julien
On 01/12/2017 06:45 AM, Julien Delplanque wrote:
> On 12/01/17 12:32, Henrik Nergaard wrote:
>> Ps. Using a symbol instead of a block reduces performance.
>> [ 1 to: 1e9 do: [ :each | each ] ] timeToRun. "0:00:00:02.463"
>> [ 1 to: 1e9 do: #yourself ] timeToRun. "0:00:00:11.468"
> Wow, I used symbols to make the example clear but I didn't know that.
> That's sad, I think it is sexier to use a symbol to do this kind of
> things. :(

I'm not sure what this test is supposed to show. The first one is just a loop counting to 1 billion inlined in a method. The second one is a message send of #to:do: which is implemented as a whileTrue: loop which will send the #value: message to #yourself. Essentially, it is showing the time to evaluate "#yourself value: someInt" 1 billion times.

I think that a better test to show the performance difference is this:

[ 1 to: 1000000000 do: [ :i | [ :e | e ] value: i ] ] timeToRun. "0:00:00:14.917"
[ 1 to: 1000000000 do: [ :i | #yourself value: i ] ] timeToRun. "0:00:00:07.846"

These results might lead you to believe that symbols are faster than blocks. However, the first one is also creating 1 billion blocks. If we create the block once, then blocks are faster:

[ | b |
  b := [ :e | e ].
  1 to: 1000000000 do: [ :i | b value: i ] ] timeToRun. "0:00:00:04.515"

So, if you know how many blocks you will create and how often each block is evaluated, you could come up with the optimal solution. Or, you could just write your code so the intent is expressed clearly and not worry about performance until it is needed.

John Brant
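A symbol can stand in for a one-argument block because Symbol understands #value: and performs itself as a unary message on the argument. The following is a small sketch of that equivalence; the variable names are illustrative only, and it makes no attempt to reproduce the timings quoted in this thread.

```smalltalk
| viaSymbol viaPerform viaBlock |
"Sending #value: to a symbol performs that selector on the argument."
viaSymbol := #asUppercase value: 'abc'.
"The same thing written out as an explicit #perform:."
viaPerform := 'abc' perform: #asUppercase.
"The block form that the symbol replaces."
viaBlock := [ :each | each asUppercase ] value: 'abc'.
"All three variables should now hold 'ABC'."
```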
The test was meant to show the overhead when a symbol is used as a message argument instead of a block, not the cost of evaluating each of them with #value:

--------------------------------------------
#(a b c) collect: #asUppercase. "#('A' 'B' 'C')"
#(a b c) collect: [ :each | each asUppercase ]. "#('A' 'B' 'C')"
--------------------------------------------

Best regards,
Henrik

-----Original Message-----
From: Pharo-dev [mailto:[hidden email]] On Behalf Of John Brant
Sent: 12 January 2017 14:40
To: [hidden email]
Subject: Re: [Pharo-dev] collection flatCollect: #something VS (collection collect: #something) flattened
On 01/12/2017 08:03 AM, Henrik Nergaard wrote:
> The test was meant to show the overhead when a symbol is used as a message argument instead of a block, not the evaluation of each of them by #value:
> --------------------------------------------
> #(a b c) collect: #asUppercase "#('A' 'B' 'C')"
> #(a b c) collect: [ :each | each asUppercase ]. "#('A' 'B' 'C')"
> --------------------------------------------

Passing a symbol will be faster than creating and then passing a block. If the block has already been created, then both should be about the same speed. Evaluating a block is faster than evaluating a symbol like a block (i.e., sending it the #value: message).

Your initial tests used the #to:do: message. That message gets inlined when the last argument is a one-argument block. The code for the first example, "1 to: 1e9 do: [ :each | each ]", would have been inlined to look something like this:

| current end |
end := 1e9.
current := 1.
[current <= end] whileTrue: [current := current + 1]

So there are no "real" block arguments, as the "[:each | each]" gets removed.

BTW, there is a bug in the #to:do: inlining. The first argument is evaluated before the receiver. Here's a test case that fails:

| stream |
stream := ReadStream on: #(2 1).
stream next to: stream next do: [ :i | self error: 'This should not occur' ]

This should be evaluated as "2 to: 1 do: ...". Instead it is evaluated as "1 to: 2 do: ...".

John Brant
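As a sketch of the inlining rule described above: the compiler can only rewrite #to:do: when the last argument is written as a literal block at the call site. Handing over the same code through a variable leaves an ordinary message send, and both forms compute the same result. The variable names here are hypothetical.

```smalltalk
| inlinedSum sentSum adder |
inlinedSum := 0.
"Literal one-argument block: eligible for compile-time inlining."
1 to: 10 do: [ :i | inlinedSum := inlinedSum + i ].

sentSum := 0.
adder := [ :i | sentSum := sentSum + i ].
"Block held in a variable: a real #to:do: send to the receiver 1."
1 to: 10 do: adder.
"Both sums should be 55; only the way the loop is executed differs."
```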