+1!
-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On behalf of Andres Valloud
Sent: Wednesday, 22 May 2013 09:12
To: Visualworks Mailing List
Subject: Re: [vwnc] Reducers in Smalltalk

+1... sometimes I don't understand why we focus so much on picking a winner, to the detriment of diversity. One could also say such a tendency is at play when we compare different computer languages with Smalltalk. I think we all picked our favorites a long time ago; what doesn't seem to follow for me is the apparent conclusion that, since there must be *a* favorite, everything else is unworthy of attention. We tend to pay a huge price when we want to interface Smalltalk to that outside "alien" world that doesn't want to conform to our expectations of elegance, uniformity, and simplicity --- and when we would love to share our excitement and passion for what we do. Maybe we should be a bit more aware of these kinds of limitations so that we can keep an open mind to different ways of thinking.

FYI, Gauss used to say that he learned a new language every couple of years to "keep his mind flexible".

On 5/21/13 10:16 , Steven Kelly wrote:
> As this conversation proves, one of the hardest things is to get people to actually try something new. It's more natural to respond with "I don't need it, I can cope with the basic functions I already have", or "I already know another add-on that offers some of the same benefits". Both are true, and I'm guilty on both counts. But in a sense the value of Reducers may be precisely that they offer something familiar to functional programmers, as Helge says. By giving them familiar vocabulary and things they value (e.g. parallelism), we make Smalltalk a little more palatable. And even old hands at Smalltalk may learn something new - after all, most agree that inject:into: always feels a little lacking.
> I'm really glad to see Steffen's work, and the Xtreams work of Michael and Martin. I'm not using either in anger yet, because I don't have enough pain from the existing base collection and stream support to justify adding a dependency on a non-base component. But I do think my brain benefitted from being stretched in undergraduate days with FP, and more recently with a little greenfield project using Xtreams.
> All the best,
> Steve
>
> ------------------------------------------------------------------------
> *From:* [hidden email] on behalf of Nowak, Helge
> *Sent:* Tue 21/05/2013 17:00
> *To:* john woods
> *Cc:* [hidden email]
> *Subject:* Re: [vwnc] Reducers in Smalltalk
>
> Dear John,
>
> in general I agree.
>
> I think the real issue is that Reducers come from a mindset different from Smalltalk's: functional programming, which is rooted in mathematics. From my point of view, functional programming looks rather at "how to implement functionality", whereas Smalltalk's philosophy is "what domain to model". As a programmer of non-mathematical software I have an intuitive notion of what a Collection is (a collection of things). Yet what is a "Reducible"? Naively: something that can be reduced, i.e. a value?? If I were a programmer of mathematical software, though, I might have an intuitive notion of what a Reducible is, just as I have a notion of vectors, matrices etc. Having said that, I think Reducers (a Do-er: is this OO thinking at all?) may have their place, yet not as a general concept.
>
> To avoid being misunderstood: I don't judge anything like implementation details or claimed benefits of the one or the other implementation. And I don't want to belittle the work Steffen has done. I am just talking about the philosophy behind it.
>
> Cheers
>
> Helge
>
> *From:* [hidden email] [mailto:[hidden email]] *On behalf of* john woods
> *Sent:* Tuesday, 21 May 2013 15:02
> *Cc:* [hidden email]
> *Subject:* Re: [vwnc] Reducers in Smalltalk
>
> (((1 to: 10) reducer filter: #odd) map: #squared) reduce: #+ init: 0.
>
> It can be written the classic way:
>
> (((1 to: 10) select: #odd) collect: #squared) inject: 0 into: #+.
>
> Which creates two intermediate collections. Again, we can avoid them:
>
> (1 to: 10) inject: 0 into:
>     [:sum :n | n odd
>         ifTrue: [sum + n squared]
>         ifFalse: [sum]]
>
> But now we have to think in terms of #inject:into: and craft the behavior of #select: and #collect: into the block by hand.
> This style is harder to read, especially for more complex conditions and mappings.
>
> Actually, I'd like to completely disagree. It is, in my opinion, much easier to see what the latter does. The easiest to read is the most naive. Write the following extension methods:
>
> Given Collection>>sum, (((1 to: 10) select: #odd) collect: #squared) sum.
>
> Or you could use Collection>>inject:into: here instead of writing Collection>>sum. Or you could go the other way and implement, for instance, Collection>>oddElements etc., and just do (1 to: 10) allOddElements sumOfSquares.
>
> What I really don't want to see is unnecessary complexity in the code because programmers think that it will be faster or more efficient. For instance, one of your concerns was the creation of an intermediate block. But you have offered very little proof that this is even a problem. Who knows what the VM will do, especially if it does JIT optimization, or uses the GPU?
>
> As a full-time performance engineer I spend a lot of time looking at incomprehensible code (often not Smalltalk) which has been 'pre-optimized' by developers, and this pre-optimization is largely (though not solely) responsible for its poor readability. Even where their efforts work (and you would be *amazed* how often they have zero or even negative effects on real-world performance), the cost of maintaining such obscurantist code is nearly always vastly more than finding better hardware. In short, write your application first, then profile it, and *then* optimize it - and only those bits which need it. The observed performance problems will very rarely coincide with the expectations of even the most talented developers.
>
> These Reducer things do not look self-explanatory, and to me that is adverse to the entire spirit of Smalltalk. My view is that I would need to see *much* stronger evidence of *very* significant advantages of this approach before I would even consider using it.
http://books.google.com/books?id=_dce9Jaq7iQC&pg=PA238&lpg=PA238&dq=gauss+learning+new+languages&source=bl&ots=nzpnEIUEDT&sig=VNtIBa-b9zBGfRPBqKdQfecHXbE&hl=en&sa=X&ei=LnOcUaKcLub_igKk9oCgAQ&ved=0CGMQ6AEwBQ#v=onepage&q=gauss%20learning%20new%20languages&f=false
On 5/22/13 0:17 , Nowak, Helge wrote:
> +1!
In reply to this post by Vincent Lesbros-2
Hi Vincent,
I'll try to outline some ideas according to my understanding of your problem. This answer is actually quite long and took quite some time; sorry for being lengthy.

First, I want to spoil you a bit before diving into the details. We could come up with something like

> analysis := validate <> calculated <> samples.
> result := (analysis from: numbers) reduce: #& init: true.

Where
* samples takes the samples
* calculated does the calculation
* validate does some test on the calculated values

Finally, "reduce: #& init: true" simply calculates the logical conjunction of the results.

1) Sampling

If it is sufficient to have a fixed distance between your samples, you can use:

> distance := 100.
> numbers reducer
>     filter: [:index :n | (index \\ distance) isZero].

But maybe you want to take random samples with a certain expected distance:

> probability := 1 / distance.
> random := Random fastest.
> numbers reducer
>     filter: [:n | random next <= probability].

You can write both filters in a more compositional way to reuse them.

> equidistant := Reducer filter: [:index :n | (index \\ distance) isZero].
> randomized := Reducer filter: [:n | random next <= probability].
> "apply to collection"
> equidistant from: numbers.
> randomized from: numbers.
> "even combined"
> randomized <> equidistant from: numbers.

Depending on the representation of the numbers collection this approach might be inefficient, since filter: (and select: as well) will enumerate the numbers. For example, arrays afford direct access via the indices, e.g.

> indices := 1 to: numbers size by: distance.
> indices collect: [:i | numbers at: i].
> "or lazily"
> indices reducer map: [:i | numbers at: i].

However, reducers make very few assumptions about the reducible (read: collection). Namely, it has to implement #reduce:init:. Thus, reducers can exploit their strengths if, for example,
- the reducible's type is not known in advance
- the reducible's size is unknown/infinite
- the reducible provides only sequential access (e.g., generated/streamed elements)
- et cetera

Then you can do things like:

> "take 100 random elements"
> (randomized from: numbers) take: 100.
> "sample only in the index interval (100, 1000)"
> equidistant from: ((numbers drop: 99) take: 900).
> "or more compositional"
> aHundred := Reducer take: 100.
> aHundred <> randomized from: numbers.

You may find that the equidistant/randomized filters are useful enough to make them available to all reducibles. If so, you want to implement two transformer classes, e.g., TakingEquidistant and TakingRandomized, and extend Reducer with convenience methods like
* Reducer>>takeEvery: distance
* Reducer>>takeRandom: probability
Have a look into Taking or TakingWhile for the details. (Maybe the convenience methods will be generated automagically in future iterations of Reducers.)

2) Calculations on Elements aka Mapping

I suppose we split the work on the samples into a calculation and a verification step. We can express both as mappings:

> calculated := Reducer map: [:each | "expensive calculations"].
> validate := Reducer map: [:each | "boolean condition"].

3) Caching or Memoizing

This is difficult to answer, since those optimizations heavily depend on your calculations and data. As John pointed out earlier, profiling should be your guide here. That said, reducers allow you to apply your own caching strategies. Just an example: Travis Griggs implemented a small memoizing API for blocks (TAG-MemoizedFunctions; http://objology.blogspot.de/2012/05/tag-memoizedfunctions.html). You could use it to speed up the calculations:

> calculated := Reducer map: [:each | "expensive calculations"] memoizing.

Applying this optimization can be tricky. For example, if the samples are randomized, it is likely that the speedup is minimal. Even worse, performance may decline due to increased memory consumption.

If you have to perform many similar calculations on the same sample set, you might want to aggregate the samples in a collection:

> (samples from: numbers)
>     reduce: [:set :each | set add: each; yourself] init: Set new.

As always, the right strategy depends.

4) Wrap Up

With the ideas above, we can complete the introductory example.

> samples := (Reducer take: 100) <> equidistant.
> calculated := Reducer map: [:each | "expensive calculations"] memoizing.
> validate := Reducer map: [:each | "boolean condition"].
> analysis := validate <> calculated <> samples.

Now, the analysis can be applied multiple times to different sources, e.g.,

> result := (analysis from: numbers) reduce: #& init: true.

Since "equidistant" uses indices, "numbers" is required to be indexed. We can fix this by defining an indexing step.

> count := 0.
> indexed := Reducer mapKeys: [:n | count := count + 1].

Now we can apply the analysis to arbitrary reducibles:

> result := (analysis <> indexed from: numbers) reduce: #& init: true.
> count := 0.

I hope this mail gives a first impression.

Kind regards,
Steffen
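To make the recipe above concrete, here is a minimal end-to-end sketch. It assumes only the protocol shown in this mail (reducer, Reducer filter:, map:, take:, <>, from: and reduce:init:); the 1000-element collection, the every-10th sampling and the 0-to-1 range check are arbitrary illustration values, not taken from the original mails.

| numbers equidistant squared inRange analysis result |
"an indexed source: the fractions 1/1000 ... 1000/1000"
numbers := (1 to: 1000) collect: [:i | i / 1000].
"sampling: keep every 10th element, selected by its index"
equidistant := Reducer filter: [:index :n | (index \\ 10) isZero].
"calculation step"
squared := Reducer map: #squared.
"validation step: one boolean per sampled element"
inRange := Reducer map: [:each | each between: 0 and: 1].
"compose right to left: sample, then calculate, then validate"
analysis := inRange <> squared <> equidistant.
"fold the booleans into a single verdict (true for this data)"
result := (analysis from: numbers) reduce: #& init: true.

Because analysis is just a composed transformation, the same object can be reused on any other indexed source, e.g. analysis from: otherNumbers, as the mail notes.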
In reply to this post by Nowak, Helge
++1
On 22.05.2013, 09:17, Nowak, Helge <[hidden email]> wrote:
> +1!
In reply to this post by Steffen Märcker
Thank you, Steffen,
I am convinced, I will try them.

2013/5/22 Steffen Märcker <[hidden email]>:
> Hi Vincent,
> I'll try to outline some ideas according to my understanding of your problem.
Nice, have fun! =)
I'd be interested in your feedback and suggestions for improvement once you have used them.

Regards,
Steffen

On 22.05.2013, 16:57, Vincent Lesbros <[hidden email]> wrote:
> Thank you, Steffen,
> I am convinced, I will try them.
ok,
It is not so easy to speak English, so I prefer to speak Smalltalk. ;-)

Below is a sample of simple calculations on a collection of numbers, a sliding-window sum of squares, done with two different methods. The first one is certainly easier to understand, but it runs in O(N*M) time; the second one runs in O(N) time, but... (there is always a but) it is perhaps not as precise. What do you think of a Reducer implementation for that precise case?

| alea numbersSize windowSize numbers method1 method2 temp t1 t2 delta |
numbersSize := 100000.
windowSize := 1000.
alea := Random new.
numbers := (1 to: numbersSize) collect: [:index | alea next].

"Method 1: recompute each window sum from scratch"
t1 := Time millisecondsToRun: [
	method1 := (1 to: numbersSize - windowSize + 1) collect: [:index |
		(0 to: windowSize - 1)
			inject: 0
			into: [:sum :deltaIndex | sum + (numbers at: index + deltaIndex) squared]]].

"Method 2: slide the window, updating the sum incrementally"
t2 := Time millisecondsToRun: [
	temp := (1 to: windowSize)
		inject: 0
		into: [:sum :index | sum + (numbers at: index) squared].
	method2 := Array new: numbersSize - windowSize + 1.
	1 to: numbersSize - windowSize do: [:idx |
		method2 at: idx put: temp.
		temp := temp - (numbers at: idx) squared + (numbers at: idx + windowSize) squared].
	method2 at: numbersSize - windowSize + 1 put: temp].

"Delta between the two methods"
delta := (1 to: numbersSize - windowSize + 1) collect: [:index |
	(method1 at: index) - (method2 at: index)].

^Array with: method1 -> t1 with: method2 -> t2 with: delta

(t1 is about 6000 ms, t2 about 14 ms, and at the end delta contains numbers on the order of 1.0d-12.)
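The ~1.0d-12 delta comes from floating-point rounding accumulating in Method 2's running sum. If that drift ever matters, one common compromise, shown as a sketch below, is to keep the incremental update but recompute the window sum exactly every few thousand steps, which bounds the error at negligible extra cost. It reuses the workspace variables from the script above; resyncEvery is a new, illustrative variable (declare it with the other temporaries) and is not part of the original snippet.

"Method 2b: incremental sum, re-summed exactly every resyncEvery steps to bound rounding drift"
resyncEvery := 10000.
temp := (1 to: windowSize)
	inject: 0
	into: [:sum :index | sum + (numbers at: index) squared].
method2 := Array new: numbersSize - windowSize + 1.
1 to: numbersSize - windowSize do: [:idx |
	method2 at: idx put: temp.
	(idx \\ resyncEvery) isZero
		ifTrue: [
			"periodically recompute the next window sum exactly"
			temp := (idx + 1 to: idx + windowSize)
				inject: 0
				into: [:sum :i | sum + (numbers at: i) squared]]
		ifFalse: [
			temp := temp - (numbers at: idx) squared + (numbers at: idx + windowSize) squared]].
method2 at: numbersSize - windowSize + 1 put: temp.

With N = 100000, M = 1000 and resyncEvery = 10000 this adds only ten exact re-summations, so the running time stays close to Method 2 while the drift is reset regularly.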
In reply to this post by Steven Kelly
+ 1
I like this discussion. We were thinking (not deeply) about how to improve collection iteration to avoid intermediate collections without losing readability.

Steffen, did you consider presenting something at ESUG? Because we could have some nice conversations :) Or writing something for the workshop? I would love to read something on that, and a comparison with Xtreams.

Stef
In reply to this post by Vincent Lesbros-2
Hi Vincent,
yours is an excellent example of the importance of choosing the right algorithm. At the end of this mail you'll find a reducer variant of the first, naive algorithm.

Let's have a look at a second alternative. Its basic idea is similar to your second method. However, it does not rely on array access via indices. Thus, it will work with arbitrary reducibles, even infinite sources. (For example: r := Random new. [r next] reducer.)

The idea is to split the work into 4 steps:
1. map to squares
2. map to windows
3. compute the window sum
4. drop the sums of incomplete windows

First, assume you have an efficient buffer implementation, e.g. a RingBuffer class.

*** Sliding Window with RingBuffer class ***

| squared buffer windows tmpSum sum drop |
"map to squares"
squared := Reducer map: #squared.

"map to windows of size = windowSize + 1"
buffer := RingBuffer new: windowSize + 1 withAll: 0.
windows := Reducer map: [:each | buffer removeFirst; addLast: each; yourself].

"map to window sum"
tmpSum := 0.
sum := Reducer map: [:window | tmpSum := tmpSum + window last - window first].

"drop sums of incomplete windows"
drop := Reducer drop: windowSize - 1.

"build recipe for a collection of window sums"
results := drop <> sum <> windows <> squared from: numbers.

Second, to have a workspace script that does not depend on a buffer implementation, consider the following code. Since it simulates the buffer, the code is more verbose.

*** Sliding Window with Simulated RingBuffer ***

| squared bufferSize buffer bufferIndex windows tmpSum sum drop |
squared := Reducer map: #squared.

bufferSize := windowSize + 1.
buffer := Array new: bufferSize withAll: 0.
bufferIndex := 0.
windows := Reducer map: [:each |
	bufferIndex := (bufferIndex \\ bufferSize) + 1.
	buffer at: bufferIndex put: each; yourself].

tmpSum := 0.
sum := Reducer map: [:window | | first last |
	last := bufferIndex.
	first := (bufferIndex \\ bufferSize) + 1.
	tmpSum := tmpSum + (buffer at: last) - (buffer at: first)].

drop := Reducer drop: windowSize - 1.
results := drop <> sum <> windows <> squared from: numbers.

*** Aggregate Results ***

results
	reduce: [:col :each | col add: each; yourself]
	init: (OrderedCollection new: numbersSize - windowSize + 1).

As promised, here is the naive algorithm.

*** Naive Algorithm ***

| windows sum |
windows := (1 to: numbersSize - windowSize + 1) reducer
	map: [:i | i to: i + windowSize - 1].
sum := Reducer map: [:window |
	window reduce: [:sum :i | sum + (numbers at: i) squared] init: 0].
results := sum from: windows.

Kind regards,
Steffen
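The mail above assumes a RingBuffer class without giving one. A minimal sketch of such a class (hypothetical code, not part of the Reducers package; instance variables array, start and tally, with the dialect-specific class-definition expression omitted) that supports exactly the protocol used above (new:withAll:, first, last, removeFirst and addLast:) could look like this:

RingBuffer class>>new: capacity withAll: value
	"Answer a full buffer of the given capacity, pre-filled with value."
	^self new setArray: (Array new: capacity withAll: value)

RingBuffer>>setArray: anArray
	array := anArray.
	start := 1.
	tally := anArray size

RingBuffer>>first
	"Answer the oldest element."
	^array at: start

RingBuffer>>last
	"Answer the newest element."
	^array at: ((start + tally - 2) \\ array size) + 1

RingBuffer>>removeFirst
	"Drop and answer the oldest element."
	| oldest |
	oldest := array at: start.
	start := (start \\ array size) + 1.
	tally := tally - 1.
	^oldest

RingBuffer>>addLast: anObject
	"Append anObject as the newest element; callers keep tally <= capacity by removing first."
	array at: ((start + tally - 1) \\ array size) + 1 put: anObject.
	tally := tally + 1.
	^anObject

With such a class the first script runs unchanged: buffer removeFirst; addLast: each; yourself answers the buffer itself, and the sum step only relies on first answering the oldest and last answering the newest square in the window.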
In reply to this post by stephane ducasse-2
Thanks for the suggestion, Stephane. Actually, I never thought about presenting at ESUG. Sure, I'd like to contribute a presentation or a paper, but I am afraid it is still too early. And since Reducers is a spare-time project and the next weeks will be busy, I won't meet the deadline in June. But maybe I'll be in beautiful Annecy in September nevertheless. =)

Meanwhile, I want to mature Reducers and keep this discussion open. I'd be happy about any feedback, ideas and contributions to Reducers. Stephane, perhaps you want to share your thoughts on improving collection iteration here, too? Is there an ongoing discussion for Pharo? If Reducers prove to be useful, a comparative article and a Pharo port will be logical next steps. What do you think?

Just some background: I read about Clojure's reducers library, since a friend of mine wanted to use it in his upcoming project. After wrapping my head around Rich Hickey's blog posts on reducers (links below), I fetched the source and did the initial port on a train back from Avignon to Leipzig. It turned out to be really refreshing to figure out how to do this in Smalltalk. =)

http://clojure.com/blog/2012/05/08/reducers-a-library-and-model-for-collection-processing.html
http://clojure.com/blog/2012/05/15/anatomy-of-reducer.html

Kind regards,
Steffen

On 22.05.2013, 19:20, stephane ducasse <[hidden email]> wrote:
> + 1
> I like this discussion. We were thinking (not deeply) how to improve collection iteration to avoid intermediate collections without the loss of readability.
On May 25, 2013, at 11:30 AM, Steffen Märcker <[hidden email]> wrote:

> Thanks for the suggestion, Stephane. Actually, I never thought about presenting at ESUG. Sure, I'd like to contribute a presentation or a paper, but I am afraid it is still too early. And since Reducers is a spare-time project and the next weeks will be busy, I won't meet the deadline in June. But maybe I'll be in beautiful Annecy in September nevertheless. =)

;) Just send a presentation proposal: that way you will kick yourself into getting a presentation ready by September :)

> Meanwhile, I want to mature Reducers and keep this discussion open.

Yes, please continue. We need superpowers, and I do not really like blocks inside blocks :)

> I'd be happy about any feedback, ideas and contributions to Reducers. Stephane, perhaps you want to share your thoughts on improving collection iteration here, too? Is there an ongoing discussion for Pharo?

Not really, we are too busy, but I love this discussion.

> If Reducers prove to be useful, a comparative article and a Pharo port will be logical next steps. What do you think?

I would love that. I could also help to write an article about them in the future (not because I want to steal ideas or need publications, but because I'm slow to understand, so this forces me to write nicely). So I really suggest thinking about a presentation at ESUG, so that afterwards we can drink beers and discuss in front of a blackboard - and this year ESUG will again be in a cool place with good food and energy.
In reply to this post by Steffen Märcker
I've just published the missing unit test suite for Reducers, along with a minor change to Reducers itself. Both 'Reducers' and 'Reducers Tests' are at version 0.5 now.

Have fun!
Steffen
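To give a feel for what such a test can exercise, here is a minimal SUnit-style sketch. The class name ReducersTest and the selector are illustrative only; the actual 'Reducers Tests' package may be organized quite differently. It reuses the filter/map/reduce example from earlier in this thread (odd numbers from 1 to 10, squared and summed).

ReducersTest>>testFilterMapReduce
	"1, 3, 5, 7 and 9 squared and summed: 1 + 9 + 25 + 49 + 81 = 165."
	| result |
	result := (((1 to: 10) reducer filter: #odd) map: #squared)
		reduce: #+ init: 0.
	self assert: result = 165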
In reply to this post by Steffen Märcker
Hi,

I am interested in your opinion about the design alternatives below. The alternatives and their trade-offs are rather generic, even if the background here is Reducers. Please share your experience with similar designs.

Some preliminaries: a Reducer implements the methods #map:, #drop:, etc. by creating a new reducer object with a corresponding transformer object. The latter is an instance of a subclass of ReducingFunctionTransformer.

1. Current Implementation
Each operation (#map:, #drop:, etc.) needs two convenience methods, e.g., Reducer>>drop: and Reducer class>>drop:. These methods just instantiate the correct transformer object (e.g., Dropping number: n) and then create a new reducer instance.
+ tools see that Reducers implement the methods
+ fast
- code duplication
- harder to extend with new operations
- Reducers have to know all Transformer classes

2. Dynamic Factory Approach
The class Reducer does not implement the convenience methods directly, but asks ReducingFunctionTransformer via #doesNotUnderstand: to create an appropriate transformer object, e.g., for #drop:. For details see the code example at the end of this mail.

a) ReducingFunctionTransformer queries all its subclasses for an implementor of the required operation. Therefore, the subclasses implement a method that answers whether an operation is supported.
+ no code duplication
+ easy to extend (dynamically)
+ cleaner, decoupled design
- slow (about 2x)
- tools do not know that Reducer supports the operations

b) ReducingFunctionTransformer queries registered classes for an implementor of the required operation. Therefore, the transformer classes have to register themselves with their supported operation at ReducingFunctionTransformer (e.g., on class initialization).
+ no code duplication
+ easy to extend (statically)
+ cleaner, decoupled design
- slightly slower (no measurements)
- tools do not know that Reducer supports the operations
- more fragile: (de)registration on class initialization

The three approaches have specific strengths in speed, extensibility and tool support. Lookup time only matters when creating a reducer; there is no impact during the actual reduction of a collection. Thus, 2a) and 2b) are only expensive when many _different_ reducers are instantiated. However, this might be a common case.

Which design would you prefer? I am looking forward to your answers!

Regards,
Steffen


Reducer>>doesNotUnderstand: aMessage
	| aTransformer |
	aTransformer := [ReducingFunctionTransformer
				forTransformation: aMessage selector
				withArguments: aMessage arguments]
			on: NotFoundError
			do: [super doesNotUnderstand: aMessage].
	^self apply: aTransformer

ReducingFunctionTransformer class>>forTransformation: aSelector withArguments: arguments
	^(self classForTransformation: aSelector)
		perform: aSelector
		withArguments: arguments

ReducingFunctionTransformer class>>classForTransformation: aSelector
	^self allSubclasses detect: [:each | each canHandleTransformation: aSelector]

Dropping class>>canHandleTransformation: aSelector
	^aSelector == #dropNew:
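For completeness, here is a minimal sketch of what the registration-based lookup of variant 2b) could look like. The class-side registry (a Dictionary held in a class-instance variable named registry) and the selectors #registry and #registerTransformation:for: are assumptions made only for this illustration; of the selectors below, only #classForTransformation: appears in the code above.

ReducingFunctionTransformer class>>registry
	"Lazily initialize the class-side Dictionary mapping operation selectors to transformer classes.
	 Assumes a class-instance variable named registry."
	^registry ifNil: [registry := Dictionary new]

ReducingFunctionTransformer class>>registerTransformation: aSelector for: aTransformerClass
	"Called by transformer classes, e.g., from their class-side #initialize."
	self registry at: aSelector put: aTransformerClass

ReducingFunctionTransformer class>>classForTransformation: aSelector
	"Constant-time lookup instead of scanning allSubclasses; signals NotFoundError if nothing is registered."
	^self registry at: aSelector ifAbsent: [NotFoundError signal]

Dropping class>>initialize
	"Registration on class initialization, as described for variant 2b)."
	ReducingFunctionTransformer registerTransformation: #dropNew: for: self

Compared with 2a), the #detect: loop over all subclasses disappears, at the price of the registration step and the extra class-side state.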
In reply to this post by Steffen Märcker
Hi,
I published some updates with new features today. Reducers is at version 0.6 now. What has changed lately:

1) Readable printString representation
2) Result aggregation
3) Streams support
4) Concatenation (lazy)

I'd be happy to hear your thoughts and criticism of the new features. Let's elaborate a bit on each point.

1) Readable printString representation
--------------------------------------
Vincent Lesbros pointed out that the default printString of Reducers was not helpful in development. Now, the whole reducer chain is printed in a way similar to its construction. Blocks are printed decompiled in one line. For example:

> (Reducer filter: [:each | each odd])
>   <> (Reducer map: #squared)
>   from: (1 to: 10).

is printed (without CRs) as

> (Reducer filter: [:t1 | t1 odd])
>   <> (Reducer map: #squared)
>   from: (1 to: 10)

Do you find this format more handy?

2) Result aggregation
---------------------
In previous versions it was tiresome to collect the results of the transformations. Over and over again, you had to write something similar to:

> results := ((1 to: 10) reducer map: #squared)
>   reduce: [:dict :key :val |
>     dict
>       at: key put: val;
>       yourself]
>   init: Dictionary new.

Result aggregation can now be done with #into:, which takes either a collection, a collection class or a stream. For example,

> results := ((1 to: 10) reducer map: #squared)
>   into: Dictionary.

or

> results := #(a b c) asOrderedCollection.
> ((1 to: 10) reducer map: #squared)
>   into: results.

Reducer provides the method #from:into:, which comes in handy with concatenated reducers:

> (Reducer filter: #odd) <> (Reducer map: #squared)
>   from: (1 to: 10) into: OrderedCollection.

3) Streams support
------------------
It is now possible to reduce streams (#reduce:init:) and aggregate results in a stream (#into:). For example:

> results := #(a b c) copy writeStream.
> ((1 to: 10) readStream reducer filter: #odd)
>   into: results.

Reducers use streams as sources and drains but do not aim to duplicate core stream functionality. In particular, the various read/write optimizations remain the responsibility of the sources and drains, respectively.

4) Concatenation (lazy)
-----------------------
Reducibles can be concatenated efficiently (lazily) with #cat:. A concatenation yields a reducible that holds on to its two source reducibles. It then simply delegates #reduce:init: to them. Thus, no combined collection is created, and changes to the source collections are reflected in the concatenation. For example:

> numbers := (0 to: 9) cat: (10 to: 100 by: 10).
> (numbers reducer map: #squared) into: OrderedCollection

Regards,
Steffen
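To illustrate the delegation described under 4), here is a minimal sketch of how such a concatenation reducible might implement #reduce:init:. The class name LazyConcatenation and its instance variables first and second are hypothetical; the published package may structure this differently.

LazyConcatenation>>reduce: aBlock init: anInitialValue
	"Reduce the first source, then continue the reduction over the second source,
	 starting from the intermediate result. No combined collection is ever built."
	| intermediate |
	intermediate := first reduce: aBlock init: anInitialValue.
	^second reduce: aBlock init: intermediate

Because the concatenation only holds references to its two sources, later changes to them are naturally visible when the reduction actually runs.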
Another nice point of #into: is that it decouples the processing of a collection/reducible from the result representation, i.e., its type. For example, suppose you have to map the elements of an OrderedCollection and store them in a Set. Using #collect: you might come up with

> Set withAll: (orderedCol collect: [:each | ...]).

which produces an intermediate OrderedCollection. This can be avoided with Reducers by rewriting it to

> (orderedCol reducer map: [:each | ...]) into: Set.

or

> (Reducer map: [:each | ...]) from: orderedCol into: Set.

Just my two cents,
Steffen
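For readers curious how a single #into: can accept a collection, a collection class, or a write stream, here is one possible dispatch, offered purely as a sketch: "Reducible" stands for whatever class provides #reduce:init: and #into:, the branching shown is my guess rather than the package's actual implementation, and keyed collections such as Dictionary (see the #into: Dictionary example above) would need an extra case.

Reducible>>into: aDrain
	"Dispatch on the kind of drain: a class is instantiated first,
	 a stream is filled via #nextPut:, any other collection via #add:.
	 Keyed collections are deliberately left out of this sketch."
	aDrain isBehavior
		ifTrue: [^self into: aDrain new].
	(aDrain isKindOf: Stream)
		ifTrue: [^self reduce: [:stream :each | stream nextPut: each; yourself] init: aDrain].
	^self reduce: [:collection :each | collection add: each; yourself] init: aDrain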