Hi folks

In Fuel, we serialize a block closure with its state, including its outerContext. This lets us serialize a sorted collection together with its sortBlock:

    | bytes result |
    bytes := FLSerializer serializeToByteArray: (SortedCollection sortBlock: [:a :b | a > b ]).
    result := FLMaterializer materializeFromByteArray: bytes.
    result
        addAll: #(1 2 3);
        yourself.
    ---> a SortedCollection(3 2 1)

Here is the problem: the byte array is huge (800 KB) because we are serializing unneeded context for the sort block.

We wonder how to prune it and save time and space.

Thanks in advance
Martín
|
2011/12/2 Martin Dias <[hidden email]>:
> Hi folks
>
> In Fuel, we serialize a block closure with its state, including its
> outerContext. This enables to serialize a sorted collection with its
> sortBlock:
>
> | bytes result |
> bytes := FLSerializer serializeToByteArray: (SortedCollection sortBlock: [:a :b | a > b ]).
> result := FLMaterializer materializeFromByteArray: bytes.
> result
>     addAll: #(1 2 3);
>     yourself.
> ---> a SortedCollection(3 2 1)
>
> Here the problem: the byte array is huge! (800kb) because we are serializing
> unneeded context for the sort block.
>
> We wonder how to prune it and save time and space.
>
> Thanks in advance
> Martín

In the case of such a clean block, there is no need for the outer context during block execution. However, I don't know whether the implementation makes it possible to ignore the context... That's more a question for Eliot ;)

Nicolas |
On Fri, Dec 2, 2011 at 10:20 AM, Nicolas Cellier <[hidden email]> wrote:
2011/12/2 Martin Dias <[hidden email]>:

Arguably there is a bug in my closure implementation, which is that both the receiver and the method are fetched from the outerContext. That's not a bug which can be fixed without a new VM/image combination; it may be something I'll look at long-term, but it is something we have to live with at the moment.

This means that you *do* have to serialize the outerContext. But the outerContext is used only for the receiver and method, so you don't need to fully serialize it. In particular, you don't need to serialize any of the outerContext's stack contents or its sender. This needs special handling, I guess in BlockClosure, to substitute a suitably reduced outerContext, but it shouldn't be hard to do. e.g.

BlockClosure methods for: '*Fuel-serialization'
outerContextForSerialization
    ^MethodContext
        sender: nil
        receiver: outerContext receiver
        method: outerContext method
        arguments: #()

BlockClosure methods for: '*Fuel-serialization'
outerContextForSerialization
    ^MethodContext
        sender: nil
        receiver: self receiver
        method: self method
        arguments: #()

best, Eliot |
Thanks both. Am I right to assume that if the block refers to temp vars, parameters, or whatever in another scope, then such a solution won't work? I mean, if I have this example:

    | bytes result blah |
    blah := 42.
    bytes := FLSerializer serializeToByteArray: (SortedCollection sortBlock: [:a :b | (a + blah) > b ]).

Then 'blah' is in a different context. So the mentioned solution works for "clean" closures, which are "self contained". In the other cases (such as this example), we should serialize the whole stack. Is this correct?

Thanks!

On Fri, Dec 2, 2011 at 7:41 PM, Eliot Miranda <[hidden email]> wrote:
-- Mariano http://marianopeck.wordpress.com |
On Fri, Dec 2, 2011 at 10:55 AM, Mariano Martinez Peck <[hidden email]> wrote:

> Thanks both. I am right to assume that if the block refers to temp vars,
> parameters, or whatever in another scope, then such solution won't work.
> I mean, if I have this example for example:

No. The closure implementation arranges that any and all temporary variables accessed by the closure are directly accessible from the closure without accessing the outer contexts. You were there for my presentation, Mariano :) And you understood the issue then, right? :)

Remember, the closure compiler *copies* any temporary variables that are not changed after being closed over into a closure's copiedValues, and *moves* any temporary variables that could be changed after being closed over into indirection vectors, and these indirection vectors are copied into the copied values of the closure. But, as I just said, the receiver and method are not copied (to save time at closure creation) since these are in the outerContext already. These are fetched from the outerContext when the closure is activated.

That's the trade-off: slower closure creation, larger closures and duplication if a closure refers to its method and receiver directly; faster closure creation, smaller closures, less duplication and very slightly slower activation if closures refer to their receiver and method indirectly. The additional trade-off is this issue: the indirect approach entails more problems for serialization and for implementing compile-time clean closures (since the compiler needs to create an outerContext for the method and nil for the non-existent receiver - clean closures can't refer to self).
Is this clear or as clear as mud?
best, Eliot |
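The copying described above can be inspected from a workspace. This is a minimal sketch, assuming a Squeak or Pharo image of that era with the closure compiler; it relies only on `basicSize`, `basicAt:` and `outerContext`, and the variable names are illustrative, not taken from Fuel:

```smalltalk
| blah counter copying indirect |
blah := 42.
counter := 0.

"blah is never assigned after the block is created, so the compiler
 is expected to copy its value straight into the closure."
copying := [:a :b | (a + blah) > b].

"counter is assigned inside the block, so it is expected to live in an
 indirection vector (an Array) that is itself copied into the closure."
indirect := [counter := counter + 1].

"The copied values are the indexed (variable) part of the closure.
 Inspect or print each of these expressions."
(1 to: copying basicSize) collect: [:i | copying basicAt: i].
(1 to: indirect basicSize) collect: [:i | indirect basicAt: i].

"The receiver and method are not copied; they are reached through outerContext."
copying outerContext receiver.
copying outerContext method.
```

Inspecting the two collections side by side shows why the outer context's stack is not needed to run either block: whatever the block reads from its enclosing scope already travels with the closure.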
On Fri, Dec 2, 2011 at 8:04 PM, Eliot Miranda <[hidden email]> wrote:
Ahhhhhhh :)
Well, yes, but I usually understand on the second or third time ;)
Excellent. So everything is inside the closure. When you say "copiedValues" I guess you use the variable part of BlockClosure, is that right?
Got it.
Much clearer now. One last question. So far, I haven't found or thought of a scenario where serializing the whole stack (from a closure) makes sense. So, if I just want to serialize a closure and be able to materialize it, is the whole stack *never* necessary, in all scenarios? Or am I missing a case where the whole stack is needed? I am talking only about serializing closures, because if I directly serialize a MethodContext, it may happen that I want the whole stack (like the example I showed at ESUG, where I serialized the whole Debugger and then continued debugging in another image).

Thanks!
-- Mariano http://marianopeck.wordpress.com |
For example, would this work?

    | bytes result blah |
    blah := 42.
    bytes := FLSerializer serializeToByteArray: (SortedCollection sortBlock: [:a :b | (a + blah) > b. ^ blah ]).

Does the return have something to do with this?

Thanks
-- Mariano http://marianopeck.wordpress.com |
On Fri, Dec 2, 2011 at 8:18 PM, Mariano Martinez Peck <[hidden email]> wrote:
Of course it doesn't make sense to do a return there. It was just an example; imagine another place where there is a block closure I want to serialize and there is a return inside.
-- Mariano http://marianopeck.wordpress.com |
On Fri, Dec 2, 2011 at 8:30 PM, Juan Vuletich <[hidden email]> wrote:
Eliot Miranda wrote:

Nice. Thanks Juan. I was checking your code, and that's exactly why I asked Eliot. In your method you say:

    isClean
        "A clean closure is one that doesn't really need the home context because:
            - It doesn't send messages to self or super
            - It doesn't access any instance variable
            - It doesn't access any outer temp
            - It doesn't do ^ return"
        .....

So... my question is: WHAT do I need to serialize if I also want to be able to serialize "non-clean" closures? I mean, for each item on that list, what do I need apart from the closure instance and the receiver and method from the outerContext? The whole stack of contexts?

Thanks a lot in advance!

-- Mariano http://marianopeck.wordpress.com |
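To make the four criteria in the quoted `isClean` comment concrete, here is a small sketch with one block per case. The method name `cleanlinessExamples` and the instance variable `count` are hypothetical, invented only for illustration; the classification in each comment simply restates the list above and is not the output of a test run:

```smalltalk
"Hypothetical method, assumed to live in a class that declares
 an instance variable named count."
cleanlinessExamples
    | limit |
    limit := 10.
    ^ {
        [:a :b | a > b].          "clean: uses only its own arguments"
        [:a | a > limit].         "not clean: reads the outer temp limit"
        [:a | a > count].         "not clean: reads the instance variable count"
        [:a | self printString].  "not clean: sends a message to self"
        [:a | ^ a]                "not clean: does a ^ return from the home method"
    }
```

Assuming an `isClean` implementation like the one quoted is loaded, sending `collect: [:each | each isClean]` to the resulting array is one way to compare the implementation against these expectations.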
The following method looks complicated to me. Do you know if it is still valid in the latest Cog?

Thanks :)

eliotsClosureMeasurementsOn: m over: aFiveArgBlock
    "
    See senders.
    Or try something like:
        Smalltalk
            eliotsClosureMeasurementsOn: FileList >> #defaultContents
            over: [ :closuresCount :hasCopiedValuesForClosure :hasIndirectTemps :anyClosureHasCopied :anyClosureDoesUAR :anyClosureUsesSelf |
                (Array with: closuresCount with: hasCopiedValuesForClosure with: hasIndirectTemps with: anyClosureHasCopied with: anyClosureDoesUAR with: anyClosureUsesSelf)]

    From http://www.mirandabanda.org/cogblog/2008/11/14/mechanised-modifications-and-miscellaneous-measurements/
    by Eliot Miranda
    "
    | s nextScanStart thisClosureHasCopied closuresCount hasIndirectTemps blkPc blkSz anyClosureHasCopied anyClosureDoesUAR anyClosureUsesSelf analyzedClosures |
    closuresCount := 0.
    hasIndirectTemps := false.
    anyClosureHasCopied := anyClosureDoesUAR := anyClosureUsesSelf := false.
    s := InstructionStream on: m.
    s scanFor: [ :b |
        b = 16r8F "16r8F = 143 closure creation" ifTrue: [ closuresCount := closuresCount + 1].
        (b = 16r8A "16r8A = 138 indirect temp vector creation" and: [ s followingByte <= 127]) ifTrue: [ hasIndirectTemps := true].
        false].
    nextScanStart := m initialPC.
    analyzedClosures := 0.
    [ analyzedClosures < closuresCount ] whileTrue: [
        s pc: nextScanStart; scanFor: [ :b | b = 16r8F ]. "16r8F = 143 Search for first closure"
        analyzedClosures := analyzedClosures + 1.
        thisClosureHasCopied := s followingByte >= 16r10.
        anyClosureHasCopied := anyClosureHasCopied | thisClosureHasCopied.
        blkSz := s interpretNextInstructionFor: BlockStartLocator new. "Find out size of first closure"
        blkPc := s pc.
        s scanFor: [ :b |
            s pc >= (blkPc + blkSz)
                ifTrue: [
                    nextScanStart := s pc.
                    true]
                ifFalse: [
                    b = 16r8F ifTrue: [
                        thisClosureHasCopied := s followingByte >= 16r10.
                        anyClosureHasCopied := anyClosureHasCopied | thisClosureHasCopied.
                        analyzedClosures := analyzedClosures + 1 ].
                    anyClosureDoesUAR := anyClosureDoesUAR or: [s willReturn and: [s willBlockReturn not]].
                    anyClosureUsesSelf := anyClosureUsesSelf or: [b = 16r70 "pushSelf"
                        or: [b < 16r10 "pushInstVar"
                        or: [(b = 16r80 and: [s followingByte <= 16r3F]) "pushInstVar"
                        or: [(b between: 16r60 and: 16r60 + 7) "storePopInstVar"
                        or: [(b = 16r82 and: [s followingByte <= 63]) "storePopInstVar"
                        or: [(b = 16r81 and: [s followingByte <= 63]) "storeInstVar"
                        or: [b = 16r84 and: [s followingByte = 160]]]]]]]].
                    false]]].
    ^aFiveArgBlock valueWithArguments: (Array
        with: closuresCount
        with: hasIndirectTemps
        with: anyClosureHasCopied
        with: anyClosureDoesUAR
        with: anyClosureUsesSelf)

On Fri, Dec 2, 2011 at 9:40 PM, Mariano Martinez Peck <[hidden email]> wrote:
-- Mariano http://marianopeck.wordpress.com |
Well, some people may just need that, but there are cases where I need it completely. I mean, I need to serialize the whole possible graph of objects. This is an example of what I am doing in my PhD. We are also experimenting with remote messages and things like that, and sometimes we pass a whole stack. In that case I think it also makes sense. But again, these are particular cases. I agree that most of the time the closure will be "clean" and maybe you don't need the stack at all.

My trick could be augmented by serializing any referenced objects, but not the stack.

Yes, maybe I was not clear. When I refer to "stack" I refer to all instances of ContextPart (well, subclasses) and all the objects reachable from there.

Cheers,

-- Mariano http://marianopeck.wordpress.com |
Thank you very much Eliot and Juan for this nice discussion. I have just committed this new functionality and it seems to work fine. I ended up using Eliot's #abstractBytecodes since the one from Juan didn't work out of the box in Pharo.
On Fri, Dec 2, 2011 at 10:38 PM, Eliot Miranda <[hidden email]> wrote: Hi Juan, -- Mariano http://marianopeck.wordpress.com |
Hi guys. When we implemented this possibility of pruning closures when they are clean, one of our crazy tests started to fail. The test is not very important, but I am interested in understanding why it fails. The test (simplified) is:

    | aSortedCollection materialized |
    instanceVariableForTesting := false.
    aSortedCollection := SortedCollection sortBlock: [:a :b |
        instanceVariableForTesting ifTrue: [ a <= b ] ifFalse: [ a >= b ] ].
    materialized := self resultOfSerializeAndMaterialize: aSortedCollection.
    materialized addAll: #(2 3 1).
    aSortedCollection addAll: #(2 3 1).

Well, the thing is that in the sortBlock of the materialized SortedCollection, the closure is quite crazy: 'instanceVariableForTesting' doesn't hold the boolean but #Processor instead. WTF?? So the #addAll: fails with a MustBeBoolean. Do you think there is something I should check for this closure? I think it may be related to the instance variable...

Just for the record, what we are doing is something like:

    serializeReferencesOf: aBlockClosure with: aSerialization
        1 to: aBlockClosure basicSize do: [ :index |
            aSerialization encodeReferenceTo: (aBlockClosure basicAt: index) ].
        aBlockClosure isClean
            ifTrue: [
                aSerialization stream nextPut: 1.
                aSerialization encodeReferenceTo: aBlockClosure method.
                aSerialization encodeReferenceTo: aBlockClosure receiver.
                ]
            ifFalse: [
                aSerialization stream nextPut: 0.
                aSerialization encodeReferenceTo: aBlockClosure outerContext.
                ].
        aSerialization encodeReferenceTo: aBlockClosure startpc.
        aSerialization encodeReferenceTo: aBlockClosure numArgs.

    -----

    materializeReferencesOf: aBlockClosure with: aMaterialization
        | methodOrContext method context |
        1 to: aBlockClosure basicSize do: [ :index |
            aBlockClosure basicAt: index put: (aMaterialization nextEncodedReference). ].
        (aMaterialization stream next = 1)
            ifTrue: [
                " it was a clean BlockClosure, otherwise it should have been 0"
                method := aMaterialization nextEncodedReference.
                aBlockClosure
                    outerContext: (MethodContext
                        sender: nil
                        receiver: aMaterialization nextEncodedReference
                        method: method
                        arguments: #() )
                    startpc: aMaterialization nextEncodedReference
                    numArgs: aMaterialization nextEncodedReference.
                ]
            ifFalse: [
                context := aMaterialization nextEncodedReference.
                aBlockClosure
                    outerContext: context
                    startpc: aMaterialization nextEncodedReference
                    numArgs: aMaterialization nextEncodedReference].

    -----

Thanks in advance!

On Sun, Dec 4, 2011 at 8:33 PM, Mariano Martinez Peck <[hidden email]> wrote:
Thanks Juan. With the latest version of Eliot, the following are failing:

-- Mariano http://marianopeck.wordpress.com |