Hi. I found an interesting case where tempVectors could be used in remote scenarios. The store into a remote temp could be truly remote (not just into an outer context). I played with the following example:
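A minimal sketch of the kind of code in question (assuming a temp that is both read and written inside a full block, which is what forces the compiler to allocate it in a tempVector):

    | temp block |
    temp := 10.
    block := [ temp := temp + 1 ].
    block value.
    temp    "==> 11"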
For the moment, forget about the remote aspect and look at this as a normal local case: the temp var here is managed indirectly through a tempVector. You can see it by evaluating an expression after the first assignment:
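For example, inserting something like this right after the first assignment (assuming the hidden vector occupies the context's first temp slot; the exact index depends on how the compiler lays out the temps):

    thisContext tempAt: 1    "==> an Array such as #(10), rather than the plain value 10"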
So the value is in fact stored in the array instance and read back from it, but because of the optimization this happens outside the array's control: no #at: and #at:put: messages are sent while this code runs. The VM magically changes the state of this array (there are special bytecodes for this).

Now my remote use case. Imagine that the VM actually sent #at: and #at:put: messages to the tempVector. Then a remoting engine could transfer the temp vector (as part of the context) as a proxy, and on the remote side the block [temp := temp + 1] would actually ask the sender (the client) for the value and for the storage. All block semantics would be supported: the temp in the remote outer context would be modified. I think it would be super cool if such transparency were possible.

I played with this example using Seamless in Pharo. It already works in the way I described, but due to the VM optimization it does not provide the expected behavior. Worse than that, it actually corrupts the transferred proxy, because the proxy instance is materialized in place of the array. This leads us to the issue of the safety of tempVector operations. The following example shows how we can affect the state of a tempVector using reflection:
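A sketch of such a reflective manipulation (again assuming the hidden vector sits in the first temp slot of the defining context):

    | temp block vec |
    temp := 10.
    block := [ temp := temp + 1 ].
    vec := thisContext tempAt: 1.    "the hidden tempVector, a plain Array"
    vec at: 1 put: 100.              "store through the Array instead of through the temp"
    block value.
    temp    "==> 101: the reflective store is visible to both the block and the home context"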
It is cool that we can do this. But there is no safety check at the VM level over the tempVector object:
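One way to hit the kind of failure described below (the replacement vector has the right size but a nil slot; slot index 1 is the same assumption as before):

    | temp block |
    temp := 10.
    block := [ temp := temp + 1 ].
    thisContext tempAt: 1 put: (Array new: 1).   "swap in a fresh, nil-filled vector"
    temp + 1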
It breaks with a DNU: #+ is sent to nil; the temp became nil.
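And with a replacement vector that is too small (same setup, but the new array has no slots at all, so the indirect-temp bytecode indexes past the object):

    | temp block |
    temp := 10.
    block := [ temp := temp + 1 ].
    thisContext tempAt: 1 put: Array new.   "zero-sized replacement vector"
    temp + 1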
Sometimes it breaks with the same error; sometimes it returns a random number. I guess in these cases the VM goes past the memory boundary of the tempVector. And two exotic cases:
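First, a read-only vector (beReadOnlyObject is the standard Pharo way to mark an object read-only; the temp slot index is still an assumption):

    | temp block |
    temp := 10.
    block := [ temp := temp + 1 ].
    (thisContext tempAt: 1) beReadOnlyObject.   "mark the hidden vector read-only"
    block value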
It silently returns 11. It does not break the read-only protection, but no error is signalled either.
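Second, redirecting the vector with become (becomeForward: and a zero-sized replacement are used here purely for illustration):

    | temp block |
    temp := 10.
    block := [ temp := temp + 1 ].
    (thisContext tempAt: 1) becomeForward: Array new.   "every reference to the vector now points to an empty Array"
    block value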
It returns #(). (In Pharo, #() + 1 = #().) I used become to check how forwarding works in that case (it works fine when the array has the correct size). How can we improve this behavior, and how would it affect performance? My proposal is to send real messages to the tempVector when it is not an Array instance; then the image will decide what to do. Best regards, Denis
Hi Denis, The special bytecodes don't have to be changed: just don't use them, and replace them with regular sends at bytecode generation time (with a special compiler, or some IR translator). Everything can then be done on the image side. Or did I miss something? On Thu, 28 Mar 2019 at 20:05, Denis Kudriashov <[hidden email]> wrote:
Hi Nicolas. On Thu, 28 Mar 2019 at 19:44, Nicolas Cellier <[hidden email]>:
Sure, bytecode transformation will work. But it would be quite tricky to apply to a live execution context: it would require fixing up the context stack to take the updated method bytecode into account. Note that I am not looking for a global setting to recompile all methods in the image; I want this logic only for a concrete method/block activation. In my scenario the block is serialized and transferred together with its current context, so on the remote side I need to do something with the materialized objects to maintain normal block semantics.
I think my examples show a security hole in the VM execution logic which allows memory bounds to be violated from the image side. I did not get a segfault, but I would not be surprised if it happened in some complex real-life scenario. It may look like a specially invented case, but I think it is quite easy to hit when using or developing a low-level serialization library: as soon as you serialize context objects, by mistake or intentionally, with some substitution logic. And given that this hole needs to be closed, it would be a good opportunity to add another hook to the execution engine which could be used as in my remote scenario. So, back to my proposal in the first mail.
In reply to this post by Denis Kudriashov
Just because you have got a hammer, it doesn't mean you have to use it to solve this task. Instead of trying to mangle the way the VM handles this, you should just change the compiler to emit #at: and #at:put: instead of the temp vector bytecodes. Implementing this idea could open up new possibilities. For example, you could use custom selectors like #remoteTempAt:, which could be implemented by your proxies besides Array. Or you could introduce an Array-like class to act as temp vectors and keep Array clean of these methods. By the way, any kind of change to these primitives means that the atomicity guarantees the VM currently provides are gone if the temp vector is "remote". Levente
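A rough sketch of that idea (the selector names follow the suggestion above; the proxy class and its forwarding helpers are placeholders, not an existing API):

    Array >> remoteTempAt: index
        "local fast path: behave exactly like #at:"
        ^ self at: index

    Array >> remoteTempAt: index put: aValue
        "local fast path: behave exactly like #at:put:"
        ^ self at: index put: aValue

    MyRemoteTempProxy >> remoteTempAt: index
        "hypothetical proxy: fetch the value from the peer that owns the real vector"
        ^ self requestValueAt: index

    MyRemoteTempProxy >> remoteTempAt: index put: aValue
        "hypothetical proxy: push the store back to the owning peer"
        ^ self requestStoreAt: index value: aValue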
In reply to this post by Denis Kudriashov
Hi Denis, On Thu, Mar 28, 2019 at 2:36 PM Denis Kudriashov <[hidden email]> wrote:
It is no different from using an inst var access bytecode on an object which doesn't have enough inst vars. It is not so much a security hole as it is something the system must use correctly to avoid crashes. The same can be done by e.g. thisContext swapSender: Point basicNew. There are many such "security holes", and if you want the VM to plug them all then the VM will become very much slower.
If you want to solve this, then build a transformation for the block method when you remote a block. As others (Levente) have suggested, you can transform the bytecodes into normal sends (my blog post on the entire scheme starts by implementing it using at: and at:put: before the special bytecodes are added). But making such a change to all blocks breaks much of the Sista adaptive optimizer. We have to have the freedom to access indirect temp vectors via special-case bytecodes if we are to be able to aggressively optimize code. If indirect temp vectors are to be treated as general-purpose objects, then we are prevented from making many significant optimizations. So, as the doctor said, "don't do that".
_,,,^..^,,,_ best, Eliot
Hi Eliot. On Thu, 28 Mar 2019 at 23:29, Eliot Miranda <[hidden email]>:
OK. I expected such answers :) but asked on the chance that some cheap trick is possible, like my read-only example: it shows that there is at least a write-barrier check during this operation, and if it signalled an error it could be used to do the job. Method transformation would be quite complex to use because it needs to be applied dynamically to a live context, and it requires stack modifications on the fly. Just compiling the method in advance is not appropriate for my goal: I don't want to change the compiler globally or force the user to do it for a concrete method/class; that would not be a transparent solution. Anyway, thanks all for the answers.
Hi Denis,
- it is easy to construct a transformation from tempVector bytecode blocks (TVBB) to tempVector message blocks (TVMB), because there are no suspension points in the bytecodes and the stack heights at the start and end of the bytecodes are the same as for the message versions. So some form of

    store indirect temp bytecode
  =>
    dup                     (now value exists twice)
    dup                     (now value exists thrice)
    push indirect temp
    pop store into value location that was duped
    push index
    pop store into 2nd value
    send at:put:

will reimplement it. And then it's just a matter of remapping PCs from one to the other and lengthening jumps. A day's work or two at most.

If the transformation is done in the marshaller that remotes objects then it will be easy to substitute the transformed method and map any PCs in contexts and closures (the JIT does this kind of mapping routinely).
_,,,^..^,,,_ (phone)
Hi, I don't know if it makes sense here, but although the VM does not perform a read-only check when writing into a temp vector by default, it is possible to activate such checks through a flag I introduced last year for the incremental compactor. The overhead seemed to be minimal. Best, On Sat, Mar 30, 2019 at 3:50 AM Eliot Miranda <[hidden email]> wrote:
Hi Clément,
In reply to this post by Clément Béra
Hi Clément. On Fri, 5 Apr 2019 at 12:42, Clément Béra <[hidden email]>:
Is it a flag for compiling the VM, or an image-side one? And is it a requirement for the new compactor, so that it will be enabled by default at some point?
Hi Denis,