Hi all,

While digging through the implementation of FullBlockClosure with Hernan, we were wondering about full closures that ignore their outerContext. We noticed that the following method has two senders:
EncoderForSistaV1>>#genPushFullClosure:numCopied:receiverOnStack:ignoreOuterContext:

One of them seems to be the only one in use, and it passes false as the default for both receiverOnStack and ignoreOuterContext. The other sender (BytecodeEncoder>>#sizePushFullClosure:numCopied:receiverOnStack:ignoreOuterContext:), in turn, doesn't seem to have any further senders.

Unless we missed something, it looks like the outerContext will never be ignored at the moment. Similarly, the receiver is never on the stack. Is this something only Scorch can do, or is this just "not yet implemented"?

When can the outerContext be ignored? When does it make sense to pop the receiver from the stack?

And where can we find the latest version of Scorch? Is it still the one at [1]?

Cheers,
Fabio

[1] https://github.com/clementbera/Scorch

Hi Fabio,

On Mon, Dec 14, 2020 at 1:33 PM Fabio Niephaus <[hidden email]> wrote:
> When can the outerContext be ignored?

When the Sista optimizer determines that it isn't needed. That is, this option is never used in vanilla code, but exists so that an optimizing compiler can avoid the overhead in cases where it wants to avoid inlining but knows there is no real suspension point during some evaluation. Now, whether we'll ever use this facility I can't say, but it was certainly in Clément's mind to do so at some point.

> When does it make sense to pop the receiver from the stack?

The point isn't really to pop the receiver from the stack. The point is to be able to take the closure's receiver from the stack rather than it being implicitly the receiver of the current method. If closure creation gets inlined by the optimizer, then there will potentially be a mismatch between the current method's receiver and an inlined closure's receiver, which necessitates having the facility to specify a distinct receiver.

> And where can we find the latest version
> of Scorch. Is it still the one at [1]?

http://smalltalkhub.com/mc/ClementBera/Scorch/main

If you're interested in looking at Scorch, I'm very interested in collaborating. And there is one significant modification to perform first which will make development much easier: restructuring the interface between the optimizer and the image via mirrors, allowing the optimizer to be mated with an image being simulated, rather than having to be a full peer of the image it is optimizing.

Cheers,
_,,,^..^,,,_
best, Eliot
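
Eliot's description of the two flags can be modelled with a small sketch. This is a toy model in Python, not the VM's code: the `Context` and `FullClosure` classes and the `push_full_closure` and `encode_flags` helpers are hypothetical names made up for illustration. The bit layout (bit 7 for receiverOnStack, bit 6 for ignoreOuterContext, low six bits for numCopied) and the assumption that the receiver sits below the copied values on the stack are my recollection of the SistaV1 push-full-closure bytecode and should be checked against the EncoderForSistaV1 class comment.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

# Assumed layout of the flag byte (not confirmed by this thread).
RECEIVER_ON_STACK = 0x80     # assumed: bit 7
IGNORE_OUTER_CONTEXT = 0x40  # assumed: bit 6
NUM_COPIED_MASK = 0x3F       # assumed: low six bits


@dataclass
class Context:
    """Toy stand-in for a method activation."""
    receiver: Any
    stack: List[Any] = field(default_factory=list)


@dataclass
class FullClosure:
    """Toy stand-in for a FullBlockClosure."""
    block: str
    receiver: Any
    outer_context: Optional[Context]
    copied: List[Any]


def encode_flags(num_copied, receiver_on_stack=False, ignore_outer_context=False):
    """Pack numCopied and the two booleans into one byte."""
    if not 0 <= num_copied <= NUM_COPIED_MASK:
        raise ValueError("numCopied out of range")
    flags = num_copied
    if receiver_on_stack:
        flags |= RECEIVER_ON_STACK
    if ignore_outer_context:
        flags |= IGNORE_OUTER_CONTEXT
    return flags


def push_full_closure(ctx, block, flags):
    """Create a closure in ctx according to the flag byte and push it."""
    num_copied = flags & NUM_COPIED_MASK
    # The copied values are the topmost num_copied stack entries.
    split = len(ctx.stack) - num_copied
    copied = ctx.stack[split:]
    del ctx.stack[split:]
    if flags & RECEIVER_ON_STACK:
        # An optimizer that inlined closure creation supplies the receiver
        # explicitly (assumed here to sit just below the copied values).
        receiver = ctx.stack.pop()
    else:
        # Default: the closure shares the current method's receiver.
        receiver = ctx.receiver
    # Without an outer context the closure cannot do a non-local return.
    outer = None if flags & IGNORE_OUTER_CONTEXT else ctx
    closure = FullClosure(block, receiver, outer, copied)
    ctx.stack.append(closure)
    return closure
```

With both flags false, which is what the only active sender passes today, the closure takes the method's receiver and keeps a reference to the creating context. With both set, the receiver is popped from the stack and the outer context is left empty.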
Hi Eliot,

On Tue, Dec 15, 2020 at 3:06 AM Eliot Miranda <[hidden email]> wrote:
>> When can the outerContext be ignored?
>
> When the Sista optimizer determines that it isn't needed.

Ok, thanks for the info!

>> When does it make sense to pop the receiver from the stack?
>
> The point is to be able to take the closure's receiver from the stack rather than it being implicitly the receiver of the current method.

Makes sense, thanks!

>> And where can we find the latest version
>> of Scorch. Is it still the one at [1]?
>
> http://smalltalkhub.com/mc/ClementBera/Scorch/main

Oh, isn't smalltalkhub read-only these days?

> If you're interested in looking at Scorch I'm very interested in collaborating.

Since we turned on full closures and Sista in Squeak, I could no longer use trunk images on top of TruffleSqueak. I finally decided to bite the bullet and worked on support for both in the last two weeks. It's not that I don't want to collaborate and help evolve Scorch (it's a fascinating project!). It's just that I don't have the time to work on any significant contributions in that direction. Nonetheless, I'd love to see someone working on it again.

Fabio
Hi Fabio,

> On Dec 20, 2020, at 1:27 PM, Fabio Niephaus <[hidden email]> wrote:
>
>> http://smalltalkhub.com/mc/ClementBera/Scorch/main
>
> Oh, isn't smalltalkhub read-only these days?

If it is, then things need to move once time is found to work on the code again.

> Since we turned on full closures and Sista in Squeak, I could no
> longer use trunk images on top of TruffleSqueak. I finally decided to
> bite the bullet and worked on support for both in the last two weeks.
> It's not that I don't want to collaborate and help evolve Scorch (it's
> a fascinating project!). It's just that I don't have the time to work
> on any significant contributions in that direction. Nonetheless, I'd
> love to see someone working on it again.

Ah ok, makes sense. Now that you’ve implemented the Sista set, I’m curious how you evaluate the design.

Eliot
_,,,^..^,,,_ (phone)
On Tue, 22 Dec 2020 at 9:35 am, Eliot Miranda <[hidden email]> wrote:
> Hi Fabio,

I've actually worked on another implementation for SqueakJS in the meantime (see [1]) ... here are some quick thoughts:

- I like that bytecodes are ordered by the number of bytes they consume (1, 2, 3). This makes implementations a bit easier to read (e.g. looking up the number of bytes per bytecode is very simple).
- I noticed that some parts are still unfinished and I couldn't always guess what the plan was. Bytecodes 0x52 and 0x5E, for example, take the extension bytes into consideration. Right now, however, they only do something if the corresponding extension is 0. That's a bit odd, but I'm guessing the plan was to add more controls... not sure which, though. Extension bytes in general are a nice concept.
- FullBlockClosures are very much in line with how language implementation frameworks (Truffle, RPython) imagine closures. It was therefore quite easy to support them in TruffleSqueak. However, as I found out in this thread, the implementation is also unfinished. Right now, the outer context is never ignored, so the overhead is comparable to old closures. I thought the Squeak compiler would ignore the outer context if non-local returns are not present. Maybe that's something we should try out next?
- In terms of performance, Sista makes almost no difference in TruffleSqueak. For tinyBenchmarks, the compiler output is even identical. Therefore, it seems the Graal compiler can optimise Sista just as well as the old bytecode set, which is an interesting result. I expect to observe the same when running with Scorch. The optimisations done by Scorch are probably quite similar to what Graal does, only on a much higher level, so Graal would simply have less to do. The main difference, though, is that Scorch optimisations are persisted within the image, which could greatly improve warmup.

Fabio
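
Fabio's first point, that ordering bytecodes by size makes the length lookup trivial, can be illustrated with a short sketch. The ranges used below (0-223 one byte, 224-247 two bytes, 248-255 three bytes) are my recollection of the SistaV1 layout and are an assumption, not something stated in this thread. The function names `bytecode_length` and `instruction_starts` are hypothetical.

```python
from typing import List


def bytecode_length(opcode: int) -> int:
    """Return the instruction size in bytes for a SistaV1-style opcode.

    Assumed ranges: 0-223 -> 1 byte, 224-247 -> 2 bytes, 248-255 -> 3 bytes.
    """
    if not 0 <= opcode <= 255:
        raise ValueError("opcode must fit in one byte")
    if opcode < 224:
        return 1
    if opcode < 248:
        return 2
    return 3


def instruction_starts(code: bytes) -> List[int]:
    """Walk a byte sequence and return the offset of every instruction."""
    starts = []
    pc = 0
    while pc < len(code):
        starts.append(pc)
        pc += bytecode_length(code[pc])
    return starts
```

Because the size depends only on two range comparisons, no per-opcode table is needed. An interpreter or decompiler can step over unknown or not yet implemented bytecodes without decoding their operands.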
Hi Fabio:

> On 2 Jan 2021, at 12:49, Fabio Niephaus <[hidden email]> wrote:
>
> - In terms of performance, Sista makes almost no difference in TruffleSqueak.

I would expect a benefit in terms of cold-code/interpreter performance. Do you see any benefits here?

> which could greatly improve warmup.

Less run-time work for optimizers is one possible benefit, but I would expect the interpreter itself to also show some performance difference. For instance, I’d assume there’s less need for run-time specialization in the Truffle interpreter, and that the Truffle interpreter code is less polymorphic. Is that the case, and is it measurable?

Best regards
Stefan

--
Stefan Marr
School of Computing, University of Kent
https://stefan-marr.de/research/
Hi Stefan,

> On Jan 2, 2021, at 6:18 AM, Stefan Marr via Squeak-dev <[hidden email]> wrote:
>
>> On 2 Jan 2021, at 12:49, Fabio Niephaus <[hidden email]> wrote:
>> - In terms of performance, Sista makes almost no difference in TruffleSqueak.
>
> I would expect a benefit in terms of cold-code/interpreter performance.
> Do you see any benefits here?

There are two things here, one going on, the other not going on. Sista, the adaptive optimizer architecture, includes Scorch, the image-level speculative compiler. Scorch is not (yet) in use. Clément is at Google and I don’t have the cycles to work on it. But Sista and Scorch require the SistaV1 bytecode set, which is a set capable of expressing normal Smalltalk and optimizations, and a set that lifts limits on maximum method size, supports separate method objects for closures, etc.

In Squeak we’ve moved to using the SistaV1 bytecode set, but we are of the order of six months to a year of engineering effort away from putting Sista the architecture into production. Doing so depends on a more productive development architecture, using the VM simulator and mirrors to allow Scorch to live alongside the simulator and optimize the image being simulated, not the current image, hence insulating the current image from bugs in Scorch.

This could be a really interesting PhD and I’d love to collaborate. But if the community simply waits for me to do the work, it’ll be years before it’s complete. I have a very full plate. Of course nothing will happen from Potsdam; they get their funding from Oracle, from Mario as I understand it, and right now that prevents competent people such as Fabio from working on Scorch itself and has them merely spectating, waiting to take advantage of something no one is putting resource into. It’s a frustrating and tragic situation. [I say tragic because my understanding of Shakespeare’s tragedies is that they are all driven by character.]

Some creativity on behalf of us all seems necessary if this work isn’t going to go to waste. I’m extremely frustrated with this situation. If people can’t find a way of collaborating with me and only find ways of finding what are effectively disruptive efforts, then the project will eventually wither and die. I’m working for 3dIcc.com and we’re teetering on the brink. If we can get customers we can survive and I can become self-funding and may, in the remaining eight years to my seventieth birthday, have time to complete Sista; I certainly want to. But if the community could somehow find one or two people of Clément’s abilities to join me directly in the work, rather than simply spectating a body in a coma, then there is a good chance that Sista could be in production within a year or two.

Yes, I’m angry and frustrated. You would be too.

HNY!!

Eliot
_,,,^..^,,,_ (phone)