Hello,
I've added initial support for multiple worlds in GNU Smalltalk. Each world has its own memory heap. I can bootstrap multiple kernels using the GNU Smalltalk bootstrapping process; each kernel has its own memory heap, and the worlds share nothing with each other. I can also load multiple images into other memory areas.

What has changed in the VM? It is mostly a big refactoring of the _gst_mem structure and of the globals used by the compiler, symbols, OOP registrations, and context management. I've added two primitives in ObjectMemory, mostly for testing purposes, for loading images and generating new worlds (yes, it's not good).

It loads an image, prints hello world, and crashes; the same happens for the generated kernel.

I think that's the right way, but I want your opinion on the changes. I plan to fix the multiple worlds support by:

- fixing the world swapping crash :)
- seeing the impact on the VM (semaphores/events/...)
- introducing a model for the worlds
- the scheduling could be time-shared between the worlds (Paolo, what do you think?)
- multi-core support

I think that could be the next step, maybe an M:N model: M images for N VM threads.

The code is there: [hidden email]:MrGwen/GNU-Smalltalk.git
This is the process branch.

Cheers,
Gwen
_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk
On 06/09/2011 11:05 AM, Gwenael Casaccio wrote:
> I think that's the right way but I want your opinion on the changes.
> I plan to fix the multiple worlds support by :
> - fix the world swapping crash :)
> - see the impact on the vm (semaphore/events/...)
> - introducing a model for the worlds
> - the scheduling could be time shared for the worlds (Paolo what do
> you think)?

Why not parallel? M worlds == M threads, with inter-world synchronization primitives. Windows events look like a feasible model for the synchronization primitives, since you can wait for either any or all of N events to become signaled. Erlang-like channels would work too.

How do you plan to move objects across worlds?

Paolo
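The "wait for either any or all of N events" primitive mentioned above can be sketched with POSIX threads. This is only an illustration, not code from the GNU Smalltalk VM: the names `world_event_set`, `world_events_init`, `world_event_signal` and `world_events_wait` are hypothetical, and the events are assumed to stay signaled once set (similar to manual-reset events).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event set: up to WORLD_MAX_EVENTS events that stay
   signaled once set, all guarded by a single mutex. */
#define WORLD_MAX_EVENTS 8

typedef struct world_event_set {
  pthread_mutex_t lock;
  pthread_cond_t changed;
  bool signaled[WORLD_MAX_EVENTS];
  size_t count;
} world_event_set;

void world_events_init(world_event_set *set, size_t count)
{
  assert(count >= 1 && count <= WORLD_MAX_EVENTS);
  pthread_mutex_init(&set->lock, NULL);
  pthread_cond_init(&set->changed, NULL);
  set->count = count;
  for (size_t i = 0; i < count; i++)
    set->signaled[i] = false;
}

/* Mark one event as signaled and wake every waiter so that each can
   re-check its own condition. */
void world_event_signal(world_event_set *set, size_t index)
{
  assert(index < set->count);
  pthread_mutex_lock(&set->lock);
  set->signaled[index] = true;
  pthread_cond_broadcast(&set->changed);
  pthread_mutex_unlock(&set->lock);
}

/* Count the signaled events; the caller must hold set->lock. */
static size_t count_signaled(const world_event_set *set)
{
  size_t n = 0;
  for (size_t i = 0; i < set->count; i++)
    if (set->signaled[i])
      n++;
  return n;
}

/* Block until one event (wait_all == false) or every event
   (wait_all == true) is signaled.  Returns the lowest signaled index. */
size_t world_events_wait(world_event_set *set, bool wait_all)
{
  size_t first = 0;
  pthread_mutex_lock(&set->lock);
  size_t needed = wait_all ? set->count : 1;
  while (count_signaled(set) < needed)
    pthread_cond_wait(&set->changed, &set->lock);
  while (!set->signaled[first])
    first++;
  pthread_mutex_unlock(&set->lock);
  return first;
}
```

In an M worlds == M threads design, each world's thread would call `world_events_wait` while other worlds call `world_event_signal`. A single condition variable with broadcast keeps the sketch short, at the cost of waking waiters that may not be able to proceed yet.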
On 06/09/2011 11:22 AM, Paolo Bonzini wrote:
> Why not parallel? M worlds == M threads, with inter-world
> synchronization primitives. Windows events look like a feasible model to
> use for synchronization primitives where you can wait for either any or
> all of N events to become signaled. Or Erlang-like channels too.
>
> How do you plan to move objects across worlds?
>
> Paolo

*** BREAKING NEWS multi core image is working BREAKING NEWS ***

Gwen
More seriously, I've cleaned up the code and now use thread-local storage instead of a big struct. The multiple image support is working, and bootstrapping too ;-)

Gwen

On Fri, Jun 10, 2011 at 4:03 PM, Gwenael Casaccio <[hidden email]> wrote:
> *** BREAKING NEWS multi core image is working BREAKING NEWS ***
>
> Gwen
Hi,
I've run a test with example/Bench.st. Instead of printing the result (slower), I want to give you the number of times __tls_get_addr is called:

293 987 770 (in _gst_interpret)
53% of the execution time is spent in __tls_get_addr

It's time to optimize it a bit, no?

I was thinking of putting some variables in _gst_interpret, at least _gst_mem

Gwen

On 06/14/2011 01:57 AM, Gwenaël Casaccio wrote:
> More seriously I've clean up the code and use thread local storage
> instead a big struct.
> The multiple image support is working and bootstraping too ;-)
>
> Gwen
and _gst_ip
PREFETCH (120 000 000 calls of __tls_get_addr)

On 06/15/2011 06:44 PM, Gwenael Casaccio wrote:
> Hi,
>
> I've made test with example/Bench.st. Instead of printing the result
> (slower) I want to give you the number of times __tls_get_addr is called:
>
> 293 987 770 (in _gst_interpret)
> 53% of time of execution is spent in __tls_get_addr
>
> It's time to optimize it a bit no ?
>
> I was thinking of putting in _gst_interpret some variables at least
> _gst_mem
>
> Gwen
On Wed, Jun 15, 2011 at 21:19, Gwenael Casaccio <[hidden email]> wrote:
>> 293 987 770 (in _gst_interpret)
>> 53% of time of execution is spent in __tls_get_addr
>>
>> It's time to optimize it a bit no ?
>>
>> I was thinking of putting in _gst_interpret some variables at least
>> _gst_mem

You can put all the variables in a single struct, and save the address (&x) of that struct as a local variable in _gst_interpret. It's also interesting to try -ftls-model=local-exec. Pick the faster of the two. :-)

Finally, the penalty for static linking ought to be very low. That's the limit you should aim for.

Paolo
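The single-struct suggestion above can be illustrated with a toy interpreter loop. This is a minimal sketch under stated assumptions, not the actual _gst_interpret: `world_state`, `current_world`, `interpret` and the three bytecodes are hypothetical, and `__thread` is the GCC/Clang thread-local storage extension.

```c
#include <assert.h>
#include <stddef.h>

#define WORLD_STACK_DEPTH 64

/* Hypothetical per-world interpreter state; the real VM has many more
   fields than this toy version. */
typedef struct world_state {
  const unsigned char *ip;        /* instruction pointer */
  long *sp;                       /* next free stack slot */
  long stack[WORLD_STACK_DEPTH];
} world_state;

/* One thread-local struct instead of many separate thread-local
   variables (GCC/Clang __thread extension). */
static __thread world_state current_world;

enum { OP_PUSH, OP_ADD, OP_RETURN };

long interpret(const unsigned char *bytecodes)
{
  /* Take the address of the thread-local struct once.  Every later
     access goes through this local pointer, so the loop body no longer
     needs a TLS address lookup per variable access. */
  world_state *w = &current_world;

  w->ip = bytecodes;
  w->sp = w->stack;
  for (;;) {
    switch (*w->ip++) {
    case OP_PUSH:                 /* push the next byte as an integer */
      assert(w->sp < w->stack + WORLD_STACK_DEPTH);
      *w->sp++ = *w->ip++;
      break;
    case OP_ADD: {                /* pop two values, push their sum */
      long b = *--w->sp;
      long a = *--w->sp;
      *w->sp++ = a + b;
      break;
    }
    case OP_RETURN:               /* pop and return the top of stack */
      return *--w->sp;
    default:
      assert(!"unknown bytecode");
      return 0;
    }
  }
}
```

For example, the program `{ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_RETURN }` evaluates to 5. Whether this beats a different TLS model would have to be measured, as suggested above.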
On 06/19/2011 03:31 PM, Paolo Bonzini wrote:
> You can put all variables in a single struct, and save the address
> (&x) of that struct as a local variable in _gst_interpret. It's also
> interesting to try -ftls-model=local-exec. Pick the fastest of the
> two. :-)
>
> Finally, the penalty for static linking ought to be very low. That's
> the limit to which you should aim.
>
> Paolo

Hehe, this is what I've done :)

For -ftls-model=local-exec, I tried it but got a link error (-fPIC needed); I added -fPIC but it failed too.

I've made another pass on the implementation with helgrind (a Valgrind tool) and removed a **lot** of threading issues. I can load 20 images without any crashes :D

There is still an issue with the bootstrapping, but only when the memory is released.

I've implemented a _gst_release_mem function:

- suspend the process
- call the GST_ABOUT_TO_QUIT hook
- scan the OT, make the OOPs non-weak, and free the memory
- free the spaces and the heap

Gwen
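The four release steps listed above can be sketched in order as follows. This is a toy model and not the actual _gst_release_mem: the `world` and `oop_entry` types, their fields, and the `world_init` and `world_release_mem` functions are hypothetical, and the object table, spaces and heap are reduced to plain malloc'ed blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object-table entry; the real OOP layout differs. */
typedef struct oop_entry {
  void *object;                   /* malloc'ed object body, or NULL */
  bool weak;
} oop_entry;

/* Hypothetical per-world state, reduced to what the steps need. */
typedef struct world {
  bool running;
  void (*about_to_quit)(struct world *);  /* stand-in for the quit hook */
  oop_entry *ot;                  /* object table */
  size_t ot_size;
  void *heap;
} world;

/* Set up a toy world with an empty object table and one heap block. */
void world_init(world *w, size_t ot_size, size_t heap_bytes)
{
  assert(ot_size > 0 && heap_bytes > 0);
  w->running = true;
  w->about_to_quit = NULL;
  w->ot = malloc(ot_size * sizeof *w->ot);
  w->heap = malloc(heap_bytes);
  assert(w->ot != NULL && w->heap != NULL);
  w->ot_size = ot_size;
  for (size_t i = 0; i < ot_size; i++) {
    w->ot[i].object = NULL;
    w->ot[i].weak = false;
  }
}

void world_release_mem(world *w)
{
  /* 1. Suspend the process (modeled here as a flag). */
  w->running = false;

  /* 2. Call the about-to-quit hook, if one is installed. */
  if (w->about_to_quit != NULL)
    w->about_to_quit(w);

  /* 3. Scan the object table: clear the weak flag, free each object. */
  for (size_t i = 0; i < w->ot_size; i++) {
    w->ot[i].weak = false;
    free(w->ot[i].object);
    w->ot[i].object = NULL;
  }

  /* 4. Free the table and the heap themselves. */
  free(w->ot);
  w->ot = NULL;
  w->ot_size = 0;
  free(w->heap);
  w->heap = NULL;
}
```

Clearing the weak flag before freeing mirrors the order given in the list; in this toy model it has no further effect.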
On 06/20/2011 06:11 PM, Gwenael Casaccio wrote:
> There is still an issue with the bootstrapping but when the memory is
> released.

Correction ** IT WORKS ** :)