Igor's fast become for CompiledMethods in Cog

Igor's fast become for CompiledMethods in Cog

Mariano Martinez Peck
 
Hi Eliot. Me again :)   I was checking the changes Igor made some time ago for the fast become, where he basically swaps the byte contents of the two objects when they have the same size and the same header type. He put that code in separate primitives, plus some image-side changes to call them. I have just played with them and they seem to work. I have 2 questions for you:

1) Do you think that this new fast become can have problems when becoming CompiledMethods? I am asking because of the JIT/Pic. Maybe I need a flushCache or something?
So far, I tried for example the following:
    | methods |
    methods := IdentitySet new.
    "Collect every method of the packages whose names match one of the listed strings."
    (PackageInfo allPackages select: [:each |
        (#( 'AST' 'Autotest' 'AutomaticMethodCategorizer' "'Bogus' 'CodeStats'  'Gofer' 'Metacello' 'FreeType' 'HelpSystem' 'ProfStef' 'ScriptManager' 'Zinc' 'Sound' 'Tests' 'ConfigurationOf' 'ImageForDevelopers' 'LED' 'MemoryMonitor' 'SUnit' 'TrueType' 'Monticello'  'Network' 'Refactoring' 'Regex' 'ToolBuilder'" )
            anySatisfy: [:aString | each packageName includesSubString: aString])])
        do: [:aPackage |
            aPackage classes do: [:each | methods addAll: each methods]].
    "Swap each method with a copy of itself."
    methods do: [:each | each become: each copy]

and I ran the tests of those packages before and after... no crash. Is there something better I could test?

2) If I prepare a nice script with some small modifications to Igor's proposal  would you take a look and integrate it if it is ok?  Just to know whether I should spend time on that or not.

Thanks in advance,


--
Mariano
http://marianopeck.wordpress.com

Re: Igor's fast become for CompiledMethods in Cog

Eliot Miranda-2
 


On Tue, Jan 31, 2012 at 11:22 AM, Mariano Martinez Peck <[hidden email]> wrote:
 
> 1) Do you think that this new fast become can have problems when becoming CompiledMethods? I am asking because of the JIT/Pic. Maybe I need a flushCache or something?

Yes, almost certainly.  You'd want to do a flushCache on both methods.
 
> So far, I tried for example the following: [...] and I run the tests of that package before and after... no crash. Is there something better I could test?

You need to test methods that are in use.  There's an "xray" primitive for peering below the line, e.g. to find out whether a method exists as machine code.  Alas, I only implemented the xray primitive for contexts (see MethodContext>>xray in Cog-Tests).  There should be one for methods as well.

 

> 2) If I prepare a nice script with some small modifications to Igor's proposal  would you take a look and integrate it if it is ok?  Just to know whether I should spend time on that or not.

yes.
 

--
best,
Eliot

Re: Igor's fast become for CompiledMethods in Cog

Mariano Martinez Peck
 


On Tue, Jan 31, 2012 at 8:50 PM, Eliot Miranda <[hidden email]> wrote:
 


> You need to test methods that are in use.

But if I run their tests several times before the become, shouldn't they have been jitted and in the cache?
 
> There's an "xray" primitive for peering below the line, e.g. to find out whether a method exists as machine code.  Alas, I only implemented the xray primitive for contexts (see MethodContext>>xray in Cog-Tests).  There should be one for methods as well.


Ok, I will take a look after dinner :)
 
 

> > 2) If I prepare a nice script with some small modifications to Igor's proposal  would you take a look and integrate it if it is ok?  Just to know whether I should spend time on that or not.
>
> yes.
 


Excellent. 

--
Mariano
http://marianopeck.wordpress.com

Re: Igor's fast become for CompiledMethods in Cog

Eliot Miranda-2
 


On Tue, Jan 31, 2012 at 12:03 PM, Mariano Martinez Peck <[hidden email]> wrote:
 


> But if I run their tests several times before the become, shouldn't they have been jitted and in the cache?

Perhaps, perhaps not.  For a test one needs to know.  So it is best to determine this in the test rather than assume it.
 
 
--
best,
Eliot

Re: Igor's fast become for CompiledMethods in Cog

Igor Stasenko
In reply to this post by Eliot Miranda-2

On 31 January 2012 20:50, Eliot Miranda <[hidden email]> wrote:

>> 1) Do you think that this new fast become can have problems when becoming CompiledMethods? I am asking because of the JIT/Pic. Maybe I need a flushCache or something?
>
>
> Yes, almost certainly.  You'd want to do a flushCache on both methods.
>
Are there other object types which we need to be careful with?
I ask because I was thinking of just putting a check in the fast-become prim and
simply failing the prim if the object type(s) to be swapped are not
supported, so the user is forced to fall back to the slow, good old #become:

--
Best regards,
Igor Stasenko.

Re: Igor's fast become for CompiledMethods in Cog

Igor Stasenko

On 31 January 2012 21:19, Igor Stasenko <[hidden email]> wrote:

> are there other object types which we need to be careful with?
> because i was thinking to just put a check in fast-become prim and
> simply fail the prim if object type(s) to be swapped are not
> supported, so user will be forced to use slow good-old #become:
>
Otherwise, if it's only compiled methods, we can actually put the cache flush in
the primitive itself, so no matter who calls it or how, there will be no
chance of inconsistent behavior.

--
Best regards,
Igor Stasenko.

Re: Igor's fast become for CompiledMethods in Cog

Eliot Miranda-2
In reply to this post by Igor Stasenko
 


On Tue, Jan 31, 2012 at 12:19 PM, Igor Stasenko <[hidden email]> wrote:

> are there other object types which we need to be careful with?

There are a few, e.g.:
* the Array literals in named primitives (because they hold target function pointers);
* CompiledMethods (because they may have associated machine code);
* Contexts (because they may have associated stack frames).
 
> because i was thinking to just put a check in fast-become prim and
> simply fail the prim if object type(s) to be swapped are not
> supported, so user will be forced to use slow good-old #become:

I agree.  But you can do even better, by checking whether the compiled method has a machine-code version, and/or whether a context is "single" (has no associated stack state).  The primitive doesn't need to fail if there isn't any special state.  Identifying the named primitive linking literals is more difficult...


--
best,
Eliot

Re: Igor's fast become for CompiledMethods in Cog

Mariano Martinez Peck
 


On Tue, Jan 31, 2012 at 9:28 PM, Eliot Miranda <[hidden email]> wrote:
 


> There are a few.  e.g. the Array literals in named primitives (because they hold target function pointers).  CompiledMethods (because they may have associated machine code).  Contexts (because they may have associated stack frames).
 

Eliot, I don't understand why we have these problems with the "fast become" but not with the normal one. What happens with each of your examples under the normal become? How are they solved?
Sorry for the noob question.

 
> I agree.  But you can do even better, by checking whether the compiled method has a machine-code version, and/or whether a context is "single" (has no associated stack state).  The primitive doesn't need to fail if there isn't any special state.  Identifying the named primitive linking literals is more difficult...


Ideally, I would love to be able to do the fast become for all of them, even if that implies doing something extra for the special cases (like flushing the method cache).

 

--
Mariano
http://marianopeck.wordpress.com

Re: Igor's fast become for CompiledMethods in Cog

Eliot Miranda-2
 


On Tue, Jan 31, 2012 at 12:41 PM, Mariano Martinez Peck <[hidden email]> wrote:
 


> Eliot, I don't understand why we have these problems with the "fast become" but not with the normal one. What happens with each of your examples with the normal become? how are they solved?

The "slow" become is implemented in terms of the GC's pointer-forwarding mechanism, which is used in normal garbage collection, not just in become.  This is the ObjectMemory>>remap: machinery.  The JIT implements the same mapping machinery for objects embedded in machine code.  These include not just literals but also the classes in inline caches.  So it would seem that implementing markObject: and remap: for the references in jitted methods is all one needs to support GC and become:.  In fact, life is more complex, because there is an optimization in the JIT to avoid scanning all of machine code on an incremental GC.  The JIT maintains a list of those methods that contain references to young objects and scans only this list on an incremental GC, and this list must be maintained correctly.  Hence there are three different remap routines in the JIT:

Cogit>>mapObjectReferencesInMachineCodeForIncrementalGC
    "Update all references to objects in machine code for an incremental gc.
     Avoid scanning all code by using the youngReferrers list.  In an incremental
     GC a method referring to young may no longer refer to young, but a method
     not referring to young cannot and will not refer to young afterwards."

Cogit>>mapObjectReferencesInMachineCodeForFullGC
    "Update all references to objects in machine code for a full gc.  Since
     the current (New)ObjectMemory GC makes everything old in a full GC
     a method not referring to young will not refer to young afterwards"

Cogit>>mapObjectReferencesInMachineCodeForBecome
    "Update all references to objects in machine code for a become.
     Unlike incrementalGC or fullGC a method that does not refer to young
     may refer to young as a result of the become operation."
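
To make the last comment concrete, here is a toy workspace model, not VM code. It assumes symbols can stand in for objects and that a jitted method can be reduced to an Array of the objects its machine code refers to; all names in it (m1, m2, y1, old1, old2) are made up for the illustration.

```smalltalk
"Toy model only: symbols stand in for objects, and each 'jitted method'
 is just an Array of the objects its machine code refers to."
| young refs refersToYoung youngReferrers swap missed |
young := Set withAll: #(#y1).
refs := Dictionary new.
refs at: #m1 put: #(#old1 #y1).
refs at: #m2 put: #(#old2).
refersToYoung := [:m | (refs at: m) anySatisfy: [:o | young includes: o]].
"The list described above: before the become, only m1 refers to young."
youngReferrers := refs keys select: refersToYoung.
"Model a two-way become of #old2 and #y1 by swapping them in every reference."
swap := [:o |
	o == #old2
		ifTrue: [#y1]
		ifFalse: [o == #y1 ifTrue: [#old2] ifFalse: [o]]].
refs keys do: [:m | refs at: m put: ((refs at: m) collect: swap)].
"m2 now refers to young although it was never on the list, so scanning
 only the old youngReferrers would miss it."
missed := (refs keys select: refersToYoung)
	reject: [:m | youngReferrers includes: m].
missed
```

In this model `missed` ends up holding only #m2, which is the case the become-specific routine has to cover: a scan restricted to the previous youngReferrers list would never visit it.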
 
> Sorry for the noob question.

It's a good question :)
 

 
> Ideally, I would love to be able to do the fast become for all of them, even if that implies doing something extra for the special cases (like flushing the method cache).

As they say, don't get caught.
 

 

--
best,
Eliot

Re: Igor's fast become for CompiledMethods in Cog

Mariano Martinez Peck
 


On Tue, Jan 31, 2012 at 11:42 PM, Eliot Miranda <[hidden email]> wrote:
 


> Cogit>>mapObjectReferencesInMachineCodeForBecome
> "Update all references to objects in machine code for a become.
> Unlike incrementalGC or fullGC a method that does not refer to young
> may refer to young as a result of the become operation."



Aha. Ok. Now I see. So, let me check whether I understand: the CompiledMethod problem gets fixed if we flush its cache, right?
Now... if we always send #mapObjectReferencesInMachineCodeForBecome after the "fast become", we will be updating all literals in the machine-code methods.
I didn't understand this one: "e.g. the Array literals in named primitives (because they hold target function pointers)".
What happens with "Contexts (because they may have associated stack frames)"? Would we need to flush them somehow, or update the stack frames?

Thanks!
 
 
--
Mariano
http://marianopeck.wordpress.com

Re: Igor's fast become for CompiledMethods in Cog

Eliot Miranda-2
 


On Tue, Jan 31, 2012 at 3:13 PM, Mariano Martinez Peck <[hidden email]> wrote:
 


> Aha. Ok. Now I see. So, let me see if I understand. So the problem of CompiledMethod gets fixed if we flush its cache. Right?

I hope so.  The current implementation reads

CoInterpreterPrimitives>>primitiveFlushCacheByMethod
    "The receiver is a compiledMethod.  Clear all entries in the method lookup cache that
     refer to this method, presumably because it has been redefined, overridden or removed.
     Override to flush appropriate machine code caches also."
    super primitiveFlushCacheByMethod.
    cogit unlinkSendsTo: self stackTop

That may not be enough.  The VM may have to throw the method away.  Tests will show whether merely unlinking is sufficient.  The issue is that if the method is in use, throwing it away may involve flushing the stack activations of the method, which makes the implementation much more complex.
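
A minimal workspace sketch of the "become, then flush both methods" order of operations, under several assumptions: plain #become: stands in for the fast-become primitive (its selector is not given in this thread), the scratch class name and selector are made up, and the image is a Squeak/Pharo image of this era that still understands the classic subclass:instanceVariableNames:... message. It does not show whether unlinking is sufficient for a method with live activations.

```smalltalk
"Scratch class and selector names are made up for this example."
| scratch installed twin |
scratch := Object subclass: #FastBecomeScratch
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Scratch'.
scratch compile: 'answer ^ 42'.
"Run it a few times so the method has at least been looked up and cached."
10 timesRepeat: [scratch new answer].
installed := scratch >> #answer.
twin := installed copy.
"Stand-in for the fast-become primitive."
installed become: twin.
"Flush both methods, per the advice earlier in the thread."
installed flushCache.
twin flushCache.
scratch new answer
```

The last expression should still answer 42 if nothing went stale. `scratch removeFromSystem` cleans up afterwards.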

 
> Now...if we always send #mapObjectReferencesInMachineCodeForBecome   after the "fast become" we will be updating all literals from machine code methods.

Um, will it? The mapping is applied only to references to objects that are forwarded.  If that's going to do the trick then great.  But I don't know enough about your fast become to know.


> I didn't understand this one "  e.g. the Array literals in named primitives (because they hold target function pointers)"

Look at a method containing a named primitive that's in use and look at its first literal.  e.g.

(StandardFileStream >> #primRead:into:startingAt:count:) literalAt: 1
    #(#FilePlugin #primitiveFileRead 0 12)

That 12 is meaningful to the VM.  See primitiveExternalCall:

External primitive methods first literals are an array of
* The module name (String | Symbol) 
* The function name (String | Symbol) 
* The session ID (SmallInteger) [OBSOLETE] 
* The function index (Integer) in the externalPrimitiveTable
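
A small workspace sketch that prints the four fields listed above for the same method. It assumes an image that still contains StandardFileStream and that the method has been called at least once in the current session; the last value depends on the running VM, so it need not be 12.

```smalltalk
| literal |
literal := (StandardFileStream >> #primRead:into:startingAt:count:) literalAt: 1.
"Print each element of the first literal next to the role described above."
Transcript
	show: 'module:         ', (literal at: 1) printString; cr;
	show: 'function:       ', (literal at: 2) printString; cr;
	show: 'session ID:     ', (literal at: 3) printString; cr;
	show: 'function index: ', (literal at: 4) printString; cr
```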
 
> What happens with "Contexts (because they may have associated stack frames)."  ?  should we need to flush somehow or update stack frames?

Yes.  If you smash the state of a context that has an associated stack frame then the VM will likely crash.  See senders of externalDivorceFrame:andContext: to see where the VM disassociates contexts and their stack frames when an access to a context (e.g. changing its stack pointer or pc) necessitates it.
 

Thanks!
 
 
Sorry for the noob question.

It's a good question :)
 

 
because i was thinking to just put a check in fast-become prim and
simply fail the prim if object type(s) to be swapped are not
supported, so user will be forced to use slow good-old #become:

I agree.  But you can do even better, by checking that the compiled method has a machine-code version, and/or checking that a context is "single" (has no associated stack state).  It doesn't need to fail if there isn't any special state.  Identifying the named primitive linking literals is more difficult...


Ideally, I would love to be able to do the fast become for all of them, even if that implies doing something extra for special cass (like flushing method cache).

As they say, don't get caught.
 

 

--
best,
Eliot


Re: Igor's fast become for CompiledMethods in Cog

Mariano Martinez Peck
 


On Wed, Feb 1, 2012 at 12:28 AM, Eliot Miranda <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 3:13 PM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 11:42 PM, Eliot Miranda <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 12:41 PM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 9:28 PM, Eliot Miranda <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 12:19 PM, Igor Stasenko <[hidden email]> wrote:

On 31 January 2012 20:50, Eliot Miranda <[hidden email]> wrote:
>
>
>
> On Tue, Jan 31, 2012 at 11:22 AM, Mariano Martinez Peck <[hidden email]> wrote:
>>
>>
>> Hi Eliot. Me again :)   I was checking the changes Igor did some time ago for the fast become where he basically swapped the byte contents between the objects when they were the same size and same header type. He put such code in separate primitives and made some changes on the image side to call them. I have just played with them and they seem to work. I have 2 questions for you:
>>
>> 1) Do you think that this new fast become can have problems when becoming CompiledMethods? I am asking because of the JIT/Pic. Maybe I need a flushCache or something?
>
>
> Yes, almost certainly.  You'd want to do a flushCache on both methods.
>
are there other object types which we need to be careful with?

There are a few.  e.g. the Array literals in named primitives (because they hold target function pointers).  CompiledMethods (because they may have associated machine code).  Contexts (because they may have associated stack frames).
 

Eliot, I don't understand why we have these problems with the "fast become" but not with the normal one. What happens with each of your examples with the normal become? How are they solved?

The "slow" become is implemented in terms of the GC's pointer-forwarding mechanism, which is used in normal garbage collection, not just become.  This machinery is the ObjectMemory>remap: machinery.  The JIT implements the same mapping machinery for literal objects embedded in machine code.  These include not just literals but also classes in inline-caches.  So it would seem that implementing markObject: and remap: for literals in jitted methods is all one needs to support GC and become:.  In fact, life is more complex because there is an optimization in the JIT to avoid scanning all of machine code on incremental GC.  The jit maintains a list of those methods that contain references to young objects and only scans this list on an incremental GC, and this list must be maintained correctly.  Hence there are three different remap routines in the jit, 

Cogit>mapObjectReferencesInMachineCodeForIncrementalGC
"Update all references to objects in machine code for an incremental gc.
Avoid scanning all code by using the youngReferrers list.  In an incremental
GC a method referring to young may no longer refer to young, but a method
not referring to young cannot and will not refer to young afterwards."

Cogit>mapObjectReferencesInMachineCodeForFullGC
"Update all references to objects in machine code for a full gc.  Since
the current (New)ObjectMemory GC makes everything old in a full GC
a method not referring to young will not refer to young afterwards"


Cogit>mapObjectReferencesInMachineCodeForBecome
"Update all references to objects in machine code for a become.
Unlike incrementalGC or fullGC a method that does not refer to young
may refer to young as a result of the become operation."



Aha. Ok. Now I see. So, let me see if I understand. So the problem of CompiledMethod gets fixed if we flush its cache. Right?

I hope so.  The current implementation reads

CoInterpreterPrimitives>primitiveFlushCacheByMethod
"The receiver is a compiledMethod.  Clear all entries in the method lookup cache that
refer to this method, presumably because it has been redefined, overridden or removed.
Override to flush appropriate machine code caches also."
super primitiveFlushCacheByMethod.
cogit unlinkSendsTo: self stackTop

That may not be enough.  The VM may have to throw the method away.  Tests will show whether merely unlinking is sufficient.  


Eliot, I have to admit that the discussion is going beyond my understanding :(

 
The issue is that if the method is being used then throwing it away may involve flushing the stack activations of the method, and that makes the implementation much more complex.


Ok, I understand that, but again I don't understand why
a) this does not happen with the "slow become". Is it again because of the forwarding table?
b) it doesn't happen when we flush methods. For example, when using TestCoverage, where we put objects as methods and use #run:with:in:, we flush the method cache... why doesn't that need to flush stack activations?  Just because in that case we are sure that the method is not being used?

 
 
Now...if we always send #mapObjectReferencesInMachineCodeForBecome   after the "fast become" we will be updating all literals from machine code methods.

Um, will it? The mapping is done only to references to objects that are forwarded.  If that's going to do the trick then great.  But I don't know enough about your fast become to know.

I think I was wrong. Obviously, #mapObjectReferencesInMachineCodeForBecome  is based on the forwarding table. Igor's solution just swaps contents... it does nothing regarding the forwarding table.

Here I attach a slightly modified version of Igor's code so that you can take a look, at least to get an idea of what we are talking about. I attach the VMMaker changes and the image side. Personally, I think that the become is performed with arrays just because become is slow. If become were fast, we would do the loop on the image side, right?  Because of that, I think it is nice to have it in a separate primitive which falls back to the array-based become.

 


I didn't understand this one "  e.g. the Array literals in named primitives (because they hold target function pointers)"

Look at a method containing a named primitive that's in use and look at its first literal.  e.g.

(StandardFileStream >> #primRead:into:startingAt:count:) literalAt: 1 #(#FilePlugin #primitiveFileRead 0 12)

That 12 is meaningful to the VM.  See primitiveExternalCall:

External primitive methods first literals are an array of
* The module name (String | Symbol) 
* The function name (String | Symbol) 
* The session ID (SmallInteger) [OBSOLETE] 
* The function index (Integer) in the externalPrimitiveTable


Yes, I know that; what I don't understand is how that can be affected by the fast become. The external primitive table doesn't have a pointer to the method, but to the function address. So even if I become a named primitive, wouldn't the table still be correct?
In any case, I guess we can do a #flushExternalPrimitives or #flushExternalPrimitives  to avoid possible problems.  Would that help?

 
 
What happens with "Contexts (because they may have associated stack frames)."  ?  should we need to flush somehow or update stack frames?

Yes.  If you smash the state of a context that has an associated stack frame then the VM will likely crash.  See senders of externalDivorceFrame:andContext: to see where the VM disassociates contexts and their stack frames when an access to a context (e.g. changing its stack pointer or pc) necessitates it.
 

OK, and so that can happen if we become contexts?

Thanks Eliot.

 

Thanks!
 
 
Sorry for the noob question.

It's a good question :)
 

 
because i was thinking to just put a check in fast-become prim and
simply fail the prim if object type(s) to be swapped are not
supported, so user will be forced to use slow good-old #become:

I agree.  But you can do even better, by checking that the compiled method has a machine-code version, and/or checking that a context is "single" (has no associated stack state).  It doesn't need to fail if there isn't any special state.  Identifying the named primitive linking literals is more difficult...


Ideally, I would love to be able to do the fast become for all of them, even if that implies doing something extra for special cases (like flushing the method cache).

As they say, don't get caught.
 

 

--
Mariano
http://marianopeck.wordpress.com


FastBecome-ImageSide.1.cs (1K) Download Attachment
FastBecome-VMMaker.1.cs (7K) Download Attachment

Re: Igor's fast become for CompiledMethods in Cog

Eliot Miranda-2
 


On Wed, Feb 1, 2012 at 8:19 AM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Wed, Feb 1, 2012 at 12:28 AM, Eliot Miranda <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 3:13 PM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 11:42 PM, Eliot Miranda <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 12:41 PM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 9:28 PM, Eliot Miranda <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 12:19 PM, Igor Stasenko <[hidden email]> wrote:

On 31 January 2012 20:50, Eliot Miranda <[hidden email]> wrote:
>
>
>
> On Tue, Jan 31, 2012 at 11:22 AM, Mariano Martinez Peck <[hidden email]> wrote:
>>
>>
>> Hi Eliot. Me again :)   I was checking the changes Igor did some time ago for the fast become where he basically swapped the byte contents between the objects when they were the same size and same header type. He put such code in separate primitives and made some changes on the image side to call them. I have just played with them and they seem to work. I have 2 questions for you:
>>
>> 1) Do you think that this new fast become can have problems when becoming CompiledMethods? I am asking because of the JIT/Pic. Maybe I need a flushCache or something?
>
>
> Yes, almost certainly.  You'd want to do a flushCache on both methods.
>
are there other object types which we need to be careful with?

There are a few.  e.g. the Array literals in named primitives (because they hold target function pointers).  CompiledMethods (because they may have associated machine code).  Contexts (because they may have associated stack frames).
 

Eliot, I don't understand why we have these problems with the "fast become" but not with the normal one. What happens with each of your examples with the normal become? How are they solved?

The "slow" become is implemented in terms of the GC's pointer-forwarding mechanism, which is used in normal garbage collection, not just become.  This machinery is the ObjectMemory>remap: machinery.  The JIT implements the same mapping machinery for literal objects embedded in machine code.  These include not just literals but also classes in inline-caches.  So it would seem that implementing markObject: and remap: for literals in jitted methods is all one needs to support GC and become:.  In fact, life is more complex because there is an optimization in the JIT to avoid scanning all of machine code on incremental GC.  The jit maintains a list of those methods that contain references to young objects and only scans this list on an incremental GC, and this list must be maintained correctly.  Hence there are three different remap routines in the jit, 

Cogit>mapObjectReferencesInMachineCodeForIncrementalGC
"Update all references to objects in machine code for an incremental gc.
Avoid scanning all code by using the youngReferrers list.  In an incremental
GC a method referring to young may no longer refer to young, but a method
not referring to young cannot and will not refer to young afterwards."

Cogit>mapObjectReferencesInMachineCodeForFullGC
"Update all references to objects in machine code for a full gc.  Since
the current (New)ObjectMemory GC makes everything old in a full GC
a method not referring to young will not refer to young afterwards"


Cogit>mapObjectReferencesInMachineCodeForBecome
"Update all references to objects in machine code for a become.
Unlike incrementalGC or fullGC a method that does not refer to young
may refer to young as a result of the become operation."



Aha. Ok. Now I see. So, let me see if I understand. So the problem of CompiledMethod gets fixed if we flush its cache. Right?

I hope so.  The current implementation reads

CoInterpreterPrimitives>primitiveFlushCacheByMethod
"The receiver is a compiledMethod.  Clear all entries in the method lookup cache that
refer to this method, presumably because it has been redefined, overridden or removed.
Override to flush appropriate machine code caches also."
super primitiveFlushCacheByMethod.
cogit unlinkSendsTo: self stackTop

That may not be enough.  The VM may have to throw the method away.  Tests will show whether merely unlinking is sufficient.  


Eliot, I have to admit that the discussion is going beyond my understanding :(

So you need to write tests that verify that the VM can cope with fast becoming a jitted method, and perhaps can cope with fast becoming a jitted method with activations.  So the tests need to check that they are operating on at least one jitted method and perhaps a jitted method with an activation.  Just assuming that running code will cause the jit to compile and that activations exist isn't really good enough for tests.  The VM discards methods to make room for new ones, so IMO, to know that one is actually testing what one wants to test, the tests need to use something like an xray primitive.


 
The issue is that if the method is being used then throwing it away may involve flushing the stack activations of the method, and that makes the implementation much more complex.


Ok, I understand that, but again I don't understand why
a) this does not happen with the "slow become". Is it again because of the forwarding table?

I think that an attempt to become an active context will probably cause the VM to crash.  I'd be interested in seeing a test that becomes contexts that worked on the interpreter VM.  I doubt it'll work in current Cog.  While the remap machinery (forwarding table) does update stack frames, a become on a married context, between a single and a married context, or between married contexts will almost certainly break the two-way mapping between the married context(s) and its/their stack frame(s).


b) it doesn't happen when we flush methods. For example, when using TestCoverage, where we put objects as methods and use #run:with:in:, we flush the method cache... why doesn't that need to flush stack activations?  Just because in that case we are sure that the method is not being used?

Flushing the method cache doesn't throw away jitted methods; it merely clears the method lookup cache and unlinks all linked sends so that any subsequent look-up is redone.  But the VM assumes one doesn't change the contents of a compiled method after it is created, and hence that the machine code is valid.  The VM plays fast and loose with CompiledMethod>>#objectAt:put:, not redoing any compilation or activation flushing when one does objectAt:put: on a jitted method, because I've got away with it so far.  I think in your case I can still get away with it since the method isn't used after become: has turned it into a proxy, but you may need to use the SmalltalkImage>>voidCogVMState primitive (#214) after faulting out code to ensure that any faulted out methods are really gone.

The bottom line here is that uses of become on objects that the VM is aggressively optimising (contexts & methods) need to be done with care, and the engineering of this needs to find a compromise between generality and optimisation.  Again, good tests that stress the system are essential in our being able to understand the trade-offs and implement correctly.
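
To make the stale machine code concern concrete, here is a toy model in Python (not VM code). It assumes a hypothetical jit cache keyed by method identity; `Method`, `jit`, `swap_contents` and `discard_machine_code` are names invented for this sketch. It only illustrates that a content swap leaves the cached translation describing the old bytecodes until that translation is explicitly discarded.

```python
class Method:
    """Stand-in for a CompiledMethod: identity stays put, contents can be swapped."""
    def __init__(self, bytecodes):
        self.bytecodes = bytecodes

# Hypothetical machine-code cache: method identity -> translation of its bytecodes.
jit_cache = {}

def jit(method):
    """Return the cached translation, compiling only on first use."""
    key = id(method)
    if key not in jit_cache:
        jit_cache[key] = "compiled:" + method.bytecodes
    return jit_cache[key]

def swap_contents(a, b):
    """Model of the fast become: exchange contents, leave identities untouched."""
    a.bytecodes, b.bytecodes = b.bytecodes, a.bytecodes

def discard_machine_code(method):
    """Model of throwing the jitted version away so it is recompiled on next use."""
    jit_cache.pop(id(method), None)

m1, m2 = Method("push 1"), Method("push 2")
jit(m1)                        # m1 now has "machine code"
swap_contents(m1, m2)          # m1 holds "push 2", the cache does not know
stale = jit(m1)                # cached translation of the old bytecodes
discard_machine_code(m1)
fresh = jit(m1)                # recompiled from the swapped-in bytecodes
```

In this model, merely clearing a lookup cache would not help; only discarding the cached translation brings it back in line with the method's contents.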


 
 
Now...if we always send #mapObjectReferencesInMachineCodeForBecome   after the "fast become" we will be updating all literals from machine code methods.

Um, will it? The mapping is done only to references to objects that are forwarded.  If that's going to do the trick then great.  But I don't know enough about your fast become to know.

I think I was wrong. Obviously, #mapObjectReferencesInMachineCodeForBecome  is based on the forwarding table. Igor's solution just swaps contents... it does nothing regarding the forwarding table.

OK, that was what I assumed.  So fast become on contexts is only safe if neither context is married, and only safe on methods if neither method is jitted.
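
The safety condition above can be written down as a guard. The following is a minimal Python sketch, not the primitive itself: `Obj` and its `jitted` and `married` flags are hypothetical stand-ins for whatever the VM would actually query, and the same-size and same-header-type test is taken from the description of Igor's primitive at the start of the thread.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    kind: str              # "method", "context" or "plain"
    size: int              # body size in bytes
    header_type: int       # stand-in for the object header type
    jitted: bool = False   # method has a machine-code version (hypothetical flag)
    married: bool = False  # context has an associated stack frame (hypothetical flag)

def has_vm_state(o: Obj) -> bool:
    """True if the VM holds derived state that a raw content swap would not update."""
    return (o.kind == "method" and o.jitted) or (o.kind == "context" and o.married)

def can_fast_become(a: Obj, b: Obj) -> bool:
    """Allow the swap only for same size, same header type and no hidden VM state."""
    if a.size != b.size or a.header_type != b.header_type:
        return False
    return not (has_vm_state(a) or has_vm_state(b))
```

When the guard answers false, the primitive would fail and the image side would fall back to the ordinary forwarding become:, as suggested earlier in the thread.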
 

Here I attach a slightly modified version of Igor's code so that you can take a look, at least to get an idea of what we are talking about. I attach the VMMaker changes and the image side. Personally, I think that the become is performed with arrays just because become is slow. If become were fast, we would do the loop on the image side, right?  Because of that, I think it is nice to have it in a separate primitive which falls back to the array-based become.

I'll take a look.  Swapping become is fast right?  The loop in the normal forwarding become is in the VM because forwarding become is indeed slow.  So IMO you don't need a bulk slow become, just a pair-wise slow become.


 


I didn't understand this one "  e.g. the Array literals in named primitives (because they hold target function pointers)"

Look at a method containing a named primitive that's in use and look at its first literal.  e.g.

(StandardFileStream >> #primRead:into:startingAt:count:) literalAt: 1 #(#FilePlugin #primitiveFileRead 0 12)

That 12 is meaningful to the VM.  See primitiveExternalCall:

External primitive methods first literals are an array of
* The module name (String | Symbol) 
* The function name (String | Symbol) 
* The session ID (SmallInteger) [OBSOLETE] 
* The function index (Integer) in the externalPrimitiveTable


Yes, I know that; what I don't understand is how that can be affected by the fast become. The external primitive table doesn't have a pointer to the method, but to the function address. So even if I become a named primitive, wouldn't the table still be correct?

If you become the first literal of a linked named primitive into something else it could potentially f**k up the VM.  You can't become that literal into anything you want and expect the VM to keep running.  For example, if you become it to an Array with an invalid index in it, it'll cause the VM to fetch a bogus function pointer and call it.  The last element of the Array (if non-zero) is the index into a table of external functions (the named primitive functions), so changing its value can cause an out-of-bounds access or fetch an invalid entry etc.  Chances are the VM will crash if the named primitive is invoked.
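
As an illustration of why that last element matters, here is a toy lookup in Python. It is a sketch under stated assumptions, not the VM's primitiveExternalCall: the function name `resolve_external_primitive`, the 1-based indexing and the explicit range check are all assumptions made for this example; only the four-element literal layout and the "non-zero means linked" rule come from the description above.

```python
def resolve_external_primitive(literal, table):
    """Return the function the literal is linked to, or None if it is not linked.

    Assumes (for this sketch only) a 1-based index and that 0 means "not linked".
    """
    module_name, function_name, session_id, index = literal
    if index == 0:
        return None
    if not isinstance(index, int) or not 1 <= index <= len(table):
        # Without a check like this, a literal swapped in by become could
        # select an arbitrary or nonexistent table entry.
        raise ValueError(f"bogus function index {index!r} in {literal!r}")
    return table[index - 1]

# A table of 12 stand-in "functions"; entry 12 plays the role of primitiveFileRead.
table = [(lambda n=n: f"function {n}") for n in range(1, 13)]
linked = ("FilePlugin", "primitiveFileRead", 0, 12)
unlinked = ("FilePlugin", "primitiveFileRead", 0, 0)
```

A literal such as `("FilePlugin", "primitiveFileRead", 0, 99)` is the kind of value a careless become could introduce; in this model it is rejected instead of being used to fetch a function.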


In any case, I guess we can do a #flushExternalPrimitives or #flushExternalPrimitives  to avoid possible problems.  Would that help?

If done right.  But that's not the point.  The point is that certain objects contain sensitive state that one can't just become and expect the VM to continue running.  For example, if you were to become class Message into something not class-like then the next time the system tried to do a doesNotUnderstand: it would construct an invalid instance and boom.  So in these cases you need to "not do that".  As I've said the set of objects includes the literals at the start of named primitive methods.



 
 
What happens with "Contexts (because they may have associated stack frames)."  ?  should we need to flush somehow or update stack frames?

Yes.  If you smash the state of a context that has an associated stack frame then the VM will likely crash.  See senders of externalDivorceFrame:andContext: to see where the VM disassociates contexts and their stack frames when an access to a context (e.g. changing its stack pointer or pc) necessitates it.
 

OK, and so that can happen if we become contexts?

I'm not sure I understand what you mean.  Of course becoming contexts can f**k things up.  Try thisContext sender become: Point new.  But the above is about how the VM optimizes contexts by mapping them to stack frames. The VM maintains a complex and delicate bi-directional mapping between stack frames and contexts so that most of the time it is able to use stack frames for execution.  It intercepts inst var accesses to contexts that are married to stack frames to "do the right thing" (e.g. alter the stack frame, or discard the stack frame, remember my blog post on the scheme?).  If you become contexts carelessly then this bidirectional mapping can become corrupted and the VM will likely crash.  Right now I've got away with not checking for married contexts in become operations, presumably because no-one has tried becoming contexts.  If you're about to start becoming contexts then the become operation(s) will need additional checks to prevent this corruption.
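
A toy Python model of how a content swap can corrupt such a two-way mapping is sketched below. `Context`, `Frame`, `marry` and `is_consistent` are hypothetical names for this illustration only and say nothing about how the VM actually represents married contexts.

```python
class Frame:
    def __init__(self, label):
        self.label = label
        self.context = None   # back-pointer to the married context, if any

class Context:
    def __init__(self, label):
        self.label = label
        self.frame = None     # None models a "single" context

def marry(context, frame):
    """Establish the two-way link between a context and its stack frame."""
    context.frame = frame
    frame.context = context

def is_consistent(context):
    """A context is fine if it is single or its frame points back at it."""
    return context.frame is None or context.frame.context is context

def swap_contents(a, b):
    """Model of the fast become: exchange contents, leave identities untouched."""
    a.label, b.label = b.label, a.label
    a.frame, b.frame = b.frame, a.frame

married, single = Context("married"), Context("single")
frame = Frame("frame")
marry(married, frame)
ok_before = is_consistent(married) and is_consistent(single)
swap_contents(married, single)
ok_after = is_consistent(married) and is_consistent(single)
```

After the swap, the frame still points back at the first context object, while the frame reference now lives in the second one, so the mapping no longer round-trips.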

 

Thanks Eliot.

 

Thanks!
 
 
Sorry for the noob question.

It's a good question :)
 

 
because i was thinking to just put a check in fast-become prim and
simply fail the prim if object type(s) to be swapped are not
supported, so user will be forced to use slow good-old #become:

I agree.  But you can do even better, by checking that the compiled method has a machine-code version, and/or checking that a context is "single" (has no associated stack state).  It doesn't need to fail if there isn't any special state.  Identifying the named primitive linking literals is more difficult...


Ideally, I would love to be able to do the fast become for all of them, even if that implies doing something extra for special cases (like flushing the method cache).

As they say, don't get caught.
 

 

--
best,
Eliot


Re: Igor's fast become for CompiledMethods in Cog

Igor Stasenko

On 2 February 2012 21:23, Eliot Miranda <[hidden email]> wrote:

>
>
>
> On Wed, Feb 1, 2012 at 8:19 AM, Mariano Martinez Peck <[hidden email]> wrote:
>>
>>
>>
>>
>> On Wed, Feb 1, 2012 at 12:28 AM, Eliot Miranda <[hidden email]> wrote:
>>>
>>>
>>>
>>>
>>> On Tue, Jan 31, 2012 at 3:13 PM, Mariano Martinez Peck <[hidden email]> wrote:
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jan 31, 2012 at 11:42 PM, Eliot Miranda <[hidden email]> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jan 31, 2012 at 12:41 PM, Mariano Martinez Peck <[hidden email]> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 31, 2012 at 9:28 PM, Eliot Miranda <[hidden email]> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jan 31, 2012 at 12:19 PM, Igor Stasenko <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 31 January 2012 20:50, Eliot Miranda <[hidden email]> wrote:
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Tue, Jan 31, 2012 at 11:22 AM, Mariano Martinez Peck <[hidden email]> wrote:
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> Hi Eliot. Me again :)   I was checking the changes Igor did some time ago for the fast become where he basically swapped the byte contents between the objects when they were the same size and same header type. He put such code in separate primitives and made some changes on the image side to call them. I have just played with them and they seem to work. I have 2 questions for you:
>>>>>>>> >>
>>>>>>>> >> 1) Do you think that this new fast become can have problems when becoming CompiledMethods? I am asking because of the JIT/Pic. Maybe I need a flushCache or something?
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Yes, almost certainly.  You'd want to do a flushCache on both methods.
>>>>>>>> >
>>>>>>>> are there other object types which we need to be careful with?
>>>>>>>
>>>>>>>
>>>>>>> There are a few.  e.g. the Array literals in named primitives (because they hold target function pointers).  CompiledMethods (because they may have associated machine code).  Contexts (because they may have associated stack frames).
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Eliot, I don't understand why we have these problems with the "fast become" but not with the normal one. What happens with each of your examples with the normal become? How are they solved?
>>>>>
>>>>>
>>>>> The "slow" become is implemented in terms of the GC's pointer-forwarding mechanism, which is used in normal garbage collection, not just become.  This machinery is the ObjectMemory>remap: machinery.  The JIT implements the same mapping machinery for literal objects embedded in machine code.  These include not just literals but also classes in inline-caches.  So it would seem that implementing markObject: and remap: for literals in jitted methods is all one needs to support GC and become:.  In fact, life is more complex because there is an optimization in the JIT to avoid scanning all of machine code on incremental GC.  The jit maintains a list of those methods that contain references to young objects and only scans this list on an incremental GC, and this list must be maintained correctly.  Hence there are three different remap routines in the jit,
>>>>>
>>>>> Cogit>mapObjectReferencesInMachineCodeForIncrementalGC
>>>>> "Update all references to objects in machine code for an incremental gc.
>>>>> Avoid scanning all code by using the youngReferrers list.  In an incremental
>>>>> GC a method referring to young may no longer refer to young, but a method
>>>>> not referring to young cannot and will not refer to young afterwards."
>>>>>
>>>>> Cogit>mapObjectReferencesInMachineCodeForFullGC
>>>>> "Update all references to objects in machine code for a full gc.  Since
>>>>> the current (New)ObjectMemory GC makes everything old in a full GC
>>>>> a method not referring to young will not refer to young afterwards"
>>>>>
>>>>>
>>>>> Cogit>mapObjectReferencesInMachineCodeForBecome
>>>>> "Update all references to objects in machine code for a become.
>>>>> Unlike incrementalGC or fullGC a method that does not refer to young
>>>>> may refer to young as a result of the become operation."
>>>>
>>>>
>>>>
>>>>
>>>> Aha. Ok. Now I see. So, let me see if I understand. So the problem of CompiledMethod gets fixed if we flush its cache. Right?
>>>
>>>
>>> I hope so.  The current implementation reads
>>>
>>> CoInterpreterPrimitives>primitiveFlushCacheByMethod
>>> "The receiver is a compiledMethod.  Clear all entries in the method lookup cache that
>>> refer to this method, presumably because it has been redefined, overridden or removed.
>>> Override to flush appropriate machine code caches also."
>>> super primitiveFlushCacheByMethod.
>>> cogit unlinkSendsTo: self stackTop
>>>
>>> That may not be enough.  The VM may have to throw the method away.  Tests will show whether merely unlinking is sufficient.
>>
>>
>>
>> Eliot, I have to admit that the discussion is going beyond my understanding :(
>
>
> So you need to write tests that verify that the VM can cope with fast becoming a jitted method, and perhaps can cope with fast becoming a jitted method with activations.  So the tests need to check that they are operating on at least one jitted method and perhaps a jitted method with an activation.  Just assuming that running code will cause the jit to compile and that activations exist isn't really good enough for tests.  The VM discards methods to make room for new ones, so IMO, to know that one is actually testing what one wants to test, the tests need to use something like an xray primitive.
>
>>
>>
>>>
>>> The issue is that if the method is being used then throwing it away may involve flushing the stack activations of the method, and that makes the implementation much more complex.
>>>
>>
>> Ok, I understand that, but again I don't understand why
>> a) this does not happen with the "slow become". Is it again because of the forwarding table?
>
>
> I think that an attempt to become an active context will probably cause the VM to crash.  I'd be interested in seeing a test that becomes contexts that worked on the interpreter VM.  I doubt it'll work in current Cog.  While the remap machinery (forwarding table) does update stack frames, a become on a married context, between a single and a married context, or between married contexts will almost certainly break the two-way mapping between the married context(s) and its/their stack frame(s).
>
>
>> b) it doesn't happen when we flush methods. For example, when using TestCoverage, where we put objects as methods and use #run:with:in:, we flush the method cache... why doesn't that need to flush stack activations?  Just because in that case we are sure that the method is not being used?
>
>
> Flushing the method cache doesn't throw away jitted methods; it merely clears the method lookup cache and unlinks all linked sends so that any subsequent look-up is redone.  But the VM assumes one doesn't change the contents of a compiled method after it is created, and hence that the machine code is valid.  The VM plays fast and loose with CompiledMethod>>#objectAt:put:, not redoing any compilation or activation flushing when one does objectAt:put: on a jitted method, because I've got away with it so far.  I think in your case I can still get away with it since the method isn't used after become: has turned it into a proxy, but you may need to use the SmalltalkImage>>voidCogVMState primitive (#214) after faulting out code to ensure that any faulted out methods are really gone.
>
> The bottom line here is that uses of become on objects that the VM is aggressively optimising (contexts & methods) need to be done with care, and the engineering of this needs to find a compromise between generality and optimisation.  Again, good tests that stress the system are essential in our being able to understand the trade-offs and implement correctly.
>
>>
>>
>>>
>>>
>>>>
>>>> Now...if we always send #mapObjectReferencesInMachineCodeForBecome   after the "fast become" we will be updating all literals from machine code methods.
>>>
>>>
>>> Um, will it? The mapping is done only to references to objects that are forwarded.  If that's going to do the trick then great.  But I don't know enough about your fast become to know.
>>
>>
>> I think I was wrong. Obviously, #mapObjectReferencesInMachineCodeForBecome  is based on the forwarding table. Igor's solution just swaps contents... it does nothing regarding the forwarding table.
>
>
> OK, that was what I assumed.  So fast become on contexts is only safe if neither context is married, and only safe on methods if neither method is jitted.
>
>>
>>
>> Here I attach a slightly modified version of Igor's code so that you can take a look, at least to get an idea of what we are talking about. I attach the VMMaker changes and the image side. Personally, I think that the become is performed with arrays just because become is slow. If become were fast, we would do the loop on the image side, right?  Because of that, I think it is nice to have it in a separate primitive which falls back to the array-based become.
>
>
> I'll take a look.  Swapping become is fast right?  The loop in the normal forwarding become is in the VM because forwarding become is indeed slow.  So IMO you don't need a bulk slow become, just a pair-wise slow become.
>
>>
>>
>>>
>>>
>>>
>>>> I didn't understand this one "  e.g. the Array literals in named primitives (because they hold target function pointers)"
>>>
>>>
>>> Look at a method containing a named primitive that's in use and look at its first literal.  e.g.
>>>
>>> (StandardFileStream >> #primRead:into:startingAt:count:) literalAt: 1 #(#FilePlugin #primitiveFileRead 0 12)
>>>
>>> That 12 is meaningful to the VM.  See primitiveExternalCall:
>>>
>>> External primitive methods first literals are an array of
>>> * The module name (String | Symbol)
>>> * The function name (String | Symbol)
>>> * The session ID (SmallInteger) [OBSOLETE]
>>> * The function index (Integer) in the externalPrimitiveTable
>>
>>
>>
>> Yes, I know that; what I don't understand is how that can be affected by the fast become. The external primitive table doesn't have a pointer to the method, but to the function address. So even if I become a named primitive, wouldn't the table still be correct?
>
>
> If you become the first literal of a linked named primitive into something else it could potentially f**k up the VM.  You can't become that literal into anything you want and expect the VM to keep running.  For example, if you become it to an Array with an invalid index in it, it'll cause the VM to fetch a bogus function pointer and call it.  The last element of the Array (if non-zero) is the index into a table of external functions (the named primitive functions), so changing its value can cause an out-of-bounds access or fetch an invalid entry etc.  Chances are the VM will crash if the named primitive is invoked.
>

So, this can be ignored.
I can hardly see a potential scenario where manipulating the first
literal array makes any sense. And even if there is such a strange use,
a become primitive cannot do anything against it, because if you pass
a pair of arrays to the prim, there is no way to tell whether they are
used as a method's first literal or not.



>
>> In any case, I guess we can do a #flushExternalPrimitives or #flushExternalPrimitives  to avoid possible problems.  Would that help?
>
>
> If done right.  But that's not the point.  The point is that certain objects contain sensitive state that one can't just become and expect the VM to continue running.  For example, if you were to become class Message into something not class-like then the next time the system tried to do a doesNotUnderstand: it would construct an invalid instance and boom.  So in these cases you need to "not do that".  As I've said the set of objects includes the literals at the start of named primitive methods.
>
>
>>
>>
>>>
>>>
>>>>
>>>> What happens with "Contexts (because they may have associated stack frames)."  ?  should we need to flush somehow or update stack frames?
>>>
>>>
>>> Yes.  If you smash the state of a context that has an associated stack frame then the VM will likely crash.  See senders of externalDivorceFrame:andContext: to see where the VM disassociates contexts and their stack frames when an access to a context (e.g. changing its stack pointer or pc) necessitates it.
>>>
>>
>>
>> OK, and so can that happen if we become contexts?
>
>
> I'm not sure I understand what you mean.  Of course becoming contexts can f**k things up.  Try thisContext sender become: Point new.  But the above is about how the VM optimizes contexts by mapping them to stack frames. The VM maintains a complex and delicate bi-directional mapping between stack frames and contexts so that most of the time it is able to use stack frames for execution.  It intercepts inst var accesses to contexts that are married to stack frames to "do the right thing" (e.g. alter the stack frame, or discard the stack frame; remember my blog post on the scheme?).  If you become contexts carelessly then this bidirectional mapping can become corrupted and the VM will likely crash.  Right now I've got away with not checking for married contexts in become operations, presumably because no-one has tried becoming contexts.  If you're about to start becoming contexts then the become operation(s) will need additional checks to prevent this corruption.
>
>
>>
>>
>> Thanks Eliot.
>>
>>
>>>>
>>>>
>>>> Thanks!
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Sorry for the noob question.
>>>>>
>>>>>
>>>>> It's a good question :)
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>>
>>>>>>>> because i was thinking to just put a check in fast-become prim and
>>>>>>>> simply fail the prim if object type(s) to be swapped are not
>>>>>>>> supported, so user will be forced to use slow good-old #become:
>>>>>>>
>>>>>>>
>>>>>>> I agree.  But you can do even better, by checking that the compiled method has a machine-code version, and/or checking that a context is "single" (has no associated stack state).  It doesn't need to fail if there isn't any special state.  Identifying the named primitive linking literals is more difficult...
>>>>>>>
>>>>>>
>>>>>> Ideally, I would love to be able to do the fast become for all of them, even if that implies doing something extra for special cases (like flushing the method cache).
>>>>>
>>>>>
>>>>> As they say, don't get caught.
>>>>>



--
Best regards,
Igor Stasenko.

Re: Igor's fast become for CompiledMethods in Cog

Mariano Martinez Peck
In reply to this post by Eliot Miranda-2
 


On Thu, Feb 2, 2012 at 9:23 PM, Eliot Miranda <[hidden email]> wrote:
 


On Wed, Feb 1, 2012 at 8:19 AM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Wed, Feb 1, 2012 at 12:28 AM, Eliot Miranda <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 3:13 PM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 11:42 PM, Eliot Miranda <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 12:41 PM, Mariano Martinez Peck <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 9:28 PM, Eliot Miranda <[hidden email]> wrote:
 


On Tue, Jan 31, 2012 at 12:19 PM, Igor Stasenko <[hidden email]> wrote:

On 31 January 2012 20:50, Eliot Miranda <[hidden email]> wrote:
>
>
>
> On Tue, Jan 31, 2012 at 11:22 AM, Mariano Martinez Peck <[hidden email]> wrote:
>>
>>
>> Hi Eliot. Me again :)   I was checking the changes Igor did some time ago for the fast become, where he basically swapped the byte contents between the objects when they were the same size and same header type. He put such code in separate primitives and made some changes on the image side to call them. I have just played with them and they seem to work. I have 2 questions for you:
>>
>> 1) Do you think that this new fast become can have problems when becoming CompiledMethods? I am asking because of the JIT/Pic. Maybe I need a flushCache or something?
>
>
> Yes, almost certainly.  You'd want to do a flushCache on both methods.
>
are there other object types which we need to be careful with?

There are a few.  e.g. the Array literals in named primitives (because they hold target function pointers).  CompiledMethods (because they may have associated machine code).  Contexts (because they may have associated stack frames).
 

Eliot, I don't understand why we have these problems with the "fast become" but not with the normal one. What happens with each of your examples with the normal become? How are they solved?

The "slow" become is implemented in terms of the GC's pointer-forwarding mechanism, which is used in normal garbage collection, not just become.  This machinery is the ObjectMemory>remap: machinery.  The JIT implements the same mapping machinery for literal objects embedded in machine code.  These include not just literals but also classes in inline-caches.  So it would seem that implementing markObject: and remap: for literals in jitted methods is all one needs to support GC and become:.  In fact, life is more complex because there is an optimization in the JIT to avoid scanning all of machine code on incremental GC.  The JIT maintains a list of those methods that contain references to young objects and only scans this list on an incremental GC, and this list must be maintained correctly.  Hence there are three different remap routines in the JIT:

Cogit>mapObjectReferencesInMachineCodeForIncrementalGC
"Update all references to objects in machine code for an incremental gc.
Avoid scanning all code by using the youngReferrers list.  In an incremental
GC a method referring to young may no longer refer to young, but a method
not referring to young cannot and will not refer to young afterwards."

Cogit>mapObjectReferencesInMachineCodeForFullGC
"Update all references to objects in machine code for a full gc.  Since
the current (New)ObjectMemory GC makes everything old in a full GC
a method not referring to young will not refer to young afterwards"


Cogit>mapObjectReferencesInMachineCodeForBecome
"Update all references to objects in machine code for a become.
Unlike incrementalGC or fullGC a method that does not refer to young
may refer to young as a result of the become operation."
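
[Editorial sketch] The point of the third comment can be illustrated with a toy model in Python. This is not VM code: objects are plain integer "addresses", methods are lists of literal addresses, and the names `referrers_to_young` and `forwarding_become` are invented for illustration. It only shows why a list of young referrers computed before a become cannot be trusted afterwards.

```python
# Toy model only: objects are integer "addresses", methods are lists of
# literal addresses. Nothing here is taken from the Cog sources.

def referrers_to_young(methods, young):
    """Return the names of methods holding at least one young literal."""
    return {name for name, literals in methods.items()
            if any(lit in young for lit in literals)}

def forwarding_become(methods, a, b):
    """Two-way become by remapping: references to a become b and vice versa."""
    forward = {a: b, b: a}
    return {name: [forward.get(lit, lit) for lit in literals]
            for name, literals in methods.items()}

young = {100}                                # address 100 lives in new space
methods = {"m1": [1, 2], "m2": [3, 100]}     # only m2 refers to a young object
young_referrers = referrers_to_young(methods, young)

# Become old object 1 with young object 100: m1 now refers to young.
methods = forwarding_become(methods, 1, 100)
fresh_list = referrers_to_young(methods, young)

print(sorted(young_referrers), sorted(fresh_list))
```

In this model the list built before the become names only `m2`, while a rescan afterwards names only `m1`, which is the situation the become-specific routine has to handle.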



Aha. Ok. Now I see. So, let me see if I understand. So the problem of CompiledMethod gets fixed if we flush its cache. Right?

I hope so.  The current implementation reads

CoInterpreterPrimitives>primitiveFlushCacheByMethod
"The receiver is a compiledMethod.  Clear all entries in the method lookup cache that
refer to this method, presumably because it has been redefined, overridden or removed.
Override to flush appropriate machine code caches also."
super primitiveFlushCacheByMethod.
cogit unlinkSendsTo: self stackTop

That may not be enough.  The VM may have to throw the method away.  Tests will show whether merely unlinking is sufficient.  


Eliot, I have to admit that the discussion is going beyond my understanding :(

So you need to write tests that verify that the VM can cope with fast becoming a jitted method, and perhaps can cope with fast becoming a jitted method with activations.  So the tests need to check that they are operating on at least one jitted method and perhaps a jitted method with an activation.  Just assuming that running code will cause the jit to compile and that activations exist isn't really good enough for tests.  The VM discards methods to make room for new ones, so IMO to know that one is actually testing what one wants to test, the tests need to use something like an xray primitive.

Ok, I understand.
 


 
The issue is that if the method is being used then throwing it away may involve flushing the stack activations of the method, and that makes the implementation much more complex.


Ok, I understand that, but again I don't understand why
a) this does not happen with the "slow become". Is it again because of the forwarding table?

I think that an attempt to become an active context will probably cause the VM to crash.  I'd be interested in seeing a test that becomes contexts that worked on the interpreter VM.  I doubt it'll work in current Cog.  While the remap machinery (forwarding table) does update stack frames, a become on a married context, between a single and a married context, or between married contexts will almost certainly break the two-way mapping between the married context(s) and its/their stack frame(s).

I see. So I won't become contexts :)
 


b) it doesn't happen when we flush methods. For example, when using TestCoverage, where we put objects in place of methods and use #run:with:in:, we flush the method cache... why doesn't that need to flush stack activations?  Just because in that case we are sure that the method is not being used?

Flushing the method cache doesn't throw away jitted methods, it merely clears the method lookup cache and unlinks all linked sends so that any subsequent look-up is redone.  But the VM assumes one doesn't change the contents of a compiled method after it is created, and hence that the machine code is valid.  The VM plays fast and loose with CompiledMethod>>#objectAt:put:, not redoing any compilation or activation flushing when one does objectAt:put: on a jitted method, because I've got away with it so far.  I think in your case I can still get away with it since the method isn't used after a become: has turned it into a proxy, but you may need to use the SmalltalkImage>>voidCogVMState primitive (#214) after faulting out code to ensure that any faulted out methods are really gone.


Ahhh ok, that's what I didn't know. I thought that the flush not only flushed the caches but also discarded the jitted versions. Now if I send #voidCogVMState after each become, I can become CompiledMethods without problems :)
 
The bottom line here is that uses of become on objects that the VM is aggressively optimising (contexts & methods) need to be done with care

Indeed.
 
and the engineering of this needs to find a compromise between generality and optimisation.  Again good tests that stress the system are essential in our being able to understand the trade-offs and implement correctly.


 
 
Now...if we always send #mapObjectReferencesInMachineCodeForBecome   after the "fast become" we will be updating all literals from machine code methods.

Um, will it? The mapping is done only to references to objects that are forwarded.  If that's going to do the trick then great.  But I don't know enough about your fast become to know.

I think I was wrong. Obviously, #mapObjectReferencesInMachineCodeForBecome  is based on the forwarding table. Igor's solution just swaps contents... it does nothing regarding the forwarding table.

OK, that was what I assumed.  So fast become on contexts is only safe if neither context is married, and only safe on methods if neither method is jitted.
 

Exactly.  But even with those limitations it may be worth it.
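
[Editorial sketch] The difference between the two kinds of become discussed above can be shown with a small Python toy model. It is an analogy under stated assumptions, not VM code: the "heap" is a dict from address to contents, the "cache" stands in for machine code derived from a method's contents, and `compile_stub`, `swap_become`, `forwarding_become` and `cache_is_valid` are all hypothetical names.

```python
# Toy model only: a "heap" maps addresses to contents, and a JIT-like cache
# holds data derived from the contents found at an address.

def compile_stub(contents):
    """Stand-in for jitting: derive something from the method's contents."""
    return "code for " + contents

def swap_become(heap, a, b):
    """Fast become: exchange contents in place; no reference is touched."""
    heap[a], heap[b] = heap[b], heap[a]

def forwarding_become(refs, a, b):
    """Slow become: leave contents alone and remap every reference."""
    forward = {a: b, b: a}
    return {name: forward.get(addr, addr) for name, addr in refs.items()}

def cache_is_valid(heap, cache):
    """True if every cached entry still matches the contents at its address."""
    return all(code == compile_stub(heap[addr]) for addr, code in cache.items())

heap = {1: "method A", 2: "method B"}
refs = {"dict entry": 1}
cache = {1: compile_stub(heap[1])}      # address 1 has been "jitted"

refs_after_slow = forwarding_become(refs, 1, 2)
valid_after_slow = cache_is_valid(heap, cache)   # contents did not move

swap_become(heap, 1, 2)
valid_after_fast = cache_is_valid(heap, cache)   # contents changed under the cache
```

In this model the remapping become leaves the derived data consistent, while the content swap leaves the cache describing contents that are no longer at that address. That is the toy counterpart of "only safe on methods if neither method is jitted".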

 

Here I attach a slightly modified version of Igor's code so that you can take a look, at least to get an idea of what we are talking about. I attach the VMMaker changes and the image side. Personally, I think that performing the become with arrays is done just because the become is slow. If become were fast, we would do the loop on the image side, right?  Because of that, I think it is nice to have it in a separate primitive which falls back to the array version.

I'll take a look.  Swapping become is fast right?
yes
 The loop in the normal forwarding become is in the VM because forwarding become is indeed slow.  

yes
So IMO you don't need a bulk slow become, just a pair-wise slow become.


I didn't catch this one.
 

 


I didn't understand this one "  e.g. the Array literals in named primitives (because they hold target function pointers)"

Look at a method containing a named primitive that's in use and look at its first literal.  e.g.

(StandardFileStream >> #primRead:into:startingAt:count:) literalAt: 1 #(#FilePlugin #primitiveFileRead 0 12)

That 12 is meaningful to the VM.  See primitiveExternalCall:

External primitive methods first literals are an array of
* The module name (String | Symbol) 
* The function name (String | Symbol) 
* The session ID (SmallInteger) [OBSOLETE] 
* The function index (Integer) in the externalPrimitiveTable


Yes, I know that; what I don't understand is how that can be affected by the fast become. The external primitive table doesn't have a pointer to the method, but to the function address. So even if I become a named primitive, wouldn't the table still be correct?

If you become the first literal of a linked named primitive into something else it could potentially f**k up the VM.  You can't become that literal into anything you want and expect the VM to keep running.  For example, if you become it to an Array with an invalid index in it, it'll cause the VM to fetch a bogus function pointer and call it.  The last element of the Array (if non-zero) is the index into a table of external functions (the named primitive functions), so changing its value can cause an out-of-bounds access or fetch an invalid entry etc.  Chances are the VM will crash if the named primitive is invoked.


Ok, I can imagine, and it makes sense. In my use-case at least, this doesn't happen luckily.
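
[Editorial sketch] To make the hazard concrete, here is a Python toy model of a lookup that uses the last element of the literal array as a table index. The table contents and the function name `call_named_primitive` are invented for illustration, and the bounds check shown is an assumption about what a defensive lookup could look like, not a description of what the VM does.

```python
# Toy model only: the table and its single entry are made up.
# Index 0 is treated as "not linked yet", as in the description above.
external_primitive_table = [None, lambda: "read ok"]

def call_named_primitive(first_literal):
    """Use the last element of the literal array as a table index, with checks."""
    index = first_literal[-1]
    if not isinstance(index, int) or not 0 < index < len(external_primitive_table):
        return "primitive failed"      # without this check: out-of-bounds fetch
    return external_primitive_table[index]()

linked = ["FilePlugin", "primitiveFileRead", 0, 1]
smashed = ["FilePlugin", "primitiveFileRead", 0, 12]   # index beyond the toy table

print(call_named_primitive(linked))
print(call_named_primitive(smashed))
```

An unchecked version of the same lookup would index past the end of the table, which is the toy counterpart of fetching a bogus function pointer.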
 

In any case, I guess we can do a #flushExternalPrimitives or #flushExternalPrimitives  to avoid possible problems.  Would that help?

If done right.  But that's not the point.  The point is that certain objects contain sensitive state that one can't just become and expect the VM to continue running.  For example, if you were to become class Message into something not class-like then the next time the system tried to do a doesNotUnderstand: it would construct an invalid instance and boom.  So in these cases you need to "not do that".  As I've said the set of objects includes the literals at the start of named primitive methods.


I understand.
 


 
 
What happens with "Contexts (because they may have associated stack frames)."  ?  should we need to flush somehow or update stack frames?

Yes.  If you smash the state of a context that has an associated stack frame then the VM will likely crash.  See senders of externalDivorceFrame:andContext: to see where the VM disassociates contexts and their stack frames when an access to a context (e.g. changing its stack pointer or pc) necessitates it.
 

OK, and so can that happen if we become contexts?

I'm not sure I understand what you mean.  Of course becoming contexts can f**k things up.  Try thisContext sender become: Point new.  But the above is about how the VM optimizes contexts by mapping them to stack frames. The VM maintains a complex and delicate bi-directional mapping between stack frames and contexts so that most of the time it is able to use stack frames for execution.  It intercepts inst var accesses to contexts that are married to stack frames to "do the right thing" (e.g. alter the stack frame, or discard the stack frame; remember my blog post on the scheme?).  If you become contexts carelessly then this bidirectional mapping can become corrupted and the VM will likely crash.  Right now I've got away with not checking for married contexts in become operations, presumably because no-one has tried becoming contexts.  If you're about to start becoming contexts then the become operation(s) will need additional checks to prevent this corruption.

I understand. I think that for the moment I will just "avoid" becoming contexts.

Thanks Eliot for your help.
 

 

Thanks Eliot.

 

Thanks!
 
 
Sorry for the noob question.

It's a good question :)
 

 
because i was thinking to just put a check in fast-become prim and
simply fail the prim if object type(s) to be swapped are not
supported, so user will be forced to use slow good-old #become:

I agree.  But you can do even better, by checking that the compiled method has a machine-code version, and/or checking that a context is "single" (has no associated stack state).  It doesn't need to fail if there isn't any special state.  Identifying the named primitive linking literals is more difficult...
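
[Editorial sketch] The "fail the primitive unless there is no special state" idea can be sketched as a Python toy model. The `Obj` record, its `jitted` and `married` flags, and the function names are all hypothetical; a real primitive would inspect VM state rather than flags, and would also check the same-size and same-header-type conditions mentioned at the start of the thread.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    kind: str                # "method", "context" or "plain"
    contents: str
    jitted: bool = False     # method has a machine-code version
    married: bool = False    # context has an associated stack frame

def has_special_state(obj):
    """True if swapping this object's contents could confuse the VM."""
    return ((obj.kind == "method" and obj.jitted)
            or (obj.kind == "context" and obj.married))

def fast_become(a, b):
    """Swap contents and return True, or return False so the caller falls back."""
    if has_special_state(a) or has_special_state(b):
        return False
    a.contents, b.contents = b.contents, a.contents
    return True

cold_a, cold_b = Obj("method", "A"), Obj("method", "B")
hot = Obj("method", "C", jitted=True)

swapped = fast_become(cold_a, cold_b)   # neither is jitted: swap goes ahead
refused = fast_become(cold_a, hot)      # one is jitted: fail, use slow become
```

The design choice here is that failure is cheap and leaves both objects untouched, so the image-side caller can simply retry with the ordinary #become:.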


Ideally, I would love to be able to do the fast become for all of them, even if that implies doing something extra for special cases (like flushing the method cache).

As they say, don't get caught.
 

 

--
Mariano
http://marianopeck.wordpress.com