Bytecodes are stored in the ENVY/Manager

Seth Berman
Hello All,

We have received a few questions about the loading behavior of user code imported into the Beta builds.
There is some non-obvious stuff going on, so I would like to give some background on how the ENVY/Manager is involved when you load code or compile it from Smalltalk source.

First, I'll provide one of the symptoms in case you have seen it before.
It goes something like this:
1. I exported some code from my main ENVY/Manager (.dat)
2. I imported the code into the Beta ENVY/Manager
3. The code loads in the 32-bit beta image (albeit maybe with some unmanaged-namespace entries in the Transcript)
4. The code does NOT load in the 64-bit beta image (compiler errors).
5. Why?

The reason has to do with what is probably at the top of my most-hated-feature list: the result of Smalltalk source compilation (i.e. the bytecodes) is stored in the ENVY/Manager your image is attached to.
Since there have been multiple versions of bytecodes over the history of VAST...the bytecodes are stored in the ENVY/Manager by the bytecode version level of your image (along with other details such as the method timestamp).

When you load code into your image, your image makes a request of the ENVY/Manager.
You might intuitively guess that this request is for the Smalltalk source code, which your image would then compile into bytecodes to create the resulting CompiledMethod.
But this isn't true...at least not always.

What actually happens is:
  • The first time you add some new code, the Smalltalk compiler uses the source code you created, along with the current environment (required for things like resolving classes, pools and other things), to produce the compiled bytecodes.
  • The source code is stored in the ENVY/Manager as you would expect...but the resulting bytecodes are stored there as well.
  • The next time you load that code, the default behavior is to check whether the ENVY/Manager already has bytecodes available for that edition of the method and bytecode version level.
  • If it does, source compilation is not involved at all.  The bytecodes are grabbed from the ENVY/Manager and linked directly into your new CompiledMethod...as is.
  • If it does not, then the image must compile the method from source code to produce the bytecodes that are used to create your new CompiledMethod.
  • This is why folks may notice different loading behavior with their applications in 64-bit.  The 64-bit image (due to the different bytecode version level) has never seen this code before.  The ENVY/Manager has never seen this code before either for that bytecode version level, and has no cached bytecodes to offer that would let the image bypass the compiler.  Therefore the code must be compiled from source, which means the source code should be...compilable in the given environment.
  • When exporting code...the bytecodes go with it.  This means when you import code into another ENVY/Manager...you're importing the bytecodes too.  So the new Beta 32-bit image might be able to load code the 64-bit image can't, because it has the same bytecode version level as the production 32-bit images...which allows it to link directly from the ENVY/Manager's cached bytecodes, if present.
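The load path described in the bullets above can be sketched as a toy model. This is not the ENVY API: `Manager`, `compile_source`, `load_method`, the edition name `Foo>>bar#1` and the level labels `A` and `B` are all hypothetical names made up for illustration, and the "compiler" only checks that capitalized names exist in the environment. The `use_linker` flag stands in for the "EmMethodEdition useLinker:" preference discussed later in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Manager:
    """Toy stand-in for an ENVY/Manager library (hypothetical, not the real API)."""
    sources: dict = field(default_factory=dict)    # edition -> source text
    bytecodes: dict = field(default_factory=dict)  # (edition, bytecode level) -> bytecodes


def compile_source(source, environment):
    """Hypothetical compiler: fails if the source names a class the image lacks."""
    missing = [word for word in source.split()
               if word[:1].isupper() and word not in environment]
    if missing:
        raise SyntaxError(f"undeclared: {missing}")
    return f"bytecodes<{source}>"


def load_method(manager, edition, level, environment, use_linker=True):
    """Return (bytecodes, how) for one method edition at one bytecode level."""
    key = (edition, level)
    if use_linker and key in manager.bytecodes:
        # Cached bytecodes exist for this edition and level: link them as is.
        return manager.bytecodes[key], "linked"
    # No usable cache entry: the source must compile in this environment.
    code = compile_source(manager.sources[edition], environment)
    manager.bytecodes[key] = code  # stored next to the source for next time
    return code, "compiled"


manager = Manager()
manager.sources["Foo>>bar#1"] = "OldHelper run"
# The import brought bytecodes along, but only for level "A" (the 32-bit level here).
manager.bytecodes[("Foo>>bar#1", "A")] = "bytecodes<OldHelper run>"
beta_environment = {"Object"}  # OldHelper is not defined in the beta images

print(load_method(manager, "Foo>>bar#1", "A", beta_environment)[1])  # linked
try:
    load_method(manager, "Foo>>bar#1", "B", beta_environment)  # level "B": no cache
except SyntaxError as error:
    print("load failed:", error)
```

In this model the level "A" load succeeds only because the compiler never runs; passing `use_linker=False` makes both loads fail the same way, which is the point of turning the linker off.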
What's the advantage of this?
  • We can only speculate, but undeniably if you skip the compiler phase and use the cached bytecode result instead...then you are doing less work.  The time savings may have been considerable back in the 90s...I'm not really sure.  But when I loaded all features into the image and forced it to compile from source (instead of linking in pre-existing bytecodes), the difference was a negligible 3-4% in my benchmarks.  So today, I do not think performance is a very meaningful factor.
  • You could load code that would otherwise not compile.  I consider this a disadvantage, but if one's measure of success is based on the ability of the code to load or not...then technically I must put it here.
  • From here we could jump to all sorts of obscure use cases that one might list as an advantage.  At the end of the day, they would all probably relate to the fact that the destination image can no longer compile the code from source in its environment...but one still has the ability to load it into that image in CompiledMethod form.  As to what would happen if/when you ran that bytecode on the virtual machine...who knows?  Anything from hard GPF crashes to "it seems to work fine" simply because the code was never called.
What's the disadvantage of this?
  • You could load code that would otherwise not compile.  If the source code doesn't compile in the environment you're compiling it in...then something is wrong.  One may have arranged things such that it doesn't matter, but I would probably think about finding another way to do it.
  • Bytecode virus.  We said exporting code brings any bytecodes along with it.  Let's say you produced a bad refactoring...either by your own hands or with the refactoring browser.  Maybe you moved instance variable positions and the bytecodes were not updated to reflect that.  Or maybe an inst var was deleted and the methods that reference that inst var were not updated accordingly.  Now you export your code to the main shared repository.  Others load your editions and link in your invalid bytecodes.  This can lead to obscure GPFs you might not be able to explain.
  • As stated, no significant performance benefit.
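To make the "bytecode virus" bullet concrete, here is a small sketch of why stale bytecodes can misbehave after instance variables move. It assumes, as is typical for Smalltalk bytecode sets, that an instance variable access is compiled down to a slot index rather than a name; the opcode name `push_inst_var` and the helpers `compile_getter` and `run` are hypothetical, not VAST internals.

```python
def compile_getter(var_name, layout):
    """Hypothetical compiler: resolve an instance variable name to a slot index."""
    slot = layout.index(var_name)
    return ("push_inst_var", slot)  # the name is gone; only the index remains


def run(bytecode, instance_slots):
    """Hypothetical interpreter for the single opcode used in this sketch."""
    op, slot = bytecode
    assert op == "push_inst_var"
    return instance_slots[slot]


old_layout = ["name", "balance"]
cached = compile_getter("balance", old_layout)  # compiled against the old shape

# A refactoring swaps the instance variables, but the cached bytecodes are kept.
new_layout = ["balance", "name"]
account = [100, "Alice"]  # slots laid out per new_layout

print(run(cached, account))                                 # Alice (stale bytecodes read the wrong slot)
print(run(compile_getter("balance", new_layout), account))  # 100 (recompiled from source)
```

Here the stale bytecodes merely return the wrong value; on a real virtual machine an out-of-range or mistyped slot access is the kind of thing that could end in a GPF instead.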
What to do?
There is a section in the "Preferences Workspace" (Transcript Menu -> Tools -> Open Preferences Workspace) called "Configuration Management".
In this section, you will see some options to instruct the image what to do in various situations involving the loading of code.
One of the options is called "EmMethodEdition useLinker: <true|false>"
By default, this option is true, meaning the image will link cached bytecodes from the ENVY/Manager if they exist.
But if you set it to false, then loading a method will always involve compiling the source code.

At the very least, I would recommend turning the linker off and taking a look at how your application loads...and whether it loads cleanly in a given environment.

-- Seth


--
You received this message because you are subscribed to the Google Groups "VA Smalltalk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/va-smalltalk.
For more options, visit https://groups.google.com/d/optout.
Re: Bytecodes are stored in the ENVY/Manager

Seth Berman
Hello All,

John O'Keefe reminded me of how this business with "bytecodes stored in ENVY" played a role historically (or maybe for some folks...currently).
I'm still not a fan, but at least for me this explanation moves the concept from my mental category of "nonsense" to "acceptable".
I do wish it was not part of the normal everyday workflow with ENVY, and rather a cleanly separated feature, but it does make sense.
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
From John --  

"Here is the most important reason bytecodes are stored (historically) in ENVY. It provides support for methods with hidden source. 

Back in the 'good old days', much of the base Smalltalk code from OTI (probably all the private methods and maybe even some of the public methods) was shipped with hidden source. Also, many third-party add-on products shipped with hidden source -- we have gotten some cases over the years because of this.

None of the VA Smalltalk code contains hidden source anymore, but the add-on products with hidden source are still out there."
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

So, much like third-party commercial DLLs that ship without their source code,
this capability allows VA Smalltalk applications to be distributed in a similar manner...without their source code...but still executable.

Thank you John

-- Seth


On Monday, July 31, 2017 at 1:42:05 PM UTC-4, Seth Berman wrote:
Re: Bytecodes are stored in the ENVY/Manager

Hans-Martin Mosner-3
Hidden source code is a nuisance, but hidden source code with bugs in it is a major PITA. I went so far as to write a VAST bytecode decompiler to reverse engineer TOPLink code that was behaving wrong. Not nice, but it allows us to continue working with a long-unmaintained product. I still remember when I first used Smalltalk-80 and was blown away by its openness regarding source code, and when I was first exposed to ENVY it was quite a shock to see that OTI hid so much of their code that fixing or extending it was basically impossible.

Hans-Martin
