The Trunk: Kernel-eem.1067.mcz

The Trunk: Kernel-eem.1067.mcz

commits-2
Eliot Miranda uploaded a new version of Kernel to project The Trunk:
http://source.squeak.org/trunk/Kernel-eem.1067.mcz

==================== Summary ====================

Name: Kernel-eem.1067
Author: eem
Time: 22 March 2017, 1:20:50.066536 pm
UUID: 0073e975-c0c9-4923-9702-88e51890ae33
Ancestors: Kernel-eem.1066

Add CompiledCode and CompiledBlock to the package explicitly, now that the script has created them. Provide the class comments.

=============== Diff against Kernel-eem.1066 ===============

Item was added:
+ CompiledCode variableByteSubclass: #CompiledBlock
+ instanceVariableNames: ''
+ classVariableNames: ''
+ poolDictionaries: ''
+ category: 'Kernel-Methods'!
+
+ !CompiledBlock commentStamp: 'eem 3/22/2017 12:10' prior: 0!
+ CompiledBlock instances are blocks suitable for interpretation by the virtual machine.  They are a specialization of CompiledCode.  This requires both bytecode set and compiler support.  The V3 bytecode (EncoderForV3PlusClosures) does not provide support for CompiledBlock.  The SistaV1 set does (EncoderForSistaV1).
+
+ The last literal in a CompiledBlock is reserved for a reference to its enclosing CompiledBlock or CompiledMethod.  Super sends in CompiledBlocks must use the directed super send bytecode.  
+
+ By convention the penultimate literal of a method is either its selector or an instance of AdditionalMethodState.  AdditionalMethodState may be used to add instance variables to a method, albeit ones held in the method's AdditionalMethodState.  Subclasses of CompiledBlock that want to add state should subclass AdditionalMethodState to add the state they want, and implement methodPropertiesClass on the class side of the CompiledBlock subclass to answer the specialized subclass of AdditionalMethodState.  Enterprising programmers are encouraged to try and implement this support automatically through suitable modifications to the compiler and class builder.!

Item was added:
+ ByteArray variableByteSubclass: #CompiledCode
+ instanceVariableNames: ''
+ classVariableNames: 'LargeFrame PrimaryBytecodeSetEncoderClass SecondaryBytecodeSetEncoderClass SmallFrame'
+ poolDictionaries: ''
+ category: 'Kernel-Methods'!
+
+ !CompiledCode commentStamp: 'eem 3/22/2017 12:14' prior: 0!
+ CompiledCode instances are methods suitable for execution by the virtual machine.  Instances of CompiledCode and its subclasses are the only objects in the system that have both indexable pointer fields and indexable 8-bit integer fields.  The first part of a CompiledCode object is pointers, the second part is bytes.  CompiledCode inherits from ByteArray to avoid duplicating some of ByteArray's methods, not because a CompiledCode is-a ByteArray.
+
+ Instance variables: *indexed* (no named inst vars)
+
+ Class variables:
+ SmallFrame - the number of stack slots in a small frame Context
+ LargeFrame - the number of stack slots in a large frame Context
+ PrimaryBytecodeSetEncoderClass - the encoder class that defines the primary instruction set
+ SecondaryBytecodeSetEncoderClass - the encoder class that defines the secondary instruction set
+
+ The current format of a CompiledCode object is as follows:
+
+ header (4 or 8 bytes, SmallInteger)
+ literals (4 or 8 bytes each, Object, see "The last literal..." below)
+ bytecodes  (variable, bytes)
+ trailer (variable, bytes)
+
+ The header is a SmallInteger (which in the 32-bit system has 31 bits, and in the 64-bit system, 61 bits) in the following format:
+
+ (index 0) 15 bits: number of literals (#numLiterals)
+ (index 15)  1 bit: is optimized - reserved for methods that have been optimized by Sista
+ (index 16)  1 bit: has primitive
+ (index 17)  1 bit: whether a large frame size is needed (#frameSize => either SmallFrame or LargeFrame)
+ (index 18)  6 bits: number of temporary variables (#numTemps)
+ (index 24)  4 bits: number of arguments to the method (#numArgs)
+ (index 28)  2 bits: reserved for an access modifier (00-unused, 01-private, 10-protected, 11-public), although accessors for bit 29 exist (see #flag).
+ sign bit:  1 bit: selects the instruction set, >= 0 Primary, < 0 Secondary (#signFlag)
+
+ If the method has a primitive then the first bytecode of the method must be a callPrimitive: bytecode that encodes the primitive index.  This bytecode can encode a primitive index from 0 to 65535.
+
+ The trailer is an encoding of an instance of CompiledMethodTrailer.  It is typically used to encode the index into the source files array of the method's source, but may be used to encode other values, e.g. tempNames, source as a string, etc.  See the class CompiledMethodTrailer.
+
+ While there are disadvantages to this "flat" representation (it is impossible to add named instance variables to CompiledCode or its subclasses, but it is possible indirectly; see AdditionalMethodState) it is effective for interpreters.  It means that both bytecodes and literals can be fetched directly from a single method object, and that only one object, the method, must be saved and restored on activation and return.  A more natural representation, in which there are separate instance variables for the bytecode, and (conveniently) the literals, requires either much more work on activation and return setting up references to the literals and bytecodes, or slower access to bytecodes and literals, indirecting on each access.
+
+ The last literal of a CompiledCode object is reserved for special use by the kernel and/or the virtual machine.  In CompiledMethod instances it must either be the methodClassAssociation, used to implement super sends, or nil, if the method is anonymous. In CompiledBlock it is to be used for a reference to the enclosing method or block object.
+
+ By convention, the penultimate literal is reserved for special use by the kernel.  In CompiledMethod instances it must be either the method selector, or an instance of AdditionalMethodState which holds the selector and any pragmas or properties in the method.  In CompiledBlock it is reserved for use for an AdditionalMethodState.
+
+ Note that super sends in CompiledBlock instances do not use a methodClass association, but expect a directed supersend bytecode, in which the method class (the subclass of the class in which to start the lookup) is a literal.  Logically when we switch to a bytecode set that supports the directed super send bytecode, and discard the old super send bytecodes, we can use the last literal to store the selector or the enclosing method/block or an AdditionalMethodState, and the AdditionalMethodState can hold the selector and/or the enclosing method/block.!

Item was changed:
+ CompiledCode variableByteSubclass: #CompiledMethod
- ByteArray variableByteSubclass: #CompiledMethod
  instanceVariableNames: ''
+ classVariableNames: ''
- classVariableNames: 'LargeFrame PrimaryBytecodeSetEncoderClass SecondaryBytecodeSetEncoderClass SmallFrame'
  poolDictionaries: ''
  category: 'Kernel-Methods'!
 
+ !CompiledMethod commentStamp: 'eem 3/22/2017 13:17' prior: 0!
+ CompiledMethod instances are methods suitable for interpretation by the virtual machine.  They are a specialization of CompiledCode.  They represent methods, and may also, depending on the bytecode set, include nested blocks.  Bytecode sets that support non-nested blocks will use CompiledBlock instances to implement nested block methods that are separate from their enclosing method.  This requires compiler support.
- !CompiledMethod commentStamp: 'eem 1/22/2015 15:47' prior: 0!
- CompiledMethod instances are methods suitable for interpretation by the virtual machine.  Instances of CompiledMethod and its subclasses are the only objects in the system that have both indexable pointer fields and indexable 8-bit integer fields.  The first part of a CompiledMethod is pointers, the second part is bytes.  CompiledMethod inherits from ByteArray to avoid duplicating some of ByteArray's methods, not because a CompiledMethod is-a ByteArray.
 
+ The last literal in a CompiledMethod must be its methodClassAssociation, a binding whose value is the class the method is installed in.  The methodClassAssociation is used to implement super sends.  If a method contains no super send then its methodClassAssociation may be nil (as would be the case for example of methods providing a pool of inst var accessors).  
- Class variables:
- SmallFrame - the number of stack slots in a small frame Context
- LargeFrame - the number of stack slots in a large frame Context
- PrimaryBytecodeSetEncoderClass - the encoder class that defines the primary instruction set
- SecondaryBytecodeSetEncoderClass - the encoder class that defines the secondary instruction set
 
+ By convention the penultimate literal of a method is either its selector or an instance of AdditionalMethodState.  AdditionalMethodState holds the method's selector and any pragmas and properties of the method.  AdditionalMethodState may also be used to add instance variables to a method, albeit ones held in the method's AdditionalMethodState.  Subclasses of CompiledMethod that want to add state should subclass AdditionalMethodState to add the state they want, and implement methodPropertiesClass on the class side of the CompiledMethod subclass to answer the specialized subclass of AdditionalMethodState.  Enterprising programmers are encouraged to try and implement this support automatically through suitable modifications to the compiler and class builder.!
- The current format of a CompiledMethod is as follows:
-
- header (4 or 8 bytes, SmallInteger)
- literals (4 or 8 bytes each, Object, see "The last literal..." below)
- bytecodes  (variable, bytes)
- trailer (variable, bytes)
-
- The header is a SmallInteger (which in the 32-bit system has 31 bits, and in the 64-bit system, 61 bits) in the following format:
-
- (index 0) 15 bits: number of literals (#numLiterals)
- (index 15)  1 bit: is optimized - reserved for methods that have been optimized by Sista
- (index 16)  1 bit: has primitive
- (index 17)  1 bit: whether a large frame size is needed (#frameSize => either SmallFrame or LargeFrame)
- (index 18)  6 bits: number of temporary variables (#numTemps)
- (index 24)  4 bits: number of arguments to the method (#numArgs)
- (index 28)  2 bits: reserved for an access modifier (00-unused, 01-private, 10-protected, 11-public), although accessors for bit 29 exist (see #flag).
- sign bit:  1 bit: selects the instruction set, >= 0 Primary, < 0 Secondary (#signFlag)
-
- If the method has a primitive then the first bytecode of the method must be a callPrimitive: bytecode that encodes the primitive index.
-
- The trailer is an encoding of an instance of CompiledMethodTrailer.  It is typically used to encode the index into the source files array of the method's source, but may be used to encode other values, e.g. tempNames, source as a string, etc.  See the class CompiledMethodTrailer.
-
- The last literal in a CompiledMethod must be its methodClassAssociation, a binding whose value is the class the method is installed in.  The methodClassAssociation is used to implement super sends.  If a method contains no super send then its methodClassAssociation may be nil (as would be the case for example of methods providing a pool of inst var accessors).  By convention the penultimate literal of a method is either its selector or an instance of AdditionalMethodState.  AdditionalMethodState holds any pragmas and properties of a method, but may also be used to add instance variables to a method, albeit ones held in the method's AdditionalMethodState.  Subclasses of CompiledMethod that want to add state should subclass AdditionalMethodState to add the state they want, and implement methodPropertiesClass on the class side of the CompiledMethod subclass to answer the specialized subclass of AdditionalMethodState.!
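
The header layout given in the CompiledCode class comment can be sketched as plain bit arithmetic. This is only an illustration for a workspace: the header value is made up, the variable names are assumptions rather than selectors taken from the image, and the first-bytecode index assumes bytes are counted from 1 across the whole object with a 4-byte word.

```smalltalk
"Workspace sketch: build a made-up header value, then decode its fields.
 All names and sample values here are illustrative assumptions."
| header numLiterals isOptimized hasPrimitive needsLargeFrame numTemps numArgs
  usesSecondarySet wordSize firstBytecodeIndex |
header := 7 + (1 bitShift: 16) + (5 bitShift: 18) + (2 bitShift: 24).
	"7 literals, has-primitive bit set, 5 temporaries, 2 arguments"

numLiterals := header bitAnd: 16r7FFF.                       "index 0, 15 bits"
isOptimized := ((header bitShift: -15) bitAnd: 1) = 1.       "index 15, 1 bit"
hasPrimitive := ((header bitShift: -16) bitAnd: 1) = 1.      "index 16, 1 bit"
needsLargeFrame := ((header bitShift: -17) bitAnd: 1) = 1.   "index 17, 1 bit"
numTemps := (header bitShift: -18) bitAnd: 16r3F.            "index 18, 6 bits"
numArgs := (header bitShift: -24) bitAnd: 16rF.              "index 24, 4 bits"
usesSecondarySet := header < 0.                              "sign bit"

"The header and the literals occupy (numLiterals + 1) words; bytecodes follow."
wordSize := 4.	"assumed 32-bit image; 8 in a 64-bit image"
firstBytecodeIndex := (numLiterals + 1) * wordSize + 1.

{numLiterals. isOptimized. hasPrimitive. needsLargeFrame.
 numTemps. numArgs. usesSecondarySet. firstBytecodeIndex}
```

For this sample header the fields decode to 7 literals, 5 temporaries and 2 arguments, with only the has-primitive flag set, and the first bytecode lands at byte index 33.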



Re: The Trunk: Kernel-eem.1067.mcz

David T. Lewis
On Wed, Mar 22, 2017 at 08:21:06PM +0000, [hidden email] wrote:

> Eliot Miranda uploaded a new version of Kernel to project The Trunk:
> http://source.squeak.org/trunk/Kernel-eem.1067.mcz
>
> ==================== Summary ====================
>
> Name: Kernel-eem.1067
> Author: eem
> Time: 22 March 2017, 1:20:50.066536 pm
> UUID: 0073e975-c0c9-4923-9702-88e51890ae33
> Ancestors: Kernel-eem.1066
>
> Add CompiledCode and CompiledBlock to the package explicitly, now that the script has created them. Provide the class comments.
>

To confirm - trunk updates are going smoothly :-)

One temporary glitch is that the update to Kernel-eem.1067 is generating a merge
request dialog. I'm not certain, but I think that this would go away with an
additional update map to force update from Kernel-eem.1065 to Kernel-eem.1067.

Dave



Re: The Trunk: Kernel-eem.1067.mcz

Eliot Miranda-2


On Wed, Mar 22, 2017 at 2:17 PM, David T. Lewis <[hidden email]> wrote:
On Wed, Mar 22, 2017 at 08:21:06PM +0000, [hidden email] wrote:
> Eliot Miranda uploaded a new version of Kernel to project The Trunk:
> http://source.squeak.org/trunk/Kernel-eem.1067.mcz
>
> ==================== Summary ====================
>
> Name: Kernel-eem.1067
> Author: eem
> Time: 22 March 2017, 1:20:50.066536 pm
> UUID: 0073e975-c0c9-4923-9702-88e51890ae33
> Ancestors: Kernel-eem.1066
>
> Add CompiledCode and CompiledBlock to the package explicitly, now that the script has created them. Provide the class comments.
>

To confirm - trunk updates are going smoothly :-)

One temporary glitch is that the update to Kernel-eem.1067 is generating a merge
request dialog. I'm not certain, but I think that this would go away with an
additional update map to force update from Kernel-eem.1065 to Kernel-eem.1067.

Damn, and I thought I'd figured out a way to get away with one update map :-).  So how do I generate the intermediate map and what version number should it receive?
 
Dave
_,,,^..^,,,_
best, Eliot



Re: The Trunk: Kernel-eem.1067.mcz

David T. Lewis
On Wed, Mar 22, 2017 at 02:39:41PM -0700, Eliot Miranda wrote:

> On Wed, Mar 22, 2017 at 2:17 PM, David T. Lewis <[hidden email]> wrote:
>
> > On Wed, Mar 22, 2017 at 08:21:06PM +0000, [hidden email] wrote:
> > > Eliot Miranda uploaded a new version of Kernel to project The Trunk:
> > > http://source.squeak.org/trunk/Kernel-eem.1067.mcz
> > >
> > > ==================== Summary ====================
> > >
> > > Name: Kernel-eem.1067
> > > Author: eem
> > > Time: 22 March 2017, 1:20:50.066536 pm
> > > UUID: 0073e975-c0c9-4923-9702-88e51890ae33
> > > Ancestors: Kernel-eem.1066
> > >
> > > Add CompiledCode and CompiledBlock to the package explicitly, now that the script has created them. Provide the class comments.
> > >
> >
> > To confirm - trunk updates are going smoothly :-)
> >
> > One temporary glitch is that the update to Kernel-eem.1067 is generating a merge
> > request dialog. I'm not certain, but I think that this would go away with an
> > additional update map to force update from Kernel-eem.1065 to Kernel-eem.1067.
> >
>
> Damn, and I thought I'd figured out a way to get away with one update map
> :-).  So how do I generate the intermediate map and what version number
> should it receive?
>

I'm not sure I have this right, so hopefully someone else can confirm or
correct me. But I think this is what is happening:

- The update-eem.400 update map loads Kernel-eem.1065, which is the version
immediately before the class hierarchy changes.

- Kernel-eem.1066 (and any later package) has the package prefix that changes
the class hierarchy. As of 1066, the changes have been made in the preamble
but not yet committed, so the package is dirty after loading 1066.

- Kernel-eem.1067 commits the actual changes. The package is dirty at this
point, hence the merge dialog.

Kernel-eem.1067 contains the prefix that makes the changes, and also the
actual changes. So I am guessing that if we add update-xxx.401 and have it
point directly to Kernel-eem.1067, then this would probably avoid the
merge dialog. But I have not tried it so I am not sure.

Dave



Re: The Trunk: Kernel-eem.1067.mcz

Bert Freudenberg
On Wed, Mar 22, 2017 at 11:08 PM, David T. Lewis <[hidden email]> wrote:
On Wed, Mar 22, 2017 at 02:39:41PM -0700, Eliot Miranda wrote:
> On Wed, Mar 22, 2017 at 2:17 PM, David T. Lewis <[hidden email]> wrote:
>
> > One temporary glitch is that the update to Kernel-eem.1067 is generating a merge
> > request dialog. I'm not certain, but I think that this would go away with an
> > additional update map to force update from Kernel-eem.1065 to Kernel-eem.1067.
> >
>
> Damn, and I thought I'd figured out a way to get away with one update map
> :-).  So how do I generate the intermediate map and what version number
> should it receive?
>

I'm not sure I have this right, so hopefully someone else can confirm or
correct me. But I think this is what is happening:

- The update-eem.400 update map loads Kernel-eem.1065, which is the version
immediately before the class hierarchy changes.

- Kernel-eem.1066 (and any later package) has the package prefix that changes
the class hierarchy. As of 1066, the changes have been made in the preamble
but not yet committed, so the package is dirty after loading 1066.

- Kernel-eem.1067 commits the actual changes. The package is dirty at this
point, hence the merge dialog.

Kernel-eem.1067 contains the prefix that makes the changes, and also the
actual changes. So I am guessing that if we add update-xxx.401 and have it
point directly to Kernel-eem.1067, then this would probably avoid the
merge dialog. But I have not tried it so I am not sure.

Dave

I did not get a merge dialog, so maybe all is fine?

Note that if Eliot had not put the migration code in a method, we wouldn't even have needed an update map. With Tim Felgentreff I did this before (but forgot to put in the inbox, sorry) and we only had a preamble like this in the Kernel package:

---------- package preamble ----------
"CompiledMethod is too dangerous to change, Monticello won't do it for us"
(Smalltalk classNamed: 'CompiledCode') ifNil: [
    "Create CompiledCode with the same instance format as CompiledMethod."
    (ByteArray variableByteSubclass: #CompiledCode
        instanceVariableNames: ''
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Kernel-Methods') setFormat: CompiledMethod format.
    "Re-parent CompiledMethod and its metaclass under CompiledCode."
    CompiledMethod superclass: (Smalltalk at: #CompiledCode).
    CompiledMethod class superclass: (Smalltalk at: #CompiledCode) class.
    "Move the class variables up to CompiledCode."
    (Smalltalk at: #CompiledCode) classPool: CompiledMethod classPool.
    CompiledMethod classPool: Dictionary new].

This way there's no need to have separate package versions adding the script, introducing the preamble, loading the actual code, and removing the script; it's just a single version.

(just mentioning this for future migrations, what Eliot committed works fine)

- Bert -




Re: The Trunk: Kernel-eem.1067.mcz

Eliot Miranda-2
Hi Bert,

On Thu, Mar 23, 2017 at 5:58 AM, Bert Freudenberg <[hidden email]> wrote:
On Wed, Mar 22, 2017 at 11:08 PM, David T. Lewis <[hidden email]> wrote:
On Wed, Mar 22, 2017 at 02:39:41PM -0700, Eliot Miranda wrote:
> On Wed, Mar 22, 2017 at 2:17 PM, David T. Lewis <[hidden email]> wrote:
>
> > One temporary glitch is that the update to Kernel-eem.1067 is generating a merge
> > request dialog. I'm not certain, but I think that this would go away with an
> > additional update map to force update from Kernel-eem.1065 to Kernel-eem.1067.
> >
>
> Damn, and I thought I'd figured out a way to get away with one update map
> :-).  So how do I generate the intermediate map and what version number
> should it receive?
>

I'm not sure I have this right, so hopefully someone else can confirm or
correct me. But I think this is what is happening:

- The update-eem.400 update map loads Kernel-eem.1065, which is the version
immediately before the class hierarchy changes.

- Kernel-eem.1066 (and any later package) has the package prefix that changes
the class hierarchy. As of 1066, the changes have been made in the preamble
but not yet committed, so the package is dirty after loading 1066.

- Kernel-eem.1067 commits the actual changes. The package is dirty at this
point, hence the merge dialog.

Kernel-eem.1067 contains the prefix that makes the changes, and also the
actual changes. So I am guessing that if we add update-xxx.401 and have it
point directly to Kernel-eem.1067, then this would probably avoid the
merge dialog. But I have not tried it so I am not sure.

Dave

I did not get a merge dialog, so maybe all is fine?

Note that if Eliot had not put the migration code in a method, we wouldn't even have needed an update map. With Tim Felgentreff I did this before (but forgot to put in the inbox, sorry) and we only had a preamble like this in the Kernel package:

I don't like having complex package scripts, at least not unless senders works with them.  So I consciously chose to use the method.  One of the things that's a little weak with package scripts is knowing how long they should stay there.  The style that I used in my method
- check if the transformation has been applied and if so exit.
- apply the transformation
- check that the transformation succeeded
allows for the preamble script to invoke the method as many times as the package is loaded until the preamble script is modified
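
The three steps above can be sketched in a workspace. This is not the method that was committed: the Dictionary stands in for image state, #hierarchyMigrated is a made-up marker, and nothing here touches a real class. It only illustrates why calling such a method on every package load is harmless.

```smalltalk
"Workspace sketch of the check / apply / verify style (all names hypothetical)."
| state applyCount migrate |
state := Dictionary new.
applyCount := 0.
migrate := [
	"1. Check whether the transformation has been applied; if so, do nothing."
	(state at: #hierarchyMigrated ifAbsent: [false]) ifFalse: [
		"2. Apply the transformation."
		state at: #hierarchyMigrated put: true.
		applyCount := applyCount + 1.
		"3. Check that the transformation succeeded."
		(state at: #hierarchyMigrated ifAbsent: [false])
			ifFalse: [self error: 'migration did not take effect']]].

"Simulate two package loads: only the first call does any work."
migrate value.
migrate value.
applyCount
```

After both calls applyCount is 1, so the transformation ran exactly once even though the guard was invoked twice.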

It would be nice to have some ability to look at versions of package scripts and match the changes against package versions.  I'm not sure I see a way of determining easily
- if a particular version of a package script was actually applied to an image
- when a particular version of a package script was actually applied to an image
The same criticism can be levelled at class initialisers, but their being more accessible to the standard tools makes exploring the questions easier IMO.


---------- package preamble ----------
"CompiledMethod is too dangerous to change, Monticello won't do it for us"
(Smalltalk classNamed: 'CompiledCode') ifNil: [
    "Create CompiledCode with the same instance format as CompiledMethod."
    (ByteArray variableByteSubclass: #CompiledCode
        instanceVariableNames: ''
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Kernel-Methods') setFormat: CompiledMethod format.
    "Re-parent CompiledMethod and its metaclass under CompiledCode."
    CompiledMethod superclass: (Smalltalk at: #CompiledCode).
    CompiledMethod class superclass: (Smalltalk at: #CompiledCode) class.
    "Move the class variables up to CompiledCode."
    (Smalltalk at: #CompiledCode) classPool: CompiledMethod classPool.
    CompiledMethod classPool: Dictionary new].

This way there's no need to have separate package versions adding the script, introducing the preamble, loading the actual code, and removing the script; it's just a single version.

(just mentioning this for future migrations, what Eliot committed works fine)

You could have saved me some work ;-)
 

- Bert -







--
_,,,^..^,,,_
best, Eliot



Re: The Trunk: Kernel-eem.1067.mcz

Bert Freudenberg
On Sat, Mar 25, 2017 at 3:30 AM, Eliot Miranda <[hidden email]> wrote:
Hi Bert,

I don't like having complex package scripts, at least not unless senders works with them.  So I consciously chose to use the method.  One of the things that's a little weak with package scripts is knowing how long they should stay there.  The style that I used in my method
- check if the transformation has been applied and if so exit.
- apply the transformation
- check that the transformation succeeded
allows for the preamble script to invoke the method as many times as the package is loaded until the preamble script is modified

Except it will not be invoked again if the package load itself succeeded, until it is modified. We don't have a special mechanism for that, but I guess the script could raise an exception which would abort the loading. This might be okayish in a preamble script since the rest of the package has not been added to the system yet.

It would be nice to have some ability to look at versions of package scripts and match the changes against package versions.  I'm not sure I see a way of determining easily
- if a particular version of a package script was actually applied to an image
- when a particular version of a package script was actually applied to an image

I guess one could play around with concrete PackageInfo subclasses, but yes, we don't have support for that currently.
 
The same criticism can be levelled at class initialisers, but their being more accessible to the standard tools makes exploring the questions easier IMO.

I generally prefer class initializers too, but they're executed after the package loaded, not as a preamble.

- Bert -