[Draft Article] Optimizing Contexts

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

[Draft Article] Optimizing Contexts

Tapple Gao
Cross-posted to:
    [hidden email]
    [hidden email]
        [hidden email]
    [hidden email]
    comp.lang.smalltalk

Hello lists. This is a draft of an article I am writing for an upcoming
series at the Weekly Squeak. It is about optimizing context creation, I
would like comments on its technical correctness, as well as suggestions
for related content I should include. I would especially like to know if
this article covers the way this is handled in systems I did not
specifically research, such as Strongtalk, and Exupery, and
non-smalltalk systems like Common Lisp, Erlang, Haskell, and OCaml,
since this topic is not really specific to object-oriented systems.

This will not be the first article of the series; however, the first two
paragraphs are the preliminary series introduction. I put them in so
that you can get an idea of what I am writing. Sections enclosed in <i>
... </i> tags are parts I am especially unsure of; I would appreciate
comments (I used i tags, because my editor, vim, highlights the enclosed
text when in HTML mode).

Look forward to this appearing in the Weekly Squeak:

http://news.squeak.org/

Thank you for your help, and enjoy the article!


=== Speeding Up Smalltalk Series, part 42: Context Optimizations ===

Ever since object-oriented systems became popular, they have had the
image of being very easy to program, but very slow in terms of execution
speed. This image and reality has plagued every popular object-oriented
system, starting with Smalltalk in the 80's and continuing on to Java
and Ruby today. However, this does not have to be the case. Early-bound
languages have gotten much faster thanks to lots of research on
compilers and computer architecture. Similarly, much research has been
done on making late-bound object-oriented systems fast, both with
software and with hardware. This research has not seen as much exposure
as processor research, and so is probably less familiar to you. This
series, "Speeding Up Smalltalk" attempts to summarize a lot of that
research and present it for your enjoyment.

Computation in Smalltalk and Self is done by sending messages to
objects. This seemingly simple operation of sending a message is done so
often and is so flexible that it is <b> always </b> the execution
bottleneck in any implementation of Smalltalk or similar object-oriented
system. Why is that?

---- Description of Context Objects ----

Every time a message is sent, four things happen:
- A search is done through the message receiver for a method that has
  been sent. This was the topic of the previous article, and will not be
  discussed here.
- A context object is allocated on the heap.
- The context object is initialized and filled with:
  - local objects, including:
    - pointers to the message arguments
    - pointers for each local variable
    - pointers for temporary storage needed for intermediate results
  - A way to reference variables defined in other scopes. This is a
    property blocks have, but methods don't
    - In Smalltalk, by a pointer to the "home context"
    - In Self, by delegate slots to the enclosing context(s)
  - A return address at which to resume execution
    - Squeak does not need this, since every context has it's own PC, so
      when returning, the PC of the returned-to context is already in
      the right place. This is equivalent to a return address, since a
      child's return address is the parents PC.
  - A return context that will become current upon return
- The newly-created context is made the "current" context for this
  thread/process

In Squeak, MethodContexts and BlockContexts are subclasses of
ContextPart. ContextPart has three instance variables:
- sender: The context that sent the message being executed. This context
  will become current when the current context returns
- pc: A pointer to the currently executing instruction
- stackp: The pointer to the execution stack. The smalltalk virtual
  machine is a stack processor. local variables are stored in the
  indexed fields of the context. stackp keeps track of which OOP is at
  the top of the stack.  For MethodContexts:

MethodContext extends it with 3 instance variables, and a variable-sized
area for holding the temporaries:
- method: the CompiledMethod holding the code being executed
- receiverMap: <i> Something to do with the names of the variables.  Not
  really sure </i>
- receiver: a pointer to "self", the receiver of the message.

---- Squeak optimization: Static Contexts ----

In Squeak, BlockContexts are partial closures (sometimes called static
blocks/closures [3]). Rather than hold their own code and environment,
BlockContexts piggyback off of another context/code pair. Home is the
environment that the block piggybacks off of. This usually works, but
there are some cases where this scheme will not work:
- <i> like what? Why not? I don't understand why this doesn't always
  work. Something to do with re-entrancy or circular references. Not
  sure why either is a problem. </i>

For BlockContexts:
- nargs: the number of arguments that the block must be called with.
- startpc: The index into <code>self home method</code> where the code
  for the block starts.
- home: the home environment

Squeak's NewCompiler has a different representation of blocks. These are
full closures, storing their own code and environment. They are
sometimes called dynamic closures to distinguish them from the above
"static closures" [3]. They contain:
- method: The code of the block (CompiledMethod2)
- environment: Captured temps   (ClosureEnvironment)

---- Squeak optimization: Context Recycling ----

To avoid having to create a new MethodContext on every message send,
Squeak keeps a pool of pre-allocated MethodContexts, and simply fills
one up when a new one is needed. To support this optimization,
MethodContexts are only created with two sizes, rather than any size
that they could support. The small size has room for <i> a few </i>
temporaries, and the large size has room for <i> a bunch </i> [4]

---- VisualWorks optimization: Context Caches ----

The most natural way to store contexts is as a linked list of objects,
but that does not map well to the hardware of many processors. Many
processors (including all the Intel varieties) have <i> several </i>
built-in instructions to mechanize the creation of stack frames, which
are similar to context objects, but have several important differences.
VisualWorks uses these native stack frames for most operations, meaning
that context creation is identical to a C function send in many cases.
Thus, the hardware stack acts as a stand-in for the context objects;
this optimization is called a context cache.

However, stack frames cannot handle context responsibilities beyond a
single call and return; when more is needed, a context object must be
allocated on the heap. The context object can either be a context in
it's own right (a stable context), or act as a proxy to the hardware
stack frame (hybrid context). There are several complications to this
dual scheme; curious readers should refer to [5].

VW and Squeak does not use a stack of context frames; instead it uses a
linked list of real context objects. This is easier, but perhaps slower.
To save on allocation costs, contexts are cached and recycled for
various method invocations.
 
<i> Not sure where to incorporate this </i>

There are three types of blocks:
- open/clean blocks (pure function with no closure)
- copying blocks
- full blocks

---- Or you can cheat... ----

The most popular object-oriented languages (C++, Java) chose to dump the
flexibility of blocks in exchange for runtime speed. These languages use
a native stack frame without providing the escape mechanism of a context
cache, so that blocks and other context behavior that does not follow
stack principles is not supported. This is a tradeoff that eliminates a
good deal of flexibility, and so will not be explored further.

---- Final words ----

Inlining solves many of the same problems as a Context cache does. If
the method body is inlined into the caller, no new context need be
created. However, inlining solves many more problems than just context
elimination; I will dedicate an entire article to it.

---- References ----

- context recycling may be discussed in tims nublue book:
  http://www.rowledge.org/tim/squeak/OE-Tour.html

- context caches are discussed in section 5.2.3 of
  http://www.kampjes.demon.co.uk/articles/exuperyDesign.pdf

[3] <i> add reference to squeak mailing list or Ian's something to
discuss the difference between static and dynamic closures </i>

[4] Squeak Class comments of <code>InstructionStream</code>,
<code>ContextPart</code>, <code> MethodContext</code>. and
<code>BlockContext</code>.

[5]: Context Management in VisualWorks 5i:
     http://wiki.cs.uiuc.edu/VisualWorks/DOWNLOAD/papers/oopsla99-contexts.pdf

--
Matthew Fulmer -- http://mtfulmer.wordpress.com/
Help improve Squeak Documentation: http://wiki.squeak.org/squeak/808
_______________________________________________
Exupery mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery
Reply | Threaded
Open this post in threaded view
|

[Draft Article] Optimizing Contexts

Bryce Kampjes

Hi, I've answered your questions below. If I've missed anything please
ask here. Overall it looks good to me.

Matthew Fulmer writes:
 > Cross-posted to:
 >     [hidden email]
 >     [hidden email]
 > [hidden email]
 >     [hidden email]
 >     comp.lang.smalltalk
 >
 > Hello lists. This is a draft of an article I am writing for an upcoming
 > series at the Weekly Squeak. It is about optimizing context creation, I
 > would like comments on its technical correctness, as well as suggestions
 > for related content I should include. I would especially like to know if
 > this article covers the way this is handled in systems I did not
 > specifically research, such as Strongtalk, and Exupery, and
 > non-smalltalk systems like Common Lisp, Erlang, Haskell, and OCaml,
 > since this topic is not really specific to object-oriented systems.
 >
 > This will not be the first article of the series; however, the first two
 > paragraphs are the preliminary series introduction. I put them in so
 > that you can get an idea of what I am writing. Sections enclosed in <i>
 > ... </i> tags are parts I am especially unsure of; I would appreciate
 > comments (I used i tags, because my editor, vim, highlights the enclosed
 > text when in HTML mode).
 >
 > Look forward to this appearing in the Weekly Squeak:
 >
 > http://news.squeak.org/
 >
 > Thank you for your help, and enjoy the article!
 >
 >
 > === Speeding Up Smalltalk Series, part 42: Context Optimizations ===
 >
 > Ever since object-oriented systems became popular, they have had the
 > image of being very easy to program, but very slow in terms of execution
 > speed. This image and reality has plagued every popular object-oriented
 > system, starting with Smalltalk in the 80's and continuing on to Java
 > and Ruby today. However, this does not have to be the case. Early-bound
 > languages have gotten much faster thanks to lots of research on
 > compilers and computer architecture. Similarly, much research has been
 > done on making late-bound object-oriented systems fast, both with
 > software and with hardware. This research has not seen as much exposure
 > as processor research, and so is probably less familiar to you. This
 > series, "Speeding Up Smalltalk" attempts to summarize a lot of that
 > research and present it for your enjoyment.
 >
 > Computation in Smalltalk and Self is done by sending messages to
 > objects. This seemingly simple operation of sending a message is done so
 > often and is so flexible that it is <b> always </b> the execution
 > bottleneck in any implementation of Smalltalk or similar object-oriented
 > system. Why is that?
 >
 > ---- Description of Context Objects ----
 >
 > Every time a message is sent, four things happen:
 > - A search is done through the message receiver for a method that has
 >   been sent. This was the topic of the previous article, and will not be
 >   discussed here.
 > - A context object is allocated on the heap.
 > - The context object is initialized and filled with:
 >   - local objects, including:
 >     - pointers to the message arguments
 >     - pointers for each local variable
 >     - pointers for temporary storage needed for intermediate results
 >   - A way to reference variables defined in other scopes. This is a
 >     property blocks have, but methods don't
 >     - In Smalltalk, by a pointer to the "home context"
 >     - In Self, by delegate slots to the enclosing context(s)
 >   - A return address at which to resume execution
 >     - Squeak does not need this, since every context has it's own PC, so
 >       when returning, the PC of the returned-to context is already in
 >       the right place. This is equivalent to a return address, since a
 >       child's return address is the parents PC.
 >   - A return context that will become current upon return
 > - The newly-created context is made the "current" context for this
 >   thread/process
 >
 > In Squeak, MethodContexts and BlockContexts are subclasses of
 > ContextPart. ContextPart has three instance variables:
 > - sender: The context that sent the message being executed. This context
 >   will become current when the current context returns
 > - pc: A pointer to the currently executing instruction
 > - stackp: The pointer to the execution stack. The smalltalk virtual
 >   machine is a stack processor. local variables are stored in the
 >   indexed fields of the context. stackp keeps track of which OOP is at
 >   the top of the stack.  For MethodContexts:
 >
 > MethodContext extends it with 3 instance variables, and a variable-sized
 > area for holding the temporaries:
 > - method: the CompiledMethod holding the code being executed
 > - receiverMap: <i> Something to do with the names of the variables.  Not
 >   really sure </i>
 > - receiver: a pointer to "self", the receiver of the message.

receiverMap isn't used in stock Squeak. It's the unused slot. Exupery
originally used it to hold the compiled program counter. This was neat
as contexts could be either interpreted or compiled. Exupery no longer
does this as the slot is needed for BlockContexts. Exupery still uses
the variable as a flag to figure out whether to run interpreted or
compiled code. Exupery will stop using the variable soon to be
compatable with the closure compiler.

The closure compiler uses the slot to hold environments which contain
variables that are shared by multiple contexts.

 > ---- Squeak optimization: Static Contexts ----
 >
 > In Squeak, BlockContexts are partial closures (sometimes called static
 > blocks/closures [3]). Rather than hold their own code and environment,
 > BlockContexts piggyback off of another context/code pair. Home is the
 > environment that the block piggybacks off of. This usually works, but
 > there are some cases where this scheme will not work:
 > - <i> like what? Why not? I don't understand why this doesn't always
 >   work. Something to do with re-entrancy or circular references. Not
 >   sure why either is a problem. </i>

A block variable only exists once for each method not for each block.

So
   aCollection do:
      [:each| self registerCallBackBlock: [each doSomething]]

Would register a call call back block for each element of aCollection
except they would all use the same variable for each so they would all
do the call back action for the last block.

The problem with recursion is the same except there's code to stop a
block from being entered twice so this at least produces an error
instead of doing the wrong thing silently.

 > For BlockContexts:
 > - nargs: the number of arguments that the block must be called with.
 > - startpc: The index into <code>self home method</code> where the code
 >   for the block starts.
 > - home: the home environment
 >
 > Squeak's NewCompiler has a different representation of blocks. These are
 > full closures, storing their own code and environment. They are
 > sometimes called dynamic closures to distinguish them from the above
 > "static closures" [3]. They contain:
 > - method: The code of the block (CompiledMethod2)
 > - environment: Captured temps   (ClosureEnvironment)
 >
 > ---- Squeak optimization: Context Recycling ----
 >
 > To avoid having to create a new MethodContext on every message send,
 > Squeak keeps a pool of pre-allocated MethodContexts, and simply fills
 > one up when a new one is needed. To support this optimization,
 > MethodContexts are only created with two sizes, rather than any size
 > that they could support. The small size has room for <i> a few </i>
 > temporaries, and the large size has room for <i> a bunch </i> [4]
 >
 > ---- VisualWorks optimization: Context Caches ----
 >
 > The most natural way to store contexts is as a linked list of objects,
 > but that does not map well to the hardware of many processors. Many
 > processors (including all the Intel varieties) have <i> several </i>
 > built-in instructions to mechanize the creation of stack frames, which
 > are similar to context objects, but have several important differences.
 > VisualWorks uses these native stack frames for most operations, meaning
 > that context creation is identical to a C function send in many cases.
 > Thus, the hardware stack acts as a stand-in for the context objects;
 > this optimization is called a context cache.
 >
 > However, stack frames cannot handle context responsibilities beyond a
 > single call and return; when more is needed, a context object must be
 > allocated on the heap. The context object can either be a context in
 > it's own right (a stable context), or act as a proxy to the hardware
 > stack frame (hybrid context). There are several complications to this
 > dual scheme; curious readers should refer to [5].
 >
 > VW and Squeak does not use a stack of context frames; instead it uses a
 > linked list of real context objects. This is easier, but perhaps slower.
 > To save on allocation costs, contexts are cached and recycled for
 > various method invocations.
 >  
 > <i> Not sure where to incorporate this </i>
 >
 > There are three types of blocks:
 > - open/clean blocks (pure function with no closure)
 > - copying blocks
 > - full blocks
 >
 > ---- Or you can cheat... ----
 >
 > The most popular object-oriented languages (C++, Java) chose to dump the
 > flexibility of blocks in exchange for runtime speed. These languages use
 > a native stack frame without providing the escape mechanism of a context
 > cache, so that blocks and other context behavior that does not follow
 > stack principles is not supported. This is a tradeoff that eliminates a
 > good deal of flexibility, and so will not be explored further.
 >
 > ---- Final words ----
 >
 > Inlining solves many of the same problems as a Context cache does. If
 > the method body is inlined into the caller, no new context need be
 > created. However, inlining solves many more problems than just context
 > elimination; I will dedicate an entire article to it.
 >
 > ---- References ----
 >
 > - context recycling may be discussed in tims nublue book:
 >   http://www.rowledge.org/tim/squeak/OE-Tour.html
 >
 > - context caches are discussed in section 5.2.3 of
 >   http://www.kampjes.demon.co.uk/articles/exuperyDesign.pdf
 >
 > [3] <i> add reference to squeak mailing list or Ian's something to
 > discuss the difference between static and dynamic closures </i>
 >
 > [4] Squeak Class comments of <code>InstructionStream</code>,
 > <code>ContextPart</code>, <code> MethodContext</code>. and
 > <code>BlockContext</code>.
 >
 > [5]: Context Management in VisualWorks 5i:
 >      http://wiki.cs.uiuc.edu/VisualWorks/DOWNLOAD/papers/oopsla99-contexts.pdf
 >
 > --
 > Matthew Fulmer -- http://mtfulmer.wordpress.com/
 > Help improve Squeak Documentation: http://wiki.squeak.org/squeak/808
 > _______________________________________________
 > Exupery mailing list
 > [hidden email]
 > http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery
_______________________________________________
Exupery mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery