Cross-posted to:
[hidden email] [hidden email] [hidden email] [hidden email] comp.lang.smalltalk Hello lists. This is a draft of an article I am writing for an upcoming series at the Weekly Squeak. It is about optimizing context creation, I would like comments on its technical correctness, as well as suggestions for related content I should include. I would especially like to know if this article covers the way this is handled in systems I did not specifically research, such as Strongtalk, and Exupery, and non-smalltalk systems like Common Lisp, Erlang, Haskell, and OCaml, since this topic is not really specific to object-oriented systems. This will not be the first article of the series; however, the first two paragraphs are the preliminary series introduction. I put them in so that you can get an idea of what I am writing. Sections enclosed in <i> ... </i> tags are parts I am especially unsure of; I would appreciate comments (I used i tags, because my editor, vim, highlights the enclosed text when in HTML mode). Look forward to this appearing in the Weekly Squeak: http://news.squeak.org/ Thank you for your help, and enjoy the article! === Speeding Up Smalltalk Series, part 42: Context Optimizations === Ever since object-oriented systems became popular, they have had the image of being very easy to program, but very slow in terms of execution speed. This image and reality has plagued every popular object-oriented system, starting with Smalltalk in the 80's and continuing on to Java and Ruby today. However, this does not have to be the case. Early-bound languages have gotten much faster thanks to lots of research on compilers and computer architecture. Similarly, much research has been done on making late-bound object-oriented systems fast, both with software and with hardware. This research has not seen as much exposure as processor research, and so is probably less familiar to you. This series, "Speeding Up Smalltalk" attempts to summarize a lot of that research and present it for your enjoyment. Computation in Smalltalk and Self is done by sending messages to objects. This seemingly simple operation of sending a message is done so often and is so flexible that it is <b> always </b> the execution bottleneck in any implementation of Smalltalk or similar object-oriented system. Why is that? ---- Description of Context Objects ---- Every time a message is sent, four things happen: - A search is done through the message receiver for a method that has been sent. This was the topic of the previous article, and will not be discussed here. - A context object is allocated on the heap. - The context object is initialized and filled with: - local objects, including: - pointers to the message arguments - pointers for each local variable - pointers for temporary storage needed for intermediate results - A way to reference variables defined in other scopes. This is a property blocks have, but methods don't - In Smalltalk, by a pointer to the "home context" - In Self, by delegate slots to the enclosing context(s) - A return address at which to resume execution - Squeak does not need this, since every context has it's own PC, so when returning, the PC of the returned-to context is already in the right place. This is equivalent to a return address, since a child's return address is the parents PC. - A return context that will become current upon return - The newly-created context is made the "current" context for this thread/process In Squeak, MethodContexts and BlockContexts are subclasses of ContextPart. ContextPart has three instance variables: - sender: The context that sent the message being executed. This context will become current when the current context returns - pc: A pointer to the currently executing instruction - stackp: The pointer to the execution stack. The smalltalk virtual machine is a stack processor. local variables are stored in the indexed fields of the context. stackp keeps track of which OOP is at the top of the stack. For MethodContexts: MethodContext extends it with 3 instance variables, and a variable-sized area for holding the temporaries: - method: the CompiledMethod holding the code being executed - receiverMap: <i> Something to do with the names of the variables. Not really sure </i> - receiver: a pointer to "self", the receiver of the message. ---- Squeak optimization: Static Contexts ---- In Squeak, BlockContexts are partial closures (sometimes called static blocks/closures [3]). Rather than hold their own code and environment, BlockContexts piggyback off of another context/code pair. Home is the environment that the block piggybacks off of. This usually works, but there are some cases where this scheme will not work: - <i> like what? Why not? I don't understand why this doesn't always work. Something to do with re-entrancy or circular references. Not sure why either is a problem. </i> For BlockContexts: - nargs: the number of arguments that the block must be called with. - startpc: The index into <code>self home method</code> where the code for the block starts. - home: the home environment Squeak's NewCompiler has a different representation of blocks. These are full closures, storing their own code and environment. They are sometimes called dynamic closures to distinguish them from the above "static closures" [3]. They contain: - method: The code of the block (CompiledMethod2) - environment: Captured temps (ClosureEnvironment) ---- Squeak optimization: Context Recycling ---- To avoid having to create a new MethodContext on every message send, Squeak keeps a pool of pre-allocated MethodContexts, and simply fills one up when a new one is needed. To support this optimization, MethodContexts are only created with two sizes, rather than any size that they could support. The small size has room for <i> a few </i> temporaries, and the large size has room for <i> a bunch </i> [4] ---- VisualWorks optimization: Context Caches ---- The most natural way to store contexts is as a linked list of objects, but that does not map well to the hardware of many processors. Many processors (including all the Intel varieties) have <i> several </i> built-in instructions to mechanize the creation of stack frames, which are similar to context objects, but have several important differences. VisualWorks uses these native stack frames for most operations, meaning that context creation is identical to a C function send in many cases. Thus, the hardware stack acts as a stand-in for the context objects; this optimization is called a context cache. However, stack frames cannot handle context responsibilities beyond a single call and return; when more is needed, a context object must be allocated on the heap. The context object can either be a context in it's own right (a stable context), or act as a proxy to the hardware stack frame (hybrid context). There are several complications to this dual scheme; curious readers should refer to [5]. VW and Squeak does not use a stack of context frames; instead it uses a linked list of real context objects. This is easier, but perhaps slower. To save on allocation costs, contexts are cached and recycled for various method invocations. <i> Not sure where to incorporate this </i> There are three types of blocks: - open/clean blocks (pure function with no closure) - copying blocks - full blocks ---- Or you can cheat... ---- The most popular object-oriented languages (C++, Java) chose to dump the flexibility of blocks in exchange for runtime speed. These languages use a native stack frame without providing the escape mechanism of a context cache, so that blocks and other context behavior that does not follow stack principles is not supported. This is a tradeoff that eliminates a good deal of flexibility, and so will not be explored further. ---- Final words ---- Inlining solves many of the same problems as a Context cache does. If the method body is inlined into the caller, no new context need be created. However, inlining solves many more problems than just context elimination; I will dedicate an entire article to it. ---- References ---- - context recycling may be discussed in tims nublue book: http://www.rowledge.org/tim/squeak/OE-Tour.html - context caches are discussed in section 5.2.3 of http://www.kampjes.demon.co.uk/articles/exuperyDesign.pdf [3] <i> add reference to squeak mailing list or Ian's something to discuss the difference between static and dynamic closures </i> [4] Squeak Class comments of <code>InstructionStream</code>, <code>ContextPart</code>, <code> MethodContext</code>. and <code>BlockContext</code>. [5]: Context Management in VisualWorks 5i: http://wiki.cs.uiuc.edu/VisualWorks/DOWNLOAD/papers/oopsla99-contexts.pdf -- Matthew Fulmer -- http://mtfulmer.wordpress.com/ Help improve Squeak Documentation: http://wiki.squeak.org/squeak/808 _______________________________________________ News mailing list [hidden email] http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/news |
Free forum by Nabble | Edit this page |