Hi--

contents:
    introduction
    shared variables and compilation/execution
    classes
    other shared variables
    the root class
    the special objects array
    method references to "Smalltalk"
    another reminder about live behavior transfer
    why do this now

introduction

Previously I mentioned that I wanted to rid Spoon of the system dictionary. Here's how I'm currently doing it, and, more generally, how I'm supporting shared variables in Spoon. Thanks in advance for any feedback! First, a recap of the system dictionary concept and why I don't like it. :)

shared variables and compilation/execution

Shared variables in Smalltalk are stored as associations; the key is a shared variable name, and the value is an object associated with that name. When the compiler compiles source for a method that refers to a shared variable's name, it attempts to find an appropriate shared-variable association for that name. It stores that association in the "literal frame" of the resulting compiled method (currently, a span of the method's bytes after the header and before the instructions).

There are instructions for pushing the value of a particular shared-variable association from a method's literal frame onto the stack (or "temporary frame") of a context running that method (see Interpreter>>pushLiteralVariableBytecode, the implementation of interpreter operations 16r40 to 16r5F).

Traditionally, all of these shared-variables associations are stored in dictionaries that the compiler knows about. As far as the compiler is concerned, the outermost shared-variable scope is represented by the "system dictionary", a singleton instance of the SystemDictionary class called "Smalltalk". (Just to review, note that the system dictionary has an association whose key is the symbol #Smalltalk and whose value is the system dictionary itself. This association is used in the compiler methods that make use of the system dictionary.)

classes

Most of the associations in the system dictionary refer to classes.
The key of each such association indicates the name of the corresponding class as far as the compiler is concerned. Additionally, each class has a "name" instance variable. That is, the name of each class is stored in two distinct places: in the system dictionary, and in the class itself. In effect, the compiler's notion of a class' name and the class' own notion of its name are distinct (and possibly conflicting).

I propose to make the compiler use the classes' notion of names directly, so that there is only one naming scheme, and that the classes themselves are responsible for it. To do this, instead of storing a class' name symbol in its "name" instance variable, we can just store the shared-variable association that the compiled methods use (and which used to be in the system dictionary).

When the compiler wants to find a class with some name in some source a human just wrote, it can search the class hierarchy from the root (class Object). As I discussed earlier here with Ralph Johnson, it's typically not as fast as a dictionary lookup, but it's acceptable (the compiler tends not to be a part of the system that needs every cycle squeezed out of it).

If the compiler finds multiple classes with some name in source submitted for compilation, it can present other information about these classes (e.g., class category or module) to the human, and ask for a choice. When already-compiled methods are transferred between systems, there is no ambiguity, since class names aren't used at all (see "another reminder about live behavior transfer" below).

the root class

As you might guess, this means that Spoon will not have multiple root classes. So far, all the non-primary root classes in Squeak were motivated by a desire to use method lookup failure for various "proxyish" features.
I support such features in Spoon directly with the interpreter (see for example, class "Other"), so it's not necessary to have more than one root class (it's also not necessary to have the "ProtoObject" class).

As for how to access the root class, there are a couple of options. We could store the root class' shared-variable association directly in methods, or we could store the root class in the "special objects array" (it could take the system dictionary's place there, in fact).

the special objects array

This brings me to the special objects array. :) I've always found it odd that it's chock-full of well-known and relatively unchanging things, but it doesn't have its own class and protocol. I've never liked the name "special objects array" either; it seems too vague. Metaphorically, I think the special objects array represents the grip that the interpreter has (and needs to have) on the object memory. So for Spoon I've created a class called "InterpreterGrip" whose sole instance is a collection of the objects that the interpreter knows about. I call each of these objects a "grip point". There is protocol for accessing them (for example, a "rootClass" message). I find this more pleasant than the current scheme.

other shared variables

Anyway, back to the system dictionary. I addressed the associations there that refer to classes, but there are others. These are the other so-called "global" variables (like Display, the primary display) as well as all the "shared pools" (like TextConstants and, strictly speaking, Undeclared). I think each global variable should be the responsibility of some class. So the primary display could be something you get by sending "primary" to DisplayScreen.

Shared pools are dictionaries of shared-variable associations, similar to the system dictionary (in fact, I'd call the system dictionary just another shared pool). I know some think we should simply banish all shared pools, but I'll assume for the moment that we're keeping them.
I find them useful, I just think some class should take responsibility for each one. I've added a "publishedPools" instance variable to Class, which stores all the shared pool dictionaries for which a class has responsibility (i.e., the class that introduced the pool into the system). I renamed the traditional "sharedPools" instance variable in Class to "receivedPools"; these are the pools that a class merely uses. Finally, I renamed the "classPool" instance variable to "classVariablesPool", just to be clearer.

When you want to use a shared pool, you access the pool by sending a message to the responsible class, rather than relying on its name being a global variable.

method references to "Smalltalk"

So now we've got new homes for all the shared-variable associations which used to be reachable through the system dictionary. The other thing to do is refactor the methods which use the shared-variable association for the system dictionary itself (the methods which refer to "Smalltalk"). I'm working on this now. There are about a thousand of them in a "full" object memory, but for most of them it's clear which class should actually take responsibility. For example, there are several methods which (in my opinion) are rightly the responsibility of the Interpreter class (like the garbage collection messages). I've also written some refactoring tools that automate a lot of this (e.g., a tool which replaces the push of one literal variable with another when followed by the sending of a particular message).

another reminder about live behavior transfer

Some of these decisions would be problematic if we were limited to using source code ("fileouts") to transfer behavior between systems. Since Spoon can transfer methods directly, without recompilation (or even source code) and without referring to shared-variable names at all, it works (see the MethodLiteralTransmissionMarker hierarchy for details).
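
[Editorial sketch] The pool arrangement described under "other shared variables" can be modeled roughly as below. This is a Python model, not Spoon or Squeak code: ClassModel, publish_pool, and resolve are hypothetical names, the three dictionaries merely stand in for the "classVariablesPool", "publishedPools", and "receivedPools" instance variables, and the lookup order is an assumption.

```python
class Association:
    """A shared-variable binding: a name (key) and the object bound to it (value)."""

    def __init__(self, key, value):
        self.key = key
        self.value = value


class ClassModel:
    """Hypothetical stand-in for a class that owns its shared variables."""

    def __init__(self, name):
        self.name = name
        self.class_variables_pool = {}  # formerly "classPool"
        self.published_pools = {}       # pool name -> pool this class is responsible for
        self.received_pools = []        # formerly "sharedPools": pools this class merely uses

    def publish_pool(self, pool_name, bindings):
        """Create a pool of associations and take responsibility for it."""
        pool = {key: Association(key, value) for key, value in bindings.items()}
        self.published_pools[pool_name] = pool
        return pool

    def resolve(self, variable_name):
        """Return the association a compiler could store in a method's literal frame."""
        scopes = [self.class_variables_pool,
                  *self.published_pools.values(),
                  *self.received_pools]
        for scope in scopes:
            if variable_name in scope:
                return scope[variable_name]
        return None  # no binding in any scope this class can see


# Text publishes a pool; Paragraph obtains it by asking Text, not by a global name.
text = ClassModel("Text")
constants = text.publish_pool("constants", {"CR": "\r", "Tab": "\t"})

paragraph = ClassModel("Paragraph")
paragraph.received_pools.append(text.published_pools["constants"])

cr_binding = paragraph.resolve("CR")
```

In this model the publishing class and every receiving class end up holding the very same association object, which is the property that lets compiled methods share a binding without any system-wide dictionary.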
why do this now

This work was always lurking in the future, but now the issue is forced by my work on Naiad (Spoon's module system). I'm making a module which reattaches the primary display (the system is initially headless), and that meant deciding how to access it. Since access is traditionally through a global variable (Display), the can of worms was opened. :)

***

Again, thanks in advance for any feedback or questions. I'm usually around on the Squeak IRC channel from 1700 to 0500 GMT, and I read the squeak-dev and Spoon lists.

thanks again,

-C

--
Craig Latta
http://netjam.org/resume
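
[Editorial sketch] The class lookup proposed in the "classes" section (walk the hierarchy from the root, and ask a human when a name is ambiguous) might look roughly like this. It is a Python model under stated assumptions, not Spoon code: ClassNode, classes_named, resolve_class, and the choose callback (standing in for the interactive menu) are all hypothetical.

```python
class ClassNode:
    """Hypothetical stand-in for a class that knows its own name and subclasses."""

    def __init__(self, name, superclass=None, category=""):
        self.name = name
        self.category = category
        self.subclasses = []
        if superclass is not None:
            superclass.subclasses.append(self)


def classes_named(root, name):
    """Walk the hierarchy from the root and collect every class with this name."""
    found = []
    pending = [root]
    while pending:
        cls = pending.pop()
        if cls.name == name:
            found.append(cls)
        pending.extend(cls.subclasses)
    return found


def resolve_class(root, name, choose):
    """Return the unique match, None if there is none, or defer to `choose` on a clash."""
    candidates = classes_named(root, name)
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    return choose(candidates)  # e.g. show categories or modules and ask the human


root = ClassNode("Object")
stream = ClassNode("Stream", root, "Collections-Streams")
ClassNode("Parser", root, "Compiler")
ClassNode("Parser", stream, "Network-HTML")


def pick_by_category(candidates):
    """Stand-in for the interactive choice: pick the alphabetically first category."""
    return min(candidates, key=lambda cls: cls.category)
```

The walk visits every class once, so it is linear in the number of classes rather than a constant-time dictionary lookup, which matches the trade-off accepted in the post.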
On Fri, 28 Jul 2006 09:14:00 +0200, Craig Latta wrote:
...

> shared variables and compilation/execution

...

> I propose to make the compiler use the classes' notion of names
> directly, so that there is only one naming scheme, and that the classes
> themselves are responsible for it. To do this, instead of storing a
> class' name symbol in its "name" instance variable, we can just store
> the shared-variable association that the compiled methods use (and which
> used to be in the system dictionary).

Interesting idea.

> When the compiler wants to find a class with some name in some source a
> human just wrote, it can search the class hierarchy from the root (class
> Object).

Then you'd break with today's capabilities: if a class/subclass relation is broken, today one can use the compiler to fix it. I'd suggest a search like

    nil systemNavigation allObjectsDo: [:oop |
        (oop isBehavior and: ['HtmlFileStream' = oop name])
            ifTrue: [^ oop]]

It also executes fewer bytecodes than ... subclasses do: ...

I always thought that the subclasses instance variable is superfluous because the superclass variable serves the same purpose (via indirection). It's only a convenience (and there would be a performance penalty if subclasses were not maintained).

Of course, garbage collection may also have played a role: who references a subclass which doesn't have instances (which themselves are referenced)?

This, BTW, raises an interesting question: without a "global" which refers to your root class, what is it (in your system) that keeps classes from being garbage collected?

> the root class
>
> As you might guess, this means that Spoon will not have multiple root
> classes. So far, all the non-primary root classes in Squeak were
> motivated by a desire to use method lookup failure for various
> "proxyish" features. I support such features in Spoon directly with the
> interpreter (see for example, class "Other"), so it's not necessary to
> have more than one root class (it's also not necessary to have the
> "ProtoObject" class).

Living in Switzerland, I'd be more neutral.
In the 90's I've reviewed a DigiTalk project here in Switzerland (someone from Georg Heeg might remember his trip to Zuerich) in which all the main components started with " nil subclass: # ". My comment: if that makes them happy ...

Also, the software research community would most likely want a system with multiple root support for their experiments, FWIW.

...

> other shared variables
>
> Anyway, back to the system dictionary. I addressed the associations
> there that refer to classes, but there are others. These are the other
> so-called "global" variables (like Display, the primary display) as well
> as all the "shared pools" (like TextConstants and, strictly speaking,
> Undeclared). I think each global variable should be the responsibility
> of some class. So the primary display could be something you get by
> sending "primary" to DisplayScreen.

+ (1 Big)

> Shared pools are dictionaries of shared-variable associations, similar
> to the system dictionary (in fact, I'd call the system dictionary just
> another shared pool). I know some think we should simply banish all
> shared pools, but I'll assume for the moment that we're keeping them. I
> find them useful, I just think some class should take responsibility for
> each one. I've added a "publishedPools" instance variable to Class,
> which stores all the shared pool dictionaries for which a class has
> responsibility (i.e., the class that introduced the pool into the
> system). I renamed the traditional "sharedPools" instance variable in
> Class to "receivedPools"; these are the pools that a class merely uses.
> Finally, I renamed the "classPool" instance variable to
> "classVariablesPool", just to be clearer.

Yes, "shared" pool responsibility is long overdue.

> When you want to use a shared pool, you access the pool by sending a
> message to the responsible class, rather than relying on its name being
> a global variable.

+ (1 Big)

> method references to "Smalltalk"

I'd appreciate it if you made them all go away, regardless of how :-) What if the next private investor wants to have #Squeak as the name instead of #Smalltalk <grin/>

...

> Again, thanks in advance for any feedback or questions.

And thank you for sharing your thoughts and plans.

/Klaus
Hi Klaus--

> without a "global" which refers to your root class, what is it (in
> your system) that keeps classes from being garbage collected?

While a class is being built, it's referenced by the class-builder machinery. Once installed, it's referenced (ultimately) by the root class. The root class is a "special object"; the garbage collector already explicitly excludes special objects from reclamation.

> > As you might guess, this means that Spoon will not have multiple
> > root classes.
>
> Living in Switzerland, I'd be more neutral. In the 90's I've reviewed
> a DigiTalk project here in Switzerland (someone from Georg Heeg might
> remember his trip to Zuerich) in which all the main components started
> with " nil subclass: # ". My comment: if that makes them happy ...
>
> Also, the software research community would most likely want a system
> with multiple root support for their experiments, FWIW.

Sure, people can add multiple root classes again if they like, I'm just saying there's no need for them (or ProtoObject) in the minimal system, and Spoon won't have them by default.

thanks again,

-C

--
Craig Latta
http://netjam.org/resume
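
[Editorial sketch] The reachability argument in the reply above can be illustrated with the mark phase of a tracing collector. This is a toy Python model, not the Squeak or Spoon collector: the object names and the references table are invented for illustration, and treating the grip (special objects) as the only root is a simplifying assumption.

```python
def reachable(roots, references):
    """Return every object reachable from the roots by following references."""
    marked = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj in marked:
            continue
        marked.add(obj)
        pending.extend(references.get(obj, ()))
    return marked


# Hypothetical object graph: each key references the objects in its list.
references = {
    "InterpreterGrip": ["Object"],          # the grip holds the root class
    "Object": ["Collection", "Stream"],     # subclass links from the root
    "Collection": ["Array"],
    "Orphan": ["OrphanChild"],              # a class detached from the hierarchy
}

live = reachable(["InterpreterGrip"], references)
```

In this model every installed class survives because the root class reaches it through subclass links, while a class cut off from the hierarchy and referenced by nothing else (the broken class/subclass case Klaus raised) is not marked.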
Hi Craig,
I have a couple of points I'd like you to make clearer:

On Friday 28 July 2006 at 09:14, Craig Latta wrote:
> Hi--
>
> contents:
>     introduction
>     shared variables and compilation/execution
>     classes
>     other shared variables
>     the root class
>     the special objects array
>     method references to "Smalltalk"
>     another reminder about live behavior transfer
>     why do this now
>
> introduction
>
> Previously I mentioned that I wanted to rid Spoon of the system
> dictionary. Here's how I'm currently doing it, and, more generally, how
> I'm supporting shared variables in Spoon. Thanks in advance for any
> feedback! First, a recap of the system dictionary concept and why I
> don't like it. :)
>
> shared variables and compilation/execution
>
> Shared variables in Smalltalk are stored as associations; the key is a
> shared variable name, and the value is an object associated with that
> name. When the compiler compiles source for a method that refers to a
> shared variable's name, it attempts to find an appropriate
> shared-variable association for that name. It stores that association in
> the "literal frame" of the resulting compiled method (currently, a span
> of the method's bytes after the header and before the instructions).
>
> There are instructions for pushing the value of a particular
> shared-variable association from a method's literal frame onto the stack
> (or "temporary frame") of a context running that method (see
> Interpreter>>pushLiteralVariableBytecode, the implementation of
> interpreter operations 16r40 to 16r5F).
>
> Traditionally, all of these shared-variables associations are stored in
> dictionaries that the compiler knows about. As far as the compiler is
> concerned, the outermost shared-variable scope is represented by the
> "system dictionary", a singleton instance of the SystemDictionary class
> called "Smalltalk".
> (Just to review, note that the system dictionary has
> an association whose key is the symbol #Smalltalk and whose value is the
> system dictionary itself. This association is used in the compiler
> methods that make use of the system dictionary.)
>
> classes
>
> Most of the associations in the system dictionary refer to classes. The
> key of each such association indicates the name of the corresponding
> class as far as the compiler is concerned. Additionally, each class has
> a "name" instance variable. That is, the name of each class is stored in
> two distinct places: in the system dictionary, and in the class itself.
> In effect, the compiler's notion of a class' name and the class' own
> notion of its name are distinct (and possibly conflicting).
>
> I propose to make the compiler use the classes' notion of names
> directly, so that there is only one naming scheme, and that the classes
> themselves are responsible for it. To do this, instead of storing a
> class' name symbol in its "name" instance variable, we can just store
> the shared-variable association that the compiled methods use (and which
> used to be in the system dictionary).
>
> When the compiler wants to find a class with some name in some source a
> human just wrote, it can search the class hierarchy from the root (class
> Object). As I discussed earlier here with Ralph Johnson, it's typically
> not as fast as a dictionary lookup, but it's acceptable (the compiler
> tends not to be a part of the system that needs every cycle squeezed out
> of it).
>
> If the compiler finds multiple classes with some name in source
> submitted for compilation, it can present other information about these
> classes (e.g., class category or module) to the human, and ask for a
> choice. When already-compiled methods are transferred between systems,
> there is no ambiguity, since class names aren't used at all (see
> "another reminder about live behavior transfer" below).
>

The compiler is no longer heavily used to rebuild packages from source, since you transfer compiled objects. You beat namespaces with simplicity by asking the user to resolve conflicts through a menu. It's not declarative, it's interactive. But then, how will we understand code when reading it from source? Hyperlink navigation?

The only concern is if one ever wanted to rebuild a system from source... Being asked for many choices would be quite boring... And it would be very hard to investigate code from others... We had better not rebuild once the system grows.

In case you want to merge two equivalent classes because they are doing mostly the same job, you'll have to recompile from source... Then you are exposed to name clashes and their flow of menus...

I imagine you could maybe give a hint to the compiler so that it does not ask you the same question twice within the same compiling unit... Or should we have tools able to relink?

> the root class
>
> As you might guess, this means that Spoon will not have multiple root
> classes. So far, all the non-primary root classes in Squeak were
> motivated by a desire to use method lookup failure for various
> "proxyish" features. I support such features in Spoon directly with the
> interpreter (see for example, class "Other"), so it's not necessary to
> have more than one root class (it's also not necessary to have the
> "ProtoObject" class).
>
> As for how to access the root class, there are a couple of options. We
> could store the root class' shared-variable association directly in
> methods, or we could store the root class in the "special objects array"
> (it could take the system dictionary's place there, in fact).
>
> the special objects array
>
> This brings me to the special objects array. :) I've always found it
> odd that it's chock-full of well-known and relatively unchanging things,
> but it doesn't have its own class and protocol. I've never liked the
> name "special objects array" either; it seems too vague.
> Metaphorically,
> I think the special objects array represents the grip that the
> interpreter has (and needs to have) on the object memory. So for Spoon
> I've created a class called "InterpreterGrip" whose sole instance is a
> collection of the objects that the interpreter knows about. I call each
> of these objects a "grip point". There is protocol for accessing them
> (for example, a "rootClass" message). I find this more pleasant than the
> current scheme.
>
> other shared variables
>
> Anyway, back to the system dictionary. I addressed the associations
> there that refer to classes, but there are others. These are the other
> so-called "global" variables (like Display, the primary display) as well
> as all the "shared pools" (like TextConstants and, strictly speaking,
> Undeclared). I think each global variable should be the responsibility
> of some class. So the primary display could be something you get by
> sending "primary" to DisplayScreen.
>

Yes, but then DisplayScreen should become an abstract class with a convenient factory to hook PrimaryDisplay at image startup to a concrete SqueakDisplayScreen (the Squeak window), an OSDisplayScreen (if you want to use OS windows), or maybe a FakeScreen if you are headless.

And sure, if you drive more than one screen, a single Display global does not make sense... Better to use messages, you are right.

> Shared pools are dictionaries of shared-variable associations, similar
> to the system dictionary (in fact, I'd call the system dictionary just
> another shared pool). I know some think we should simply banish all
> shared pools, but I'll assume for the moment that we're keeping them. I
> find them useful, I just think some class should take responsibility for
> each one. I've added a "publishedPools" instance variable to Class,
> which stores all the shared pool dictionaries for which a class has
> responsibility (i.e., the class that introduced the pool into the
> system).
> I renamed the traditional "sharedPools" instance variable in
> Class to "receivedPools"; these are the pools that a class merely uses.
> Finally, I renamed the "classPool" instance variable to
> "classVariablesPool", just to be clearer.
>
> When you want to use a shared pool, you access the pool by sending a
> message to the responsible class, rather than relying on its name being
> a global variable.
>

Very true: from the compiler's point of view, Smalltalk is just another shared pool like TextConstants, except that it does not need to be declared as a pool dictionary... Something VW also generalized with namespaces. They multiplied the SystemDictionary; it's fun to see you take the opposite direction.

Of course, shared pools are useful, because you can simply write (CR) in your code instead of (Text constants at: #CR), and that's more efficient in terms of bytecodes: the second case has the same code for accessing the association value, plus two message sends and a Symbol literal...

If you want to use shared pool keys in your code, you have to declare the pool somewhere (in the class definition for now). Do you write something like (poolDictionaries: {Text constants}) instead of (poolDictionaries: 'TextConstants')?

If that is the case, I see a little problem. If we store and initialize TextConstants in a Text class variable, how would Text use the CR constant itself? There is a bootstrap problem in the class definition:

    ArrayedCollection subclass: #Text
        instanceVariableNames: 'string runs'
        classVariableNames: 'TextConstants'
        poolDictionaries: {Text constants}
        category: 'Collections-Text'

Should TextConstants be declared and initialized in a neutral place? Of course, if you never rebuild code but just do the bootstrap once and then only transfer the resulting compiled objects, then maybe you do not bother... is that it?

Would per-method meta declarations like

    <thisCompiler useSharedPool: Text constants>

make any sense? Like Ada/C++/Fortran90 with package, use,...

> method references to "Smalltalk"
>
> So now we've got new homes for all the shared-variable associations
> which used to be reachable through the system dictionary. The other
> thing to do is refactor the methods which use the shared-variable
> association for the system dictionary itself (the methods which refer to
> "Smalltalk"). I'm working on this now. There are about a thousand of
> them in a "full" object memory, but for most of them it's clear which
> class should actually take responsibility. For example, there are
> several methods which (in my opinion) are rightly the responsibility of
> the Interpreter class (like the garbage collection messages). I've also
> written some refactoring tools that automate a lot of this (e.g., a tool
> which replaces the push of one literal variable with another when
> followed by the sending of a particular message).
>

True: Smalltalk being the root object, it has also been treated, by a semantic slip, as the ObjectMemory, the ImageSettingsRepository or the Interpreter...

Sometimes we have to check for the existence of a class. This can be done with (Smalltalk includesKey: #MyClass). VW has the #{MyClass} construct... What is your replacement? Root class tree recursion? Of course, you cannot rely only on the name anymore...

> another reminder about live behavior transfer
>
> Some of these decisions would be problematic if we were limited to
> using source code ("fileouts") to transfer behavior between systems.
> Since Spoon can transfer methods directly, without recompilation (or
> even source code) and without referring to shared-variable names at all,
> it works (see the MethodLiteralTransmissionMarker hierarchy for details).
>
> why do this now
>
> This work was always lurking in the future, but now the issue is forced
> by my work on Naiad (Spoon's module system). I'm making a module which
> reattaches the primary display (the system is initially headless), and
> that meant deciding how to access it.
> Since access is traditionally
> through a global variable (Display), the can of worms was opened. :)
>
> ***
>
> Again, thanks in advance for any feedback or questions. I'm usually
> around on the Squeak IRC channel from 1700 to 0500 GMT, and I read the
> squeak-dev and Spoon lists.
>
>
> thanks again,
>
> -C

From what you explained, I do not see any major weak point in your approach. It seems consistent and quite solid to me.

Nicolas
Hi Nicolas--

> How will we understand code when reading [the source for a method
> which uses a class whose name is used by multiple classes]? Hyperlink
> navigation?

That's one possibility. I expect there will be some visual cue in the rendering of the class name, so that the casual reader will know that there are other classes that also have that name. If the reader wants to know more about that class, the tools can provide more information (probably via a menu or hyperlink navigation). Keep in mind that this situation will probably be relatively rare in the first place.

> The only concern is if one ever wanted to rebuild a system from
> source... Being asked for many choices, it would be quite boring...
> And it would be very hard to investigate code from others...

Indeed, I only intend this interactive mode of use to be used in the situation of a human entering source manually. I think we can bootstrap new systems without using source, by writing out object memories directly (rather like the system tracer and the Fenix work).

> In case you want to merge two equivalent classes because they are
> doing mostly the same job, you'll have to recompile from source.

I don't think that's true. You could just transfer all the compiled methods into a common class programmatically. Any conflicts will probably have to be resolved interactively anyway.

> I imagine you could maybe give a hint to the compiler so that it does
> not ask you twice the same question within the same compiling unit...

Sure; I expect there will be several environment preferences related to how chatty the compiler is during interactive development.

> If you want to use shared pool keys in your code, you have to declare
> it somewhere (in class definition by now). Do you write something like
> (poolDictionaries: {Text constants}) instead of (poolDictionaries:
> 'TextConstants')?

Well, first of all there will be two pool-dictionary-related instance variables in Class instead of one ("publishedPools" and "receivedPools").

> If that is the case, i see a little problem. If we store and
> initialize TextConstants in a Text classVariable, how would Text use
> CR constant itself? There is a bootstrap problem in class definition:
>
> ArrayedCollection
>     subclass: #Text
>     instanceVariableNames: 'string runs'
>     classVariableNames: 'TextConstants'
>     poolDictionaries: {Text constants}
>     category: 'Collections-Text'
>
> Should TextConstants be declared and initialized in a neutral place?

I expect that a class' published pools (the pools for which it is responsible, and which it provides to the rest of the system) will be defined as part of class initialization. Thereafter, the pool objects themselves are transferred directly. The way class objects are transferred from one system to another in Spoon is much more sophisticated than using an expression like the one you cite above. I also expect that the browsing tools will provide an actual user interface for interactively defining new classes, rather than the crude way we've been doing it all this time (editing an expression and evaluating it). Finally, there won't be an inherent need to use class variables when defining or initializing shared pools.

> Do per-method meta declarations like
>
>     <thisCompiler useSharedPool: Text constants>
>
> make any sense?

It's doable. But personally, I'd find it confusing for every method to have its own compilation environment (potentially). I'd prefer to keep this decision with each class.

> Sometimes we have to check for existence of a class. This can be done
> with (Smalltalk includesKey: #MyClass). VW has #{MyClass}
> construct... What is your replacement? Root class tree recursion?

Yes, searches from the root class.

> Of course, you cannot rely only on name anymore...

Right, relying on the name alone was always risky!
With the mechanisms you mention above, you're still not really sure you're getting a meaningful answer. What you really want to know is whether there is a class which implements a particular set of messages with a particular set of meanings (i.e., a particular interface). You might also want to ensure that this class has some expected performance characteristics, or was written by a particular set of authors, or was written at a particular point in time, etc. You can make as detailed a query as you like in the scheme I propose.

> From what you explained, i do not see major weak point in your
> approach. It seems consistent and quite solid to me.

Thanks for reading and for your comments!

-C

--
Craig Latta
http://netjam.org/resume
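
[Editorial sketch] The "as detailed a query as you like" idea from the reply above could be expressed as predicates applied during a search from the root class. This is a Python model only: ClassNode, find_classes, implements, and written_by are hypothetical names, the author field is invented metadata, and the sketch checks which selectors a class understands, not what they mean.

```python
class ClassNode:
    """Hypothetical class record: its own selectors plus links up and down the hierarchy."""

    def __init__(self, name, selectors=(), author="", superclass=None):
        self.name = name
        self.selectors = frozenset(selectors)
        self.author = author
        self.superclass = superclass
        self.subclasses = []
        if superclass is not None:
            superclass.subclasses.append(self)

    def all_selectors(self):
        """Selectors understood by instances, including inherited ones."""
        understood = set()
        cls = self
        while cls is not None:
            understood |= cls.selectors
            cls = cls.superclass
        return understood


def find_classes(root, *predicates):
    """Search from the root class, keeping classes that satisfy every predicate."""
    found = []
    pending = [root]
    while pending:
        cls = pending.pop()
        if all(predicate(cls) for predicate in predicates):
            found.append(cls)
        pending.extend(cls.subclasses)
    return found


def implements(*selectors):
    """Predicate: the class understands all of the given selectors."""
    required = frozenset(selectors)
    return lambda cls: required <= cls.all_selectors()


def written_by(author):
    """Predicate: the class was written by the given (hypothetical) author."""
    return lambda cls: cls.author == author


root = ClassNode("Object", ["printOn:"])
read_stream = ClassNode("ReadStream", ["next", "atEnd"], "alice", root)
queue = ClassNode("Queue", ["next", "add:"], "bob", root)
```

For example, find_classes(root, implements("next", "atEnd")) asks about an interface rather than a name, and further predicates such as written_by("bob") narrow the answer without any change to the search itself.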