"Florin Mateoc" <[hidden email]> wrote in message
news:4KXM6.6403$[hidden email]...
> Well, OK.
>
> How can you do safe unchecked accesses to an array ? (rhetorical question ;-)
> One could use unchecked primitives like David Simmons implemented.

They're not primitives. They are compiled and processed as ordinary Smalltalk messages. During JIT analysis and code generation (where a method's opcodes are used to generate native machine code), the messages are analyzed by the JIT, which treats them as monomorphic messages defined in <Object>. Which in turn means it can inline them.

> This raises the possibility of direct use in inappropriate circumstances, which
> would take away much of Smalltalk's safety net.

You can't have it both ways. Either you want the safety check, and therefore you'll pay for the range check somewhere; or you don't want the safety check, and therefore you incur the memory-corruption risks of erroneous use. This is an old Pascal versus C argument, and casting it as a Smalltalk-specific issue is distracting.

What is fair/appropriate to say is that one has to do by hand (via the #basicNRC... examples) what is free in a statically typed and compiled language with functional/keyword loop statements, where the range checks are either statically analyzed away or effectively moved outside the loop (as an invariant). The type/variable declaration extensions in SmallScript theoretically give the compiler the same opportunity to automatically perform the range checks for you outside of a loop.

    "" Declare a class with a 12-element Float array
    class name: Foo inst-fields: 'double x[12]'.

    "" Declare a method that loops accessing the array elements
    method class: Foo [
    someLoop
        base to: someLimit do: [:i|
            "someExpressionInvolving x[i]"
        ].
    ].

NOTE: SmallScript provides syntax which includes array/slice messages. Thus we can write expressions like:

    x[i] := 12.
    abc := x[i]*pi.
    #(1 2 3)[1] ==> 1
    "" etc

> A better approach would be to use special bytecodes, generated by the Smalltalk
> compiler when it knows it is safe to do so.

Why *yet more* bytecodes? That is an implementation solution for an interpreter. I could just as easily say (and it is a better approach) that we can, as needed, just optimize certain messages within the JIT/adaptive-compilation engine.

> This would not infringe on any
> safety expectations, because bytecodes are not directly used and there are zero
> guarantees if you do mess with them.

I don't understand this comment at all. Special bytecodes accomplish nothing unless we're talking about trying to improve the performance of an interpreted Smalltalk -- and if we really were worried about speed here we wouldn't want an interpreted Smalltalk, we'd want an optimized jitted (or statically compiled) Smalltalk.

The samples I gave generate standard bytecodes for sending the messages #basic...

The JIT/adaptive compiler of the v4 AOS Platform, on which SmallScript runs, is what makes them special.
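To make the hoisting argument concrete in plain Smalltalk (a sketch only -- the snippet sticks to the portable #at:, and the comment marks where a dialect-specific unchecked accessor such as the #basicNRC... family would be substituted):

    | anArray sum |
    anArray := #(3 1 4 1 5 9).
    sum := 0.
    1 to: anArray size do: [:i |
        "The loop header already guarantees 1 <= i <= anArray size, so the
         range check that #at: repeats on every iteration is redundant here,
         as long as the array is not resized inside the loop. A hand-tuned
         variant simply substitutes the dialect's unchecked accessor (e.g.
         one of the #basicNRC... selectors) for #at:; a typed, compiled
         language performs that hoisting for you."
        sum := sum + (anArray at: i)].
    sum   "=> 23"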
> But then, where would one use such bytecodes ? Not in the implementation of the
> do: method on Array because then it would have to take a block closure as a
> parameter. Ideally you would inline the array iteration at the call sites. But
> how can you do that in an untyped polymorphic system ? While trying to come up
> with an answer I remembered that Smalltalk already does something very similar
> with its to:do: looping construct. It simply uses a reserved keyword that the
> compiler recognizes. The rest is history ;-) (it seems I have to use lots of
> smilies in this newsgroup, otherwise somebody will bite my head off)

I suspect I don't really follow your intent with the (quasi-sarcastic) comments here.

An implementation can choose to inline iteration at a call site via compiler-controlled optimization of standard message forms, via adaptive JIT inlining, or via a linguistic declarative construct (with or without type information).

The standard Smalltalk messaging construct #to:do: is only specially treated, as a monomorphic declarative expression, by Smalltalk compilers that have optimization inlining of that message enabled. For those compilers that allow this to be controlled via per-method/class/namespace directives/annotations/properties (as in our compilers), or via compiler instance flags (as in our compilers, for use by tools), this is a non-issue.

You seem to be equating an optimization of a special message form with an explicit non-message-based declarative construct. They are different; it is conceptually weird/wrong to choose to turn off a declaration construct, whereas turning off a message optimization is fine. One could even perform an analysis of a program to determine whether it contained any conflicting definitions of such inlined messages and use that to decide how the program should be compiled.

A compiler-controlled optimization of a standard Smalltalk message is quite a different thing from changing the language itself to provide an explicit declarative construct that is not based on the standard messaging syntax. As my previous posts' SmallScript examples indicate, I have done both of these things. As a result of many years of involvement in Smalltalk language design and implementation, I have expended a significant amount of time considering the questions of why one should make such changes and when they are appropriate.

P.S. Technically (user perceptions aside), a compiler could turn off a declarative construct by generating it as some implied/specific message form. But why do that in Smalltalk when it already has the pure object messaging model?

====

As to general syntax forms where one could gain some real benefit, exploring issues of declaring "non-capturable" blocks would be valuable. I.e., most message-based forms assume that a block object could be captured by a callee. Which in turn means that the block needs to be a valid object even after the caller frame (the block's home/instantiation context) has exited/returned. If there were a block declaration form that guaranteed/indicated that a block was not valid after its home/instantiation context had returned, then implementations could generate blocks as callback subroutines, pass a tagged pointer to them, and avoid context instantiation costs. An optimizing JIT could then detect this pretty easily to enable a variety of safe performance optimizations.

    a to: b do: #[...].

The AOS Platform opcode set supports subroutines within methods. They are used extensively in optimizing the exception handling constructs to enable context sharing in just the fashion I described above. The following example illustrates the performance of inlining blocks via subroutine mechanisms and tagged pointers. In this example, SmallScript's compiler works with the JIT to optimize exception handling code. A general block-declaration solution yielding similar performance would allow blocks to be declared as non-capturable.
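To spell out the capturable/non-capturable distinction in plain Smalltalk (the selectors below are purely illustrative):

    countTo: n
        "This block is only evaluated while the method is still on the stack
         ('used downward'), so a non-capturable declaration would let an
         implementation emit it as an internal subroutine -- no closure
         object, no captured context."
        | count |
        count := 0.
        1 to: n do: [:i | count := count + 1].
        ^count

    makeCounter
        "This block escapes: it is returned and may be evaluated long after
         this frame is gone, so it must be a real closure that keeps its
         home context (the variable count) alive."
        | count |
        count := 0.
        ^[count := count + 1]

Only the first kind could safely become the tagged subroutine pointer described above.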
To illustrate the performance impact, consider this code snippet:

    [
        2000000 timesRepeat: [
            SomeObject new.
        ].
    ].

In SmallScript, on a 1.2GHz AMD TBird, this takes about 109ms. In VW on the same machine it takes essentially the same amount of time, ~118ms. If I run this in a statically typed and compiled language (not using multi-precision arithmetic and with a GC tuned for static type information) like C#, it takes about 40ms.

Now, when I add in exception handling:

    [
        2000000 timesRepeat: [
            [SomeObject new] on: Exception do: [:e|
                stdlog << thisContext.dump. "Or some equivalent"
                "handle exception here"
            ].
        ].
    ].

The SmallScript code now runs in about 120ms. In VW it takes 250ms. In C# it takes 61ms. The SmallScript subroutine-based inlining (vs. needing blocks) clearly plays a role here, because the difference is only 11ms, whereas in VW it more than doubled the execution time.

Try running that code on your favorite Smalltalk or other language and see how you do. Try this as well:

    [
        2000000 timesRepeat: [
            [] on: Exception do: [:e|
                stdlog << thisContext.dump. "Or some equivalent"
                "handle exception here"
            ].
        ].
    ].

The above code takes 18ms in SmallScript, ~10ms in C#, and 120ms in VW.
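For anyone who wants to reproduce the comparison, a minimal harness along these lines should work in most classic dialects (Time class>>millisecondsToRun: is widely available; Object and Error stand in for SomeObject and your dialect's preferred exception class):

    | plain handled |
    plain := Time millisecondsToRun: [
        2000000 timesRepeat: [Object new]].
    handled := Time millisecondsToRun: [
        2000000 timesRepeat: [
            [Object new] on: Error do: [:e | nil]]].
    Transcript
        show: 'plain: ', plain printString, ' ms'; cr;
        show: 'with handler: ', handled printString, ' ms'; cr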
-- Dave S. [www.smallscript.net]

> Florin
...snip...
|
Oops, by mistake I think I replied directly to David instead of the group.
I don't have that message anymore, so this is only an approximate reproduction.

In article <e4%M6.15782$%[hidden email]>, David Simmons says...
>
> "Florin Mateoc" <[hidden email]> wrote in message
> news:4KXM6.6403$[hidden email]...
> > Well, OK.
> >
> > How can you do safe unchecked accesses to an array ? (rhetorical question ;-)
> > One could use unchecked primitives like David Simmons implemented.
>
> They're not primitives. They are compiled and processed as ordinary
> Smalltalk messages. During JIT analysis and code generation processes (where
> a method's opcodes are used to generate native machine code), the messages
> are analyzed by the JIT which treats them as monomorphic messages defined in
> <Object>. Which in turn means it can inline them.

What I meant to say here is not that they are implemented as primitives but that they are "normal" methods, directly accessible for use in development, basically part of the Smalltalk "API". Developers tend to ignore the categorization of a method as private (explicit or implicit) and use such methods anyway.

> > This raises the possibility of direct use in inappropriate circumstances, which
> > would take away much of Smalltalk's safety net.
>
> You can't have it both ways.

But that is exactly what I want.

> Either you want the safety check and therefore you'll pay for the range
> check somewhere; or you don't want the safety check therefore you incur the
> memory corruption risks of erroneous use. This is an old Pascal versus C
> argument and casting it as a Smalltalk specific issue is distracting.

It is Smalltalk-specific because in Smalltalk we usually have two very distinct compilation passes: one from source to bytecodes, which happens at development time, and one from bytecodes to native code, which happens at run time. The Smalltalk source compiler has a lot more time and resources available. It also has potentially some global knowledge of the whole program that the JIT misses. If the Smalltalk source compiler is made smarter, the JIT will still have its own optimization opportunities and the end result can be much better. In this particular case we would delegate the optimization to the Smalltalk source compiler, so that the run-time cost is zero, while maintaining the Smalltalk "API" safety.

<snip>

> Why *yet more* bytecodes? That is an implementation solution for an
> interpreter. I could just as easily (and it is a better approach to) say that
> we can (as needed) just optimize certain messages within the
> JIT/adaptive-compilation engine.

As mentioned above, the optimized bytecodes would be useful not only for the interpreter but for the JIT as well (plus they would expose more optimization opportunities for the JIT).

> > This would not infringe on any
> > safety expectations, because bytecodes are not directly used and there are zero
> > guarantees if you do mess with them.
>
> I totally don't understand this comment at all. Special byte-codes
> accomplish nothing unless we're talking about trying to improve the
> performance of an interpreted Smalltalk -- and if we really were worried
> about speed here we don't want an interpreted Smalltalk, we want an optimized
> jitted (or statically compiled) Smalltalk.

I hope this is clearer now.

> The samples I gave generate standard bytecodes for sending the messages
> #basic...
>
> The JIT/adaptive compiler, of the v4 AOS Platform on which SmallScript runs,
> is what makes them special.
>
> > But then, where would one use such bytecodes ?
> > Not in the implementation of the do: method on Array because then it would
> > have to take a block closure as a parameter. Ideally you would inline the
> > array iteration at the call sites. But how can you do that in an untyped
> > polymorphic system ? While trying to come up with an answer I remembered
> > that Smalltalk already does something very similar with its to:do: looping
> > construct. It simply uses a reserved keyword that the compiler recognizes.
> > The rest is history ;-) (it seems I have to use lots of smilies in this
> > newsgroup, otherwise somebody will bite my head off)
>
> I suspect I don't really follow your intent with the (quasi-sarcastic)
> comments here.

"The rest is history" was meant as self-irony. The comment about my use of smilies (and the implied previous lack of them) was referring to one (unwarranted, I think) particularly aggressive reply that I got in this thread.

> An implementation can choose to inline iteration at a call site via compiler
> controlled optimization of standard message forms, or via adaptive JIT
> inlining, or via a linguistic declarative construct (with or without type
> information).
>
> The standard Smalltalk messaging construct #to:do: is only specially
> treated, as a monomorphic declarative expression, by Smalltalk compilers
> that have optimization inlining of that message enabled. For those compilers
> that allow this to be controlled via per-method/class/namespace
> directives/annotations/properties (as in our compilers), or via compiler
> instance flags (as in our compilers for use by tools) this is a non-issue.
>
> You seem to be equating an optimization of a special message form as being the
> same thing as an explicit non-message-based declarative construct.

Exactly, thank you for stating it so clearly. The inlined messages are de facto declarative constructs. They are very different from normal messages, especially in a reflective and late-binding system such as Smalltalk. If they were normal messages, the runtime behavior could be drastically altered: maybe somebody puts some wrappers around them and some pre-methods get executed, maybe somebody dynamically removes the method from the method dictionary, maybe somebody overrides it in some subclass. None of this can happen to an optimized message (that's why we have the optimization opportunity in the first place), because the Smalltalk source compiler has fixed its behavior at development time, just as for a declarative construct.

> They are different; it is conceptually weird/wrong to choose to turn off a
> declaration construct whereas turning off a message optimization is fine.

Yes, it would be fine, except nobody ever does it; hardly anybody even knows they could do it.

> One could even perform an analysis of a program to determine if it contained any
> conflicting definitions of such inlined messages and use that to decide how
> a program should be compiled.

Except the sources have already been compiled to bytecodes by the time we "accepted" the methods, so the messages have disappeared (#to:do: has become indistinguishable from [] whileTrue: []). So such an analysis would need to recompile the whole universe from sources first.
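To spell out what "indistinguishable" means here, a typical inlining compiler treats a #to:do: send with a literal block roughly as the rewrite below, and only the rewritten form survives in the bytecodes (a sketch of the usual expansion, not of any particular vendor's compiler):

    "What the programmer writes:"
    | anArray sum |
    anArray := #(3 1 4 1 5).
    sum := 0.
    1 to: anArray size do: [:i | sum := sum + (anArray at: i)].

    "Roughly what survives compilation -- the block body inlined, the index
     held in an ordinary temporary, no #to:do: send and no block object
     (#whileTrue: is in turn inlined into plain conditional branches):"
    | anArray sum i limit |
    anArray := #(3 1 4 1 5).
    sum := 0.
    i := 1.
    limit := anArray size.
    [i <= limit] whileTrue: [
        sum := sum + (anArray at: i).
        i := i + 1].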
> A compiler controlled optimization of a standard Smalltalk message is quite
> a different thing from changing the language itself to provide an explicit
> declarative construct that is not based on the standard messaging syntax. As
> my previous posts' SmallScript examples indicate, I have done both these
> things. As a result of many years of involvement in Smalltalk language
> design and implementation I have expended a significant amount of time
> considering the questions of why one should make such changes and when they
> are appropriate.

I truly appreciate your insights. However, I would submit that nobody would even notice if the Smalltalk vendors removed the possibility of de-optimizing these messages. How would they be distinguishable then from declarative constructs ? Changing their official status from optimized messages to declarative constructs would be the most invisible language change in history. I think this is mostly a psychological barrier.

And I further submit that, by taking a step back, looking at these optimized messages, and asking ourselves how we got here (to have accepted into the Smalltalk language some de facto declarative constructs), we would see that they are mostly about some basic concepts that do not map well onto the message-send paradigm. Now, if we (as a community) ever get there, it would be a logical step to see what other declarative constructs belong in this group and try to "complete" the language with them. As with the existing optimized message sends, which were naturally accepted, I think a slightly larger group would be accepted as well. If, together with a perhaps better mapping to the way our minds work, we also get the benefit of significantly optimizing the implementation, then I would say it is worth it.

<snip very interesting block optimization comments>

Florin
|
"Florin Mateoc" <[hidden email]> wrote in message
news:4fbN6.7680$[hidden email]... > Oops, by mistake I think I replied directly to David instead of the group. > I don't have that message anymore, so this is only an approximate reproduction > > In article <e4%M6.15782$%[hidden email]>, David Simmons > says... > > > >"Florin Mateoc" <[hidden email]> wrote in message > >news:4KXM6.6403$[hidden email]... > >> Well, OK. > >> > >> How can you do safe unchecked accesses to an array ? (rhetorical question > >;-) > >> One could use unchecked primitives like David Simmons implemented. > > > >They're not primitives. They are compiled and processed as ordinary > >Smalltalk messages. During JIT analysis and code generation processes (where > >a methods opcodes are used to generate native machine code), the messages > >are analyzed by the JIT which treats them as monomorphic messages defined in > ><Object>. Which in turn means it can inline them. > > What I meant to say here is not that they are implemented as primitives but > that they are "normal" methods directly accessible for use in development, > basically part of the Smalltalk "API". Developers tend to ignore the > categorization of a method as private (explicit or implicit) and use them anyway > > > > > >> This raises the possibility of direct use in inappropriate circumstances, > >which > >> would take away much of Smalltalk's safety net. > > > >You can't have it both ways. > > > > But that is exactly what I want. > > >Either you want the safety check and therefore you'll pay for the range > >check somewhere; or you don't want the safety check therefore you incur > >memory corruption risks of erroroneous use. This is an old Pascal versus C > >argument and casting it as a Smalltalk specific issue is distracting. > > > > It is Smalltalk specific because in Smalltalk we usually have two very distinct > compilation passes: one from source to bytecodes, that happens at development > time, and one from bytecodes tro native code, that happens at run time. The > Smalltalk source compiler has a lot more time and resources available. It also > has potentially some global knowledge of the whole program that the JIT misses. > If the Smalltalk source compiler is made smarter the JIT will still have its own > optimization opportunities and the end result can be much better. > In thisd particular case we would delegate the optimization to the Smalltalk > source compiler, so that the run time cost is zero, although we have maintained > the Smalltalk "API" safety. I still don't understand. Either safety checks are performed or they are not. If they are performed then some cost is incurred, if they are not performed some safety risk is exposed. The compilation process of Smalltalk doesn't change the end-result. Based on comments you make below, I think you want something slightly different than what you're communicating you want. In other words, you would like the compiler optimizations to generate opcodes which can be "uniquely" reverse-compiled or deciphered by the JIT as originally being a #to:do: message rather than a sequence of numeric operations and boolean test opcodes. In SmallScript/QKS-Compilers, such "signatures" for message use are already present for the tools and debuggers to display message use references, etc. I.e., you can query a method for use of #if... or #to:do: and it how and where it was used/optimized. 
However, that still has *no* consistently usable value for the JIT (or an interpreter) in making some preference-based decision about whether to "actually" send a polymorphic #to:do: or perform the inlined code sequence.

You ask why? The answer is that what is really being optimized is *not* the #to:do: message. The optimization that is really being performed (at least in SmallScript and previous QKS compilers) is the inlined optimization of the *block* closure arguments. The inlined versions of a closure can result in dramatically different code from what would be needed if they were not inlined.

Therefore, the only general solution that I think achieves your goal would be to actually generate opcodes for both variants. Then the JIT could choose to generate native machine code for whichever variant was appropriate. Alternatively, the JIT could actually request a recompilation of the method with the optimization flags turned off (again, this is a feature in SmallScript/QKS-Smalltalk compilers). Recompilation assumes that the source for the method is either available or could be reconstituted.

In my images, #to:do: is used in about ~4% of all the methods. Compiling two variants adds unnecessary complication to the compiler architecture and JIT mechanics -- but read on to see my rationale for assessing this whole issue.

Recompilation is very fast (SmallScript compiles roughly a million lines a minute on my dev box) and it would only need to be done on demand for those methods that were not already in the correct #to:do: mode. The SmallScript (and QKS Smalltalk) compilers perform a wide range of optimizations including peephole optimization on the opcodes.

The v4 AOS Platform's JIT is very fast at analyzing and generating optimized native machine code because the v4 opcode-instruction set was designed primarily for this purpose. The VM cache mechanism is already tracking methods, so it could readily detect when/if the standard #to:do: method was specialized somewhere (and thus was not monomorphic/DNR w.r.t. numeric receivers).

Now, having said all that, the next question is what the real value of doing it would be. My answer is that classic Smalltalk does *not* provide selector-namespaces. Therefore, there are many message selectors that one simply can't replace/redefine (like #class) without breaking the frameworks. Selector name collisions (and their often global semantic impact) have little or nothing to do with the compilation issues you've been raising.

I would make the case to you that, given there are some selectors you cannot change/redefine because of the framework design, it is reasonable to just consider the inlined optimized messages as being part of the set of non-redefinable messages. As a developer, you may have to deal at some time or other with selector-name collisions for other (non-vendor) reasons. There is no reason why you can't define your own #customTo:do: methods and use them everywhere you deem appropriate.

CONCLUSION: The change you would like could be made, but shouldn't be. Accepting the need for this change is a slippery slope because it raises the assumption that such changes should be possible for all selectors. That is not within the implementation goals/designs of the Smalltalk language or its standard. Given the general reality of selector namespace collisions, there is no reason to treat this inlining optimization as anything other than one (of many) possible causes of selector-name collisions.
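The #customTo:do: suggestion is simple to carry out; a sketch of such an ordinary, fully late-bound variant (the selector and formatting are only illustrative):

    Number >> customTo: stop do: aBlock
        "An ordinary, overridable counterpart to the inlined #to:do:.
         Because it really is sent, subclasses or wrappers can refine it."
        | index |
        index := self.
        [index <= stop] whileTrue: [
            aBlock value: index.
            index := index + 1]

    "Usage -- dispatches like any other message:"
    1 customTo: 5 do: [:i | Transcript show: i printString; show: ' ']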
There is a very small set of these selectors and the "real" impact/value of preventing the selector-name collision issue is very limited (at best). I.e., I suggest that the best and most realistic answer is to simply define your own selector name and use it as needed. HOW QKS-SMALLTALK HISTORICALLY ADDRESSED THIS GENERAL ISSUE: 1) QKS Smalltalk compilers support annotations. The Mac and Win32 versions have such compiler optimized messages annotated with the #monomorphic attribute. If a method selector was used in a method definition which was annotated as being monomorphic the compiler would generate a error or warning. The error would be generated if the method-definition being compiled was not itself tagged as "monomorphic", otherwise a warning would be generated. 2) Even if messages were optimized away at Smalltalk source compilation time, the compiler would still generate sufficient information to ensure that tools looking for references to such message sends could find and identify them. This included ensuring that the debugger could also identify such inlines to enable it to present them to the user-experience as ordinary message sends (breakpoints, etc). > > > <snip> > > > > >Why *yet more* bytecodes? That is an implementation solution for an > >interpreter. I could just as easily (and is a better approach to) say that > >we can (as needed) just optimize certain messages within the > >JIT/adaptive-compilation engine. > > > > As mentioned above, the optimized bytecodes would be useful not only for the > interpreter but for the JIT as well (plus they would expose more optimization > opportunities for the JIT) I still don't buy this statement. (see my previous comments on compiler's providing sufficient info to the presence of #to:do:, inlining blocks, and tool use of monomorphic method annotations). > > > >> This would not infringe on any > >> safety expectations, because bytecodes are not directly used and there are > >zero > >> gurantees if you do mess with them. > > > >I totally don't understand this comment at all. Special byte-codes > >accomplish nothing unless we're talking about trying to improve the > >performance of an interpreted Smalltalk -- and if we really were worried > >about speed here we don't want an interpreted smalltalk, we want optimized > >jitted (or statically compiled) smalltalk. > > > > I hope this is more clear now. As I mentioned just above, I don't buy the idea. So, unfortunately, it is not clear. > > >The samples I gave generate standard bytecodes for sending the messages > >#basic... > > > >The JIT/adaptive compiler, of the v4 AOS Platform on which SmallScript runs, > >is what makes them special. > > > >> But then , where would one use such bytecodes ? Not in the implementation > >of the > >> do: method on Array because then it would have to take a block closure as > >a > >> parameter. Ideally you would inline the array iteration at the call sites. > >But > >> how can you do that in an untyped polymorphic system ? While trying to > >come up > >> with an answer I remembered that Smalltalk already does something very > >similar > >> with its to:do: looping construct. It simply uses a reserved keyword that > >the > >> compiler recognizes. The rest is history ;-) (it seems I have to use lots > >of > >> smilies in this newsgroup, otherwise somebody will bite my head off) > > > >I suspect I don't really follow your intent with the (quasi-sarcastic) > >comments here. > > > > "The rest is history" was meant as a self-irony. 
The comment about my use of > smilies (and implied previous lack of) was referring to one (unwarranted, I > think) particularly aggressive reply that i got in this thread > > >An implementation can choose inline iteration at a callsite by via compiler > >controlled optimization of standard message forms, or via adaptive JIT > >inlining, or via a linguistic declarative construct (with or without type > >information). > > > >The standard Smalltalk messaging construct #to:do: is only specially > >treated, as a monomorphic declarative expression, by Smalltalk compilers > >that have optimization inlining of that message enabled. For those compilers > >that allow this to be controlled via per-method/class/namespace > >directives/annotations/properties (as in our compilers), or via compiler > >instance flags (as in our compilers for use by tools) this is a non-issue. > > > >You seem to be equating an optimization of special message form as being the > >same thing as an explicit non-message based declarative construct. > > Exactly, thank you for stating it so clearly. The inlined messages are de facto > declarative constructs. They are very different from normal messages, especially > in a reflective and late-binding system such as Smalltalk. If they were normal > messages, the runtime behavior could be drastically altered: maybe somebody puts > some wrappers around them and some pre-methods get executed, maybe somebody > dynmically removes the method from the method dictionary, maybe somebody > overrides it in some subclass. None of this can happen to an optimized message > (that's why we have the optimization opportunity in the first place), because > the Smalltalk source compiler has fixed its behavior at development time, just > like for a declarative construct. > > > > > >They are different; it is conceptually weird/wrong choose to turn off a > >declaration contruct whereas turning off a message optimization is fine. > > Yes, it would be fine, except nobody ever does it, hardly anybody even knows > they could do it. Not true for QKS-Smalltalk compilers. It is a basic part of the compiler switches and is supported by method annotations/directives and class properties. OTOH, it is a documentation issue -- discovering/reading about the switches and understanding why they are relevant. But, as I said above, the QKS-Smalltalk compilers generate error/warnings when a developer attempts to override a monomorphic method. So the developer knows something is up and is not left wholly unaware... QKS-Smalltalk compiler's also generate warnings for errors like #ifTrue:ifTrue: mistakes for the same reason. > > >One > >could even perform an analysis of a program to determine if it contained any > >conflicting definitions of such inlined messages and use that to decide how > >a program should be compiled. > > > > Except the sources have already been compiled to bytecodes when we "accepted" > the methods, so the messages have disspeaared (to:do: has become > indistinguishable from [] whileTrue: []). So such an analysis would need to > recompile the whole universe from sources first. Not in QKS-Smalltalks. > > > >A compiler controlled optimization of a standard Smalltalk message is quite > >a different thing from changing the language itself to provide an explicit > >declarative construct that is not based on the standard messaging syntax. As > >my previous posts SmallScript examples indicate, I have done both these > >things. 
As a result of many years of involvement in Smalltalk language > >design and implementation I have expended a significant amount of time > >considering the questions of why one should make such changes and when they > >are appropriate. > > I truly appreciate your insights. However, I would submit that nobody would even > notice if the Smalltalk vendors removed the possibility to de-optimize these > messages. I disagree, for all the reasons I mentioned above (relating to QKS-Smalltalk versions only). More importantly, in the last 10+ years of my interaction in the Smalltalk space as a language implementor, you're the first person to raise this issue with "needing" to specialize/define-in-a-new-class #to:do:. Which suggests to me that should the vendors go to all the effort to make some special change for #to:do: et cetera, "nobody would even notice" it. There are many other more important areas for the vendors/implementors to focus their efforts on than this esoteric issue for which there are clear selector-name collision solutions. > How would they be distinguishable then from declarative constructs ? > Changing their official status from optimized messages to declarative constructs > would be the most invisible language change in history. I think this is mostly a > psichological barrier. And I further submit to you that, by taking a step back > and looking at these optimized messages and asking ourselves how did we get here > (to have accepted in the Smalltalk language some de facto declarative > constructs) we would see that they are mostly about some basic concepts that do > not map well to the message send paradigm. Now, if we (as a community) ever get > there, it would be a logical step to try to see what other declarative > constructs would belong to this group and try to "complete" the language with > them. Like in the case of the existing optimized mesage sends, that were > naturally accepted, I think a slightly larger group would be accepted as well. > If, together with a perhaps better mapping to the way our mind works we also get > the benefit of significantly optimizing the implementation, then I would say it > is worth it > > <snip very interesting block optimization comments> > > Florin > > |
In article <_QdN6.17210$%[hidden email]>, David Simmons
says... > >> >Either you want the safety check and therefore you'll pay for the range >> >check somewhere; or you don't want the safety check therefore you incur >the >> >memory corruption risks of erroroneous use. This is an old Pascal versus >C >> >argument and casting it as a Smalltalk specific issue is distracting. >> > >> >> It is Smalltalk specific because in Smalltalk we usually have two very >distinct >> compilation passes: one from source to bytecodes, that happens at >development >> time, and one from bytecodes tro native code, that happens at run time. >The >> Smalltalk source compiler has a lot more time and resources available. It >also >> has potentially some global knowledge of the whole program that the JIT >misses. >> If the Smalltalk source compiler is made smarter the JIT will still have >its own >> optimization opportunities and the end result can be much better. >> In thisd particular case we would delegate the optimization to the >Smalltalk >> source compiler, so that the run time cost is zero, although we have >maintained >> the Smalltalk "API" safety. > >I still don't understand. Either safety checks are performed or they are >not. If they are performed then some cost is incurred, if they are not >performed some safety risk is exposed. > >The compilation process of Smalltalk doesn't change the end-result. > >Based on comments you make below, I think you want something slightly >different than what you're communicating you want. > >In other words, you would like the compiler optimizations to generate >opcodes which can be "uniquely" reverse-compiled or deciphered by the JIT as >originally being a #to:do: message rather than a sequence of numeric >operations and boolean test opcodes. > >In SmallScript/QKS-Compilers, such "signatures" for message use are already >present for the tools and debuggers to display message use references, etc. >I.e., you can query a method for use of #if... or #to:do: and it how and >where it was used/optimized. > >However, that still has *no* consistently useable value for the JIT (or an >interpreter) in making some preference based decision about whether to >"actually" send a polymorphic #to:do: or perform the inlined code sequence. > >You ask why? The answer is that what is really being optimized/focused-on >is *not* the #to:do: message. The optimization that is really being >focused-on/performed (at least in SmallScript and previous QKS Compilers) is >the inlined-optimization of the *block* closure arguments. The inlined >versions of a closure can result in dramatically different code from what >would be needed if they were not inlined. > >Therefore, the only general solution that I think achieves your goal would >be to actually generate opcodes for both variants. Then the JIT could choose >to generate native-machine code for whichever variant was appropriate. >Alternatively, the JIT could actually request a recompilation of the method >with the optimization-flags turned off (again this is a feature in >SmallScript/QKS-Smalltalk compilers). Recompilation assumes that the source >for the method is either available or could be reconstituted. Ok, I think I understand where has the miscommunication happened (perhaps you have noticed it too, if you have read my latest post in the original thread). Although I started the thread talking about #to:do: , what I was arguing for was not a different implementation/optimization for #to:do: because, as you say, the most important thing has already happened there: the literal block has been inlined. 
What I want is to achieve a similar thing for a #do: method and its variants: a construct that would apply to any instance of a variable class, unlike the current #do: method, which on the one hand takes at least a copying block (but very often a full one) as an argument, on the other hand uses bounds-checked accesses although it could avoid them entirely, and, to add insult to injury, is only available for subclasses of Collection, although it should be in the same conceptual category (multiplicity) as #at: and #at:put:, which are implemented in Object.
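To make the multiplicity point concrete: any variable-format (indexable) class already inherits the basic indexed protocol from Object, but not the iteration protocol. A small sketch, using the classic class-creation message (its exact keywords vary slightly between dialects):

    Object variableSubclass: #Triple
        instanceVariableNames: ''
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Sketch'.

    | t |
    t := Triple new: 3.
    t basicAt: 1 put: 42.      "indexed access -- inherited from Object"
    t basicSize.                "=> 3"
    t do: [:each | each]        "=> doesNotUnderstand: #do: -- iteration lives in Collection"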
> In my images, #to:do: is used in about ~4% of all the methods. Compiling two
> variants adds unnecessary complication to the compiler architecture and JIT
> mechanics -- but read on to see my rationale for assessing this whole issue.
>
> Recompilation is very fast (SmallScript compiles roughly a million lines a
> minute on my dev box) and it would only need to be done on demand for those
> methods that were not already in the correct #to:do: mode. The
> SmallScript (and QKS Smalltalk) compilers perform a wide range of
> optimizations including peephole optimization on the opcodes.

Well, the VisualWorks compiler could certainly use some peephole optimizations on the bytecodes. Why not also try some global optimizations ?

> The v4 AOS Platform's JIT is very fast at analyzing and generating optimized
> native machine code because the v4 opcode-instruction set was designed
> primarily for this purpose. The VM cache mechanism is already tracking
> methods, so it could readily detect when/if the standard #to:do: method was
> specialized somewhere (and thus not monomorphic/DNR w.r.t. numeric
> receivers).
>
> Now, having said all that, the next question is what is the real value of
> doing it. My answer is that classic-Smalltalk does *not* provide
> selector-namespaces. Therefore, there are many message-selectors that one
> simply can't replace/redefine (like #class) without breaking the frameworks.
> Selector name collisions (and their often global semantic impact) have
> little or nothing to do with compilation issues you've been raising.
>
> I would make the case to you, that given there are some selectors you cannot
> change/redefine because of the framework design, it is reasonable to just
> consider the inlined optimized messages as being part of the set of
> non-redefineable messages.

I disagree here; I think the loop and conditional constructs are different at a much deeper level, so it's not just a name collision. I think many of them are (not by coincidence) already being treated differently, as they should be.

> As a developer, you may have to deal at some time or other with
> selector-name collisions for other (non-vendor) reasons. There is no reason
> why you can't define your own #customTo:do: methods and use them everywhere
> you deem appropriate.
>
> CONCLUSION: The change you would like could be made, but shouldn't be.
> Accepting the need for this change is a slippery slope because it raises the
> assumption that such changes should be possible for all selectors.

I disagree here again. As I said, I think there is a very specific class of messages that needs to be treated differently.

> That is
> not within the implementation goals/designs of the Smalltalk language or its
> standard.

Who cares ? Really, this is an argument I cannot buy. You yourself keep questioning some of the fundamental assumptions, and this is a great thing to do. Nothing is cast in stone, and the standards committee certainly did not exhaust the good ideas about what this language should be.

<snip>

Florin
|
"Florin Mateoc" <[hidden email]> wrote in message
news:3_eN6.8035$[hidden email]... > In article <_QdN6.17210$%[hidden email]>, David Simmons > says... ... snip ... > >I still don't understand. Either safety checks are performed or they are > >not. If they are performed then some cost is incurred, if they are not > >performed some safety risk is exposed. > > > >The compilation process of Smalltalk doesn't change the end-result. > > > >Based on comments you make below, I think you want something slightly > >different than what you're communicating you want. > > > >In other words, you would like the compiler optimizations to generate > >opcodes which can be "uniquely" reverse-compiled or deciphered by the JIT > >originally being a #to:do: message rather than a sequence of numeric > >operations and boolean test opcodes. > > > >In SmallScript/QKS-Compilers, such "signatures" for message use are already > >present for the tools and debuggers to display message use references, etc. > >I.e., you can query a method for use of #if... or #to:do: and it how and > >where it was used/optimized. > > > >However, that still has *no* consistently useable value for the JIT (or an > >interpreter) in making some preference based decision about whether to > >"actually" send a polymorphic #to:do: or perform the inlined code sequence. > > > >You ask why? The answer is that what is really being optimized/focused-on > >is *not* the #to:do: message. The optimization that is really being > >focused-on/performed (at least in SmallScript and previous QKS Compilers) is > >the inlined-optimization of the *block* closure arguments. The inlined > >versions of a closure can result in dramatically different code from what > >would be needed if they were not inlined. > > > >Therefore, the only general solution that I think achieves your goal would > >be to actually generate opcodes for both variants. Then the JIT could choose > >to generate native-machine code for whichever variant was appropriate. > >Alternatively, the JIT could actually request a recompilation of the method > >with the optimization-flags turned off (again this is a feature in > >SmallScript/QKS-Smalltalk compilers). Recompilation assumes that the source > >for the method is either available or could be reconstituted. > > > Ok, I think I understand where has the miscommunication happened (perhaps you > have noticed it too, if you have read my latest post in the original thread). > Although I started the thread talking about #to:do: , what I was arguing for was > not a different implementation/optimization for #to:do: because, as you say, the > most important thing has already happened there: the literal block has been > inlined. What I want is to achieve a similar thing for a #do: method and its > variants, a construct that would apply to any instance of a variable class, > unlike the current #do: method which on one hand takes at least a copying block > (but very often a full one) as an argument, on the other hand it uses > bound-checked acceses although it could avoid them entirely, and to add insult > to injury is only available for subclasses of Collection, although it should be > in the same conceptual category (muliplicity) with #at: and #at:put: that are > implemented in Object. > Ok, I think I begin to understand what you are asking for. I think the answer to what you're asking for lies in two different things: a) The ability to declare a non-capturable block. b) The ability of the JIT to adaptively inline methods. Given those two features the following could be optimized: anObject do: aBlock. 
I think the answer to what you're asking for lies in two different things:

a) The ability to declare a non-capturable block.
b) The ability of the JIT to adaptively inline methods.

Given those two features, the following could be optimized:

    anObject do: aBlock.

a) The <aBlock> would not need a real block or context; it would be a virtual-object (tagged) pointer that referenced an internal subroutine within the block's home context, and thus it too would have been efficiently optimized.

b) The #do: could be inlined by the JIT to produce an efficient iterator over the entries.

NOTE: Given the speed of messaging in a JIT implementation, the performance difference between inlining the #do: and just calling the #do: method is really nominal. Therefore, what you are really asking for is what I mentioned in my first responding post -- a "non-captured" block declaration form.

A given #do: method could be safely coded, as I described with the NRC methods. The NRC mechanism works very efficiently for any collection whose slot count will change during execution of its code sequence. This can be further enhanced as appropriate (in SmallScript and QKS Smalltalks) by annotating the method as being <$self-serialized>, which allows the object implementation to control asynchronous access to the receiver -- which might be useful if asynchronous code could change the object's schema/layout.

---
NET RESULT:
A non-capturable ![...] block declaration form would enable the implementation of high-performance inlined blocks, where such a block object would support the IValuable protocol but would actually be a tagged virtual-object subroutine pointer, and therefore would involve no context or block instantiation. This would achieve your #do: goals.

SmallScript and the v4 AOS Platform already have all the "under-the-hood" pieces for this. The capability is actually used in the compilation of the exception handlers and the declarative "for (each...)" keyword. As my prior post's exception-handler example illustrated, it makes a huge performance difference. (Compare SmallScript's 18ms to, say, VW copying/dirty blocks at 120ms/250ms.)

The only reason I don't have a "non-capturing" declaration form is that, so far, I've not found a syntax I'm happy with. That's a big deal, because I know that once I find a syntax I'll use it all over the place. I've been hoping I would figure out some clever mechanism to infer or JIT-optimize them, or that I would find an acceptable syntax.

The "for (each x in foo) [...]" keyword expression is actually a #do: message:

    foo do: [:x| ...].

It supports a typed variant; the "for (each <Type> x in foo) [...]" keyword expression is:

    foo #Type.do: [:x| ...].

where #Type.do: is a scoped message selector. Thus we can write:

    class name: Blah
    [
        method "general scope" [ do: valuable ... ].
        method scope: SomeTypeA [ do: valuable ... ].
        method scope: SomeTypeB [ do: valuable ... ].
    ].

    "" Invoke the generic #do:
    for (each x in aBlah) ...expressions...

    "OR"
    "" Invoke the SomeTypeA scoped #do:
    for (each <SomeTypeA> x in aBlah) ...expressions...

    "OR"
    "" Invoke the SomeTypeB scoped #do:
    for (each <SomeTypeB> x in aBlah) [...expressions...]

    "OR"
    "" Invoke the SomeTypeB scoped #do:
    aBlah #SomeTypeB.do: [:x| ...].

> > In my images, #to:do: is used in about ~4% of all the methods. Compiling two
> > variants adds unnecessary complication to the compiler architecture and JIT
> > mechanics -- but read on to see my rationale for assessing this whole issue.
> >
> > Recompilation is very fast (SmallScript compiles roughly a million lines a
> > minute on my dev box) and it would only need to be done on demand for those
> > methods that were not already in the correct #to:do: mode.
The > >SmallScript (and QKS Smalltalk) compilers perform a wide range of > >optimizations including peephole optimization on the opcodes. > > > > Well, the Visuaworks compiler could certainly use some peephole optimizations on > the bytecodes. > Why not also try some global optimizations? Because the SmallScript/Smalltalk compiler can't make assumptions about the global state of things -- they're not static because the language is not based on a static type and compilation model. The global optimizations are made by the JIT because it is the only place where all the global state is known and changes are dynamically tracked. It detects when assumptions are changed because it is integrated with the method cache mechanisms and the meta-object facilities. It can then recompile/regenerate method native-code for situations where its global optimization assumptions have been dynamically changed. A classic example of this is the support for read/write barriers on objects. By default the JIT does not generate code for supporting read-write barriers. But, when the first request is made for such services it updates all affected code and thereafter generates support. When the last "live" requester for this service goes away, the JIT regenerates the code (a bit more lazily in case it is being toggled frequently) to not provide such services. Similar rules apply for inlining, monomorphic methods, etc. > > >The v4 AOS Platform's JIT is very fast at analyzing and generating optimized > >native machine code because the v4 opcode-instruction set was designed > >primarily for this purpose. The VM cache mechanism is already tracking > >methods, so it could readily detect when/if the standard #to:do: method was > >specialized somewhere (and thus not monomorphic/DNR w.r.t. numeric > >receivers). > > > >Now, having said all that, the next question is what is the real value of > >doing it. My answer is that classic-Smalltalk does *not* provide > >selector-namespaces. Therefore, there are many message-selectors that one > >simply can't replace/redefine (like #class) without breaking the frameworks. > >Selector name collisions (and their often global semantic impact) have > >little or nothing to do with compilation issues you've been raising. > > > >I would make the case to you, that given there are some selectors you cannot > >change/redefine because of the framework design, it is reasonable to just > >consider the inlined optimized messages as being part of the set of > >non-redefineable messages. > > > > I disagree here; I think the loop and conditional constructs are different at a > much deeper level, so it's not just a name collision, I think many of them are > (not by coincidence) already being treated differently, as they should. Hmm. I agree that control-flow is a fundamental idea. I have long argued this point regarding Smalltalk. It is one of the reasons why SmallScript has explicit control flow expressions. It is also why it was important to have tail calls to ensure that #while... constructs could actually be written in Smalltalk itself. But I disagree with the notion we need to single out the message-name-forms of methods which de-facto provide control-flow within Smalltalk. They are not any more special or different than any other selector-collision (pre-defined for some vendor or non-vendor reason) message. Control flow, the concept is fundamental. Messages that get used or defined to provide control flow are not inherently different from any other message name. 
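As an aside on the tail-call remark above: written as an ordinary message, the classic definition is the recursive one below (BlockClosure stands in for whatever a given dialect calls its block class). Without tail-call elimination the recursion grows the stack on every iteration, which is why most Smalltalks inline #whileTrue: instead of actually sending it.

    BlockClosure >> whileTrue: aBlock
        "Evaluate aBlock repeatedly for as long as evaluating the receiver
         answers true. The recursive send is in tail position, so with tail
         calls it runs in constant stack space."
        ^self value
            ifTrue: [aBlock value. self whileTrue: aBlock]
            ifFalse: [nil]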
Again, I would strongly suggest that the real issue is that of "blocks" and not one of "control-flow" and/or trying to categorize some set of well-defined selector-names as being somehow different from another (vendor or non-vendor) usage of well-defined selector names. The selector-name issue you originally raised is best viewed as a selector-name scoping problem. The #do: performance issue you're now after is a combination of factors that includes range-check optimizations (see my NRC notes), and non-capturable blocks (see my subroutine NCB comments). > > >As a developer, you may have to deal at some time or other with > >selector-name collisions for other (non-vendor) reasons. There is no reason > >why you can't define your own #customTo:do: methods and use them everywhere > >you deeem appropriate. > > > >CONCLUSION: The change you would like could be made, but shouldn't be. > >Accepting the need for this change is a slippery slope because it raises the > >assumption that such changes should be possible for all selectors. > > I disagree here again. As I said, I think there is a very specific class of > messages that needs to be treated differently > > > That is > >not within the implementation goals/designs of the Smalltalk language or its > >standard. > > Who cares ? Really, this is an argument I cannot buy. You yourself keep > questioning some of the fundamental assumptions and this is a great thing to do. > Nothing is cast in stone and the standards committee certainly did not exhaust > the good ideas about what this language should be You're absolutely right that I question the fundamental assumptions. But the basic philosphy of Smalltalk's simplicity/ease-of-use tempered by "pragmatism" still remains -- which was what my comment was driving at. The standard (if not the basic language MOP itself), for good reason, defines certain <Object,etc> messages and their global semantics which in turn makes them effectively mono-morphic and not refineable. It is simply not pragmatic to provide 100% MOP functionality. The performance and/or implementation costs are (until proven otherwise) unnacceptable. -- Dave S. [www.smallscript.net] > > <snip> > > Florin > > |
"David Simmons" <[hidden email]> wrote in message
news:9ZgN6.17360$%[hidden email]... > "Florin Mateoc" <[hidden email]> wrote in message > news:3_eN6.8035$[hidden email]... > > In article <_QdN6.17210$%[hidden email]>, David > Simmons > > says... > > ... snip ... > > > >I still don't understand. Either safety checks are performed or they are > > >not. If they are performed then some cost is incurred, if they are not > > >performed some safety risk is exposed. > > > > > >The compilation process of Smalltalk doesn't change the end-result. > > > > > >Based on comments you make below, I think you want something slightly > > >different than what you're communicating you want. > > > > > >In other words, you would like the compiler optimizations to generate > > >opcodes which can be "uniquely" reverse-compiled or deciphered by the > as > > >originally being a #to:do: message rather than a sequence of numeric > > >operations and boolean test opcodes. > > > > > >In SmallScript/QKS-Compilers, such "signatures" for message use are > already > > >present for the tools and debuggers to display message use references, > etc. > > >I.e., you can query a method for use of #if... or #to:do: and it how and > > >where it was used/optimized. > > > > > >However, that still has *no* consistently useable value for the JIT (or > an > > >interpreter) in making some preference based decision about whether to > > >"actually" send a polymorphic #to:do: or perform the inlined code > sequence. > > > > > >You ask why? The answer is that what is really being > optimized/focused-on > > >is *not* the #to:do: message. The optimization that is really being > > >focused-on/performed (at least in SmallScript and previous QKS > is > > >the inlined-optimization of the *block* closure arguments. The inlined > > >versions of a closure can result in dramatically different code from what > > >would be needed if they were not inlined. > > > > > >Therefore, the only general solution that I think achieves your goal > would > > >be to actually generate opcodes for both variants. Then the JIT could > choose > > >to generate native-machine code for whichever variant was appropriate. > > >Alternatively, the JIT could actually request a recompilation of the > method > > >with the optimization-flags turned off (again this is a feature in > > >SmallScript/QKS-Smalltalk compilers). Recompilation assumes that the > source > > >for the method is either available or could be reconstituted. > > > > > > Ok, I think I understand where has the miscommunication happened > you > > have noticed it too, if you have read my latest post in the original > thread). > > Although I started the thread talking about #to:do: , what I was arguing > for was > > not a different implementation/optimization for #to:do: because, as you > say, the > > most important thing has already happened there: the literal block has > been > > inlined. What I want is to achieve a similar thing for a #do: method and > its > > variants, a construct that would apply to any instance of a variable > class, > > unlike the current #do: method which on one hand takes at least a > block > > (but very often a full one) as an argument, on the other hand it uses > > bound-checked acceses although it could avoid them entirely, and to add > insult > > to injury is only available for subclasses of Collection, although it > should be > > in the same conceptual category (muliplicity) with #at: and #at:put: that > are > > implemented in Object. > > > > Ok, I think I begin to understand what you are asking for. 
> I think the answer to what you're asking for lies in two different things:
>
> a) The ability to declare a non-capturable block.
> b) The ability of the JIT to adaptively inline methods.
>
> Given those two features the following could be optimized:
>
>     anObject do: aBlock.
>
> a) The <aBlock> would not need a real-block or context; it would be a
> virtual-object (tagged) pointer that referenced an internal subroutine
> within the block's home context and thus it too would have been
> optimized.
>
> b) The #do: could be inlined by the JIT to produce an efficient iterator
> over the entries.
>
> NOTE: Given the speed of messaging in a JIT implementation the performance
> difference between inlining the #do: and just calling the #do: method is
> really nominal. Therefore, what you are really asking for is what I
> mentioned in my first responding post -- a "non-captured" block declaration
> form.
>
> A given #do: method could be safely coded, as I described with the NRC
> methods. The NRC mechanism works very efficiently for any collection whose
> slot count will change during execution of its code sequence.
>
> This can be further enhanced as appropriate (in SmallScript and QKS
> Smalltalks) by annotating the method as being <$self-serialized>. Which
> allows the object implementation to control asynchronous access to the
> receiver -- which might be useful if asynchronous code could change the
> object's schema/layout.
> ---
> NET RESULT:
> A non-capturable ![...] block declaration form would enable the
> implementation of high-performance inlined blocks. Where such a block object
> would support the IValuable protocol, but would actually be a tagged
> virtual-object subroutine pointer and therefore would involve no context or
> block instantiation. This would achieve your #do: goals.
>
> SmallScript and the v4 AOS Platform already have all the "under-the-hood"
> pieces for this. The capability is actually used in the compilation of the
> exception handlers and the declarative "for (each...)" keyword. As my prior
> post's exception-handler example illustrated, it makes a huge performance
> difference. (Compare SmallScript's 18ms to, say, VW copying/dirty blocks at
> 120ms/250ms.)
>
> The only reason I don't have a "non-capturing" declaration form is that, so
> far, I've not found a syntax I'm happy with. That's a big deal because I
> know that once I find a syntax I'll use it all over the place. I've been
> hoping I would figure out some clever mechanism to infer or JIT-optimize
> them or I would find an acceptable syntax.

Uff, finally we made some progress in communicating ;-). But the syntax is already here: even if you don't buy the argument that the looping constructs should be declarative, a lot of what I want could be achieved with a "normal" optimized message (like #to:do:) called, let's say, #basicDo:, applicable to variable instances and implemented in Object, as a non-inlined version, like this:

    basicDo: aBlock
        1 to: self basicSize do: [:i | aBlock value: (self basicAt: i)]

This #basicDo: could be inlined together with the literal block at the call site. Instead of typing, we could have variations like #basicBytesDo: ... (#basicDo: would be a synonym for #basicPointersDo:).

What I proposed additionally is that we have special bytecodes to be used in a case like this, for non-checked element accesses (with no access message send at all) - the equivalent of your NRC methods (but I still think it would be more Smalltalk-like if they were bytecodes - look who's defending good old (safe) Smalltalk now! ;-))
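Given that definition, a call site would read like this (a sketch; #basicDo: is of course the proposed method above, not an existing one):

    | bag sum |
    bag := #(3 1 4 1 5).
    sum := 0.
    "Works on any variable instance, Collection or not. With the proposed
     inlining, the block body would be expanded in place and the element
     fetch could skip its range check, since 1 <= i <= basicSize is
     guaranteed by the loop in #basicDo: itself."
    bag basicDo: [:each | sum := sum + each].
    sum   "=> 14"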
;-)) Since I am pretty sure that loops are _the_ major source of non-inlined blocks in a Smalltalk image, this could lead to significant general speedups, plus specialized numeric code (a sore point in Smalltalk) would be especially improved. But things should not stop here ! We should have multi-dimensional variable classes and multi-dimensional iterators as well. And now we begin to really see some more fundamental limitations in how Smalltalk deals with multiplicity. I imagine a solution based on dynamically created multi-variable classes (the first several dimensions could be predefined) <snip> > > Why not also try some global optimizations? > > Because the SmallScript/Smalltalk compiler can't make assumptions about the > global state of things -- they're not static because the language is not > based on a static type and compilation model. > > The global optimizations are made by the JIT because it is the only place > where all the global state is known and changes are dynamically tracked. It > detects when assumptions are changed because it is integrated with the > method cache mechanisms and the meta-object facilities. It can then > recompile/regenerate method native-code for situations where its global > optimization assumptions have been dynamically changed. > > A classic example of this is the support for read/write barriers on objects. > By default the JIT does not generate code for supporting read-write > barriers. But, when the first request is made for such services it updates > all affected code and thereafter generates support. When the last "live" > requester for this service goes away, the JIT regenerates the code (a bit > more lazily in case it is being toggled frequently) to not provide such > services. > > Similar rules apply for inlining, monomorphic methods, etc. > Yes, of course, but a lot of this (not quite as much as for a static language, but a lot) could be achieved at packaging time (as an intermediate step between development time and run time) as a whole program optimization, still at the bytecodes level. Especially that at packaging time the developer can also provide good optimization hints. This would be a portable, optimized form of an application (with de-optimization capablities as well). At run time this would be further optimized when translated to native code. > > > > I disagree here; I think the loop and conditional constructs are different > at a > > much deeper level, so it's not just a name collision, I think many of them > are > > (not by coincidence) already being treated differently, as they should. > > Hmm. I agree that control-flow is a fundamental idea. I have long argued > this point regarding Smalltalk. It is one of the reasons why SmallScript has > explicit control flow expressions. It is also why it was important to have > tail calls to ensure that #while... constructs could actually be written in > Smalltalk itself. > > But I disagree with the notion we need to single out the message-name-forms > of methods which de-facto provide control-flow within Smalltalk. They are > not any more special or different than any other selector-collision > (pre-defined for some vendor or non-vendor reason) message. > > Control flow, the concept is fundamental. Messages that get used or defined > to provide control flow are not inherently different from any other message > name. 
> > Again, I would strongly suggest that the real issue is that of "blocks" and > not one of "control-flow" and/or trying to categorize some set of > well-defined selector-names as being somehow different from another (vendor > or non-vendor) usage of well-defined selector names. > > The selector-name issue you originally raised is best viewed as a > selector-name scoping problem. > > The #do: performance issue you're now after is a combination of factors that > includes range-check optimizations (see my NRC notes), and non-capturable > blocks (see my subroutine NCB comments). > Yes, we mostly agree here. And I will grant you that there is not much to be gained by just re-categorizing these special selectors. But, like I have some doubts about how Smalltalk treats multiplicity I also have serious doubts about how it treats numbers and mathematics in general. And perhaps operators should also appear in the language (this would independently invalidate the absolute generality of message sends). At least you have implemented multi-methods. But I also think that numbers and similarly abstract entities should be essentially identity-less, as opposed to objects representing concrete things, so we should probably have less inteligent objects together with more intelligent ones. But, one step at a time... > > > > >As a developer, you may have to deal at some time or other with > > >selector-name collisions for other (non-vendor) reasons. There is no > reason > > >why you can't define your own #customTo:do: methods and use them > everywhere > > >you deeem appropriate. > > > > > >CONCLUSION: The change you would like could be made, but shouldn't be. > > >Accepting the need for this change is a slippery slope because it > the > > >assumption that such changes should be possible for all selectors. > > > > I disagree here again. As I said, I think there is a very specific class > of > > messages that needs to be treated differently > > > > > That is > > >not within the implementation goals/designs of the Smalltalk language or > its > > >standard. > > > > Who cares ? Really, this is an argument I cannot buy. You yourself keep > > questioning some of the fundamental assumptions and this is a great thing > to do. > > Nothing is cast in stone and the standards committee certainly did not > exhaust > > the good ideas about what this language should be > > You're absolutely right that I question the fundamental assumptions. > > But the basic philosphy of Smalltalk's simplicity/ease-of-use tempered by > "pragmatism" still remains -- which was what my comment was driving at. The > standard (if not the basic language MOP itself), for good reason, defines > certain <Object,etc> messages and their global semantics which in turn makes > them effectively mono-morphic and not refineable. > > It is simply not pragmatic to provide 100% MOP functionality. The > performance and/or implementation costs are (until proven otherwise) > unnacceptable. > Yes, but I strongly feel that if a MOP modification that is philosophically more satisfaying also provides opportunities for serious performance enhancements, we (the Smalltalk comunity) should go for it. Especially that, given the current circumstances, this might be a matter of survival for Smalltalk > -- Dave S. [www.smallscript.net] > Florin |
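For concreteness, here is what the portable, non-inlined fallback Florin is proposing could look like as an ordinary method on Object -- a minimal sketch in plain Smalltalk-80 terms, using only the standard #basicSize/#basicAt: accessors (the #basicDo: selector itself is his proposal; no current dialect ships it):

basicDo: aBlock
	"Portable fallback: iterate the indexed slots of any variable-class
	 instance. Each #basicAt: here is still range-checked; a compiler that
	 recognized #basicDo: could inline this loop, together with a literal
	 block argument, directly at the call site."
	1 to: self basicSize do: [:i |
		aBlock value: (self basicAt: i)]

A call site would then read, for example, #(3 1 4 1 5) basicDo: [:each | Transcript show: each printString; space].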
"Florin Mateoc" <[hidden email]> wrote in message
news:9e4hcc$c64$[hidden email]... > > "David Simmons" <[hidden email]> wrote in message > news:9ZgN6.17360$%[hidden email]... > > "Florin Mateoc" <[hidden email]> wrote in message > > news:3_eN6.8035$[hidden email]... > > > In article <_QdN6.17210$%[hidden email]>, David > > Simmons > > > says... > > ...snip... > > Uff, finally we made some progress in communicating ;-). > > But the syntax is here: > > Even if you don't buy the argument that the looping constructs should be > declarative, a lot of what I want could be achieved with a "normal" > optimized message (like #to:do:) called, let's say, #basicDo:, applicable to > variable instances, implemented in Object like > > 1 to: self basicSize do: [:i | aBlock value: (self basicAt: i)] > > as a non-inlined version. > > This #basicDo: could be inlined together with the literal block at the call > site. Instead of typing we could have variations like basicBytesDo: ... > (#basicDo: would be a synonym for #basicPointersDo:) What I proposed, > additionally is that we have special bytecodes to be used in a case like > this, for non-checked element accesses (with no access message send at > all) - the equivalent of your NRC methods (but I still think it would be > more Smalltalk-like if they were bytecodes - look who's defending good old > (safe) Smalltalk now ! ;-)) Ok. I feel confident that I was on the right track for what you wanted. Optimizing blocks has long been a focus for me and in building the v4 AOS Platform engine I intentionally put in subroutine support with this very goal in mind -- and proved its value to myself with the (almost zero-cost) exception mechanism implementation [I could have made it totally zero cost if I was willing to make the handler crawl more expensive]. BLOCK DEBUGGING SIDEBAR: Eliot's post mentioned something about problems with block debugging and stepping in/out. This is not an issue I'm familiar with. The QKS Smalltalk debugger supports one-shot, repeatable, counter based, and arbitrary block based breakpoints at the individual opcode level and on method enter/exit. The debugger is supplied with full information about contexts and messages by the compiler(s). The debugger has, therefore, always supported step, step-into, step-over, step-out. It also supports writing arbitrary block expressions that will be evaluated as watchpoints at every message-send/return. Those watchpoints can do anything and are responsible to return a boolean to indicate whether a dynamic breakpoint should occur. Finally, the debugger supports all these breakpoints on a per/thread (st-process) basis so you can do things like debug the debugger UI itself just by spawning the debugged version on a separate thread. -- END SIDEBAR -- As I mentioned previously, an adaptive JIT does not require special message names to accomplish your goals. What you really want is that (no-context/objects-created but has full access to local frame information) ![NCB] (non-capturable-block) [it really needs a better name]. The performance difference between a special #basicDo:, and a developer written version using NRC methods is nominal at best. The current SmallScript/v4-AOS JIT generates code that is essentially identical to what a hand coded variant would generate. And, unlike a dedicate primitive, the NRC approach has the flexibility that any developer can write their own hi-perf routines themselves as ordinary methods. When I design these kind of features I'm always looking for general themes an unifying solutions. 
The NRC mechanism is a general pattern that can be used by any developer. I.e., all the basic... accessors have a basicNRC... form that is more aggressively optimized. The ability to declare a block form that can be safely inlined by the compiler/jit but which also provides a OOP that can be passed is another such unifying general theme. I.e., just pass a special tagged virtual-pointer which supports the #value... messages where the pointer is actually the encoded address of a subroutine entry point back into the home context frame. Wah la, very fast direct callback. This is what the v4 AOS Platform EE provides; but it is only used for exception forms and declarative for loops. I want to enable using it anywhere via some new block-declaration syntax. As one sees in Lisp, providing annotations to tell the JIT/Compiler how aggressively vs. safely it can compile a method would also allow for such features. I.e., I could forego the new syntax and instead just allow the developer to write an annotation on the method or block level such as <$optimize=full>. aFoo do: [<$optimize=full> :v| ...]. "OR" method [<$optimize=full;$safety=none> someSelector aFoo do: [:v| ...] ... ]. I.e., real speedup will come from providing ![NCB] declarative form. aFoo do: ![:v| ...]. I just need a syntax that feels good for widespread usage and is relatively self evident. > > Since I am pretty sure that loops are _the_ major source of non-inlined > blocks in a Smalltalk image, this could lead to significant general > speedups, plus specialized numeric code (a sore point in Smalltalk) would be > especially improved. The number of key areas where non-static blocks are used (but where NCB blocks would work just fine) is, in my images at least, quite significant. Any certainly it would really improve any collection/string/stream processing. > > But things should not stop here ! > We should have multi-dimensional variable classes and multi-dimensional > iterators as well. And now we begin to really see some more fundamental > limitations in how Smalltalk deals with multiplicity. I imagine a solution > based on dynamically created multi-variable classes (the first several > dimensions could be predefined) Hmmm. SmallScript has a full set of multi-dimensional and array slice facilities. I don't know if you've seen any of my posts describing one of its new message forms #[...](...). They allow writing code like: [ anObject[a:b][c][d:] := anExpression[a:b][c][d:]. "" or less cryptic/arbitrary |aString| := 'foobar' copy. |s| := aString[`o[^r].*']. ""Where <s> is set to 'ooba'. "" Or the following aString[`o[^r].*'] := 'rontie'. "" Where <aString> is modified in place to 'frontier'. ]. > > > <snip> > > > > > Why not also try some global optimizations? > > > > Because the SmallScript/Smalltalk compiler can't make assumptions about > the > > global state of things -- they're not static because the language is not > > based on a static type and compilation model. > > > > The global optimizations are made by the JIT because it is the only > > where all the global state is known and changes are dynamically tracked. > It > > detects when assumptions are changed because it is integrated with the > > method cache mechanisms and the meta-object facilities. It can then > > recompile/regenerate method native-code for situations where its global > > optimization assumptions have been dynamically changed. > > > > A classic example of this is the support for read/write barriers on > objects. 
> > By default the JIT does not generate code for supporting read-write > > barriers. But, when the first request is made for such services it > > all affected code and thereafter generates support. When the last "live" > > requester for this service goes away, the JIT regenerates the code (a bit > > more lazily in case it is being toggled frequently) to not provide such > > services. > > > > Similar rules apply for inlining, monomorphic methods, etc. > > > > Yes, of course, but a lot of this (not quite as much as for a static > language, but a lot) could be achieved at packaging time (as an intermediate > step between development time and run time) as a whole program optimization, > still at the bytecodes level. I don't see how any of this can be achieved except dynamically at execution time. At packaging time one has no idea of what the clients of these packages will want to do. I.e., what aspect-oriented programming facilities (MOP services) they might require. A classic use of the read/write barriers is to provide distributed object proxies or trivially build object-relational mappings through instances of arbitrary classes that were never designed with that thought in mind. > Especially that at packaging time the > developer can also provide good optimization hints. This would be a > portable, optimized form of an application (with de-optimization capablities > as well). At run time this would be further optimized when translated to > native code. As to the optimization hints, I'm totally with you here. That is one of the key (usage) features of the whole annotation and property system (both of which are fully developer extensible). > > > > > > > I disagree here; I think the loop and conditional constructs are > different > > at a > > > much deeper level, so it's not just a name collision, I think many of > them > > are > > > (not by coincidence) already being treated differently, as they should. > > > > Hmm. I agree that control-flow is a fundamental idea. I have long argued > > this point regarding Smalltalk. It is one of the reasons why SmallScript > has > > explicit control flow expressions. It is also why it was important to have > > tail calls to ensure that #while... constructs could actually be written > in > > Smalltalk itself. > > > > But I disagree with the notion we need to single out the > message-name-forms > > of methods which de-facto provide control-flow within Smalltalk. They are > > not any more special or different than any other selector-collision > > (pre-defined for some vendor or non-vendor reason) message. > > > > Control flow, the concept is fundamental. Messages that get used or > defined > > to provide control flow are not inherently different from any other > message > > name. > > > > Again, I would strongly suggest that the real issue is that of "blocks" > and > > not one of "control-flow" and/or trying to categorize some set of > > well-defined selector-names as being somehow different from another > (vendor > > or non-vendor) usage of well-defined selector names. > > > > The selector-name issue you originally raised is best viewed as a > > selector-name scoping problem. > > > > The #do: performance issue you're now after is a combination of factors > that > > includes range-check optimizations (see my NRC notes), and > > blocks (see my subroutine NCB comments). > > > > Yes, we mostly agree here. And I will grant you that there is not much to be > gained by just re-categorizing these special selectors. 
> But, like I have some doubts about how Smalltalk treats multiplicity I also > have serious doubts about how it treats numbers and mathematics in general. You're making an argument for primitive value types. That radical a change could put us right into the problem space Java has (and to a lesser degree C# for that matter). Using annotations for those numerically intensive/related methods could accomplish the same goal without changing the language. a) SmallScript allows providing type information on variables, expressions, etc. The type information can be suggestive or imperative. I.e., you can hint or you can tell the compiler to trust and assume. b) A given method or block of code can be annotated. The JIT can then use such annotations to make optimization decisions. One such annotation might be (making this up) <$optimize=assume-arithmetic-value-types>. This might tell the compiler to assume that all #(+ - * / ...) messages involved numeric elements. We might also use this to turn off overflow checking. Or tell it to assume/convert all numeric operations to use floats, etc. Combined with the optional type information one can then write methods/expressions that the compiler+jit can optimize just like a statically typed and compiled language. > And perhaps operators should also appear in the language (this would > independently invalidate the absolute generality of message sends). At least > you have implemented multi-methods. But I also think that numbers and > similarly abstract entities should be essentially identity-less, as opposed > to objects representing concrete things, so we should probably have less > intelligent objects together with more intelligent ones. > But, one step at a time... I'm not quite sure what you mean by entity-less. I did propose and partially implement a notion of numeric registers. I.e., classes like FloatRegister, UInt32Register, ... where all operations like #(+ / - ...) overwrite the receiver with the return value. This would enable creating efficient Int32, Int64, Float facilities but would complicate the coding style and therefore is not ideal; so I've let it sit for more thought because I've felt there were other more attractive solutions: a) Use annotations and typing b) Since the AOS Platform is designed to support multiple languages just use one of those other languages (when they're available) to write the numeric methods. > > > > > > > > >As a developer, you may have to deal at some time or other with > > > >selector-name collisions for other (non-vendor) reasons. There is no > > reason > > > >why you can't define your own #customTo:do: methods and use them > > everywhere > > > >you deem appropriate. > > > > > > > >CONCLUSION: The change you would like could be made, but shouldn't be. > > > >Accepting the need for this change is a slippery slope because it > raises > > the > > > >assumption that such changes should be possible for all selectors. > > > > > > I disagree here again. As I said, I think there is a very specific class > > of > > > messages that needs to be treated differently > > > > > > > That is > > > >not within the implementation goals/designs of the Smalltalk language > or > > its > > > >standard. > > > > > > Who cares ? Really, this is an argument I cannot buy. You yourself keep > > > questioning some of the fundamental assumptions and this is a great > thing > > to do.
> > > Nothing is cast in stone and the standards committee certainly did not > > exhaust > > > the good ideas about what this language should be > > > > You're absolutely right that I question the fundamental assumptions. > > > > But the basic philosphy of Smalltalk's simplicity/ease-of-use tempered > > "pragmatism" still remains -- which was what my comment was driving at. > The > > standard (if not the basic language MOP itself), for good reason, defines > > certain <Object,etc> messages and their global semantics which in turn > makes > > them effectively mono-morphic and not refineable. > > > > It is simply not pragmatic to provide 100% MOP functionality. The > > performance and/or implementation costs are (until proven otherwise) > > unnacceptable. > > > > Yes, but I strongly feel that if a MOP modification that is > more satisfaying also provides opportunities for serious performance > enhancements, we (the Smalltalk comunity) should go for it. Especially that, > given the current circumstances, this might be a matter of survival for > Smalltalk I certainly agree with the former (MOP comments). I don't agree with the latter (survival comments). If anything, one of Smalltalk's biggest problems getting widespread attention/usage vis-a-vis other up and coming languages because there is such a big gap between its grammatical, syntactical, conceptual mechanics and those one finds in the predominant/mainstream body of programming languages. Obviously I'm heavily biased here since my belief is part of what fueled the SmallScript design and its syntactic flexibility is directly focused on narrowing/eliminating that gap. -- Dave S. [www.smallscript.net] > > > > -- Dave S. [www.smallscript.net] > > > > Florin > > |
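To make the trade-off behind the numeric-register idea concrete, a summation loop written against such a class might look roughly as follows. This is purely illustrative: FloatRegister and the #asFloat-style conversion are hypothetical names taken from the description above, not classes that exist in any shipping image.

| samples sum |
samples := #(1.5 2.25 3.0).
sum := FloatRegister new.            "hypothetical mutable float cell; assume it starts at 0.0"
samples do: [:each | sum + each].    "with register semantics, #+ overwrites the receiver in place"
sum asFloat                          "copy the accumulated value back out as an ordinary Float"

The statement sum + each, whose answer is deliberately discarded, is exactly the kind of thing that makes the coding style awkward -- which is the reason given above for shelving the idea in favour of annotations and typing.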
"David Simmons" <[hidden email]> wrote in message
news:KdnN6.18311$%[hidden email]... > "Florin Mateoc" <[hidden email]> wrote in message > news:9e4hcc$c64$[hidden email]... > > > > "David Simmons" <[hidden email]> wrote in message > > news:9ZgN6.17360$%[hidden email]... > > > "Florin Mateoc" <[hidden email]> wrote in message > > > news:3_eN6.8035$[hidden email]... > > > > In article <_QdN6.17210$%[hidden email]>, David > > > Simmons > > > > says... ...snip... > > Ok. I feel confident that I was on the right track for what you wanted. Yes, we are in full dialog ;-) > Optimizing blocks has long been a focus for me and in building the v4 AOS > Platform engine I intentionally put in subroutine support with this very > goal in mind -- and proved its value to myself with the (almost zero-cost) > exception mechanism implementation [I could have made it totally zero cost > if I was willing to make the handler crawl more expensive]. > > BLOCK DEBUGGING SIDEBAR: Eliot's post mentioned something about problems > with block debugging and stepping in/out. This is not an issue I'm familiar > with. The QKS Smalltalk debugger supports one-shot, repeatable, counter > based, and arbitrary block based breakpoints at the individual opcode level > and on method enter/exit. The debugger is supplied with full information > about contexts and messages by the compiler(s). The debugger has, therefore, > always supported step, step-into, step-over, step-out. It also supports > writing arbitrary block expressions that will be evaluated as watchpoints at > every message-send/return. Those watchpoints can do anything and are > responsible to return a boolean to indicate whether a dynamic breakpoint > should occur. Finally, the debugger supports all these breakpoints on a > per/thread (st-process) basis so you can do things like debug the debugger > UI itself just by spawning the debugged version on a separate thread. > -- END SIDEBAR -- I just wrote a reply to Eliot. I am not sure that I understand correctly what Eliot meant, though. Anyway, you can debug the debugger in VisualWorks as well. > > As I mentioned previously, an adaptive JIT does not require special message > names to accomplish your goals. What you really want is that > (no-context/objects-created but has full access to local frame information) > ![NCB] (non-capturable-block) [it really needs a better name]. > > The performance difference between a special #basicDo:, and a developer > written version using NRC methods is nominal at best. The current > SmallScript/v4-AOS JIT generates code that is essentially identical to what > a hand coded variant would generate. And, unlike a dedicate primitive, the > NRC approach has the flexibility that any developer can write their own > hi-perf routines themselves as ordinary methods. > Yes, but SmallScript has departed significantly from the Smalltak model. The special #basicDo: would work within the the existing systems as well, with a kind of modifications that are "socially acceptable" in the Smalltalk community: modifications to the compiler, adding a few bytecodes. I agree that performance-wise they would be equivalent, but if they can be done as Smalltalk, why are extensive changes like SmallScript needed ? > > > > But things should not stop here ! > > We should have multi-dimensional variable classes and multi-dimensional > > iterators as well. And now we begin to really see some more fundamental > > limitations in how Smalltalk deals with multiplicity. 
I imagine a solution > > based on dynamically created multi-variable classes (the first several > > dimensions could be predefined) > > Hmmm. SmallScript has a full set of multi-dimensional and array slice > facilities. I don't know if you've seen any of my posts describing one of > its new message forms #[...](...). > > They allow writing code like: > [ > anObject[a:b][c][d:] := anExpression[a:b][c][d:]. > "" or less cryptic/arbitrary > > |aString| := 'foobar' copy. > > |s| := aString[`o[^r].*']. > > ""Where <s> is set to 'ooba'. > > "" Or the following > aString[`o[^r].*'] := 'rontie'. > > "" Where <aString> is modified in place to 'frontier'. > ]. > Nice, and yes, I think that in-place modifications are another important missing piece in Smalltalk. > > > > > > <snip> > > > > Yes, of course, but a lot of this (not quite as much as for a static > > language, but a lot) could be achieved at packaging time (as an > intermediate > > step between development time and run time) as a whole program > optimization, > > still at the bytecodes level. > > I don't see how any of this can be achieved except dynamically at > time. At packaging time one has no idea of what the clients of these > packages will want to do. I.e., what aspect-oriented programming facilities > (MOP services) they might require. A classic use of the read/write barriers > is to provide distributed object proxies or trivially build > object-relational mappings through instances of arbitrary classes that were > never designed with that thought in mind. > But some of these optimization hints can be exactly directed to address the dynamism problem at packaging time: "no dynamic classes" or "no dynamic methods" Such directives would directly apply for a vast majority of the deployed applications, because the dynamism is mostly used in development, not at run time. Even those applications that are truly dynamic would considerably benefit from applying such constraints on chunks of the application that are static. And an advantage (that you ignored in this argument) is that the application thus optimized is still in a portable, bytecode form. Another powerful optimization that is not portable but still applicable at packaging time is to select (deployment-)platform specific implementation classes These optimization directives are generic, they don't require annotations or any change to the language, but still they could account for very significant global optimizations. > > > > Yes, we mostly agree here. And I will grant you that there is not much to > be > > gained by just re-categorizing these special selectors. > > But, like I have some doubts about how Smalltalk treats multiplicity I > also > > have serious doubts about how it treats numbers and mathematics in > general. > > Your making an argument for primitive value types. That radical a change > could put us right into the problem space Java has (and to a lesser degree > C# for that matter). > Not really. I am making an argument for implementation classes, that would only be selected at packaging time as part of the optimization process - the overflow from immediates to LargeIntegers would create different implementation objects depending on this, you could also select what kind of LimitePrecisionReals to use. 
Then, an optimizer that would determine when a value can be modified in-place and when not (and use different methods accordingly) - this would not mean anything when the values are immediates, but it would make a difference when they are not Again, a lot of these things could be achieved without annotations. And for numerics I seriously doubt that anybody changes the arithmetic classes dynamically at run time. > > And perhaps operators should also appear in the language (this would > > independently invalidate the absolute generality of message sends). At > least > > you have implemented multi-methods. But I also think that numbers and > > similarly abstract entities should be essentially identity-less, as > opposed > > to objects representing concrete things, so we should probably have less > > inteligent objects together with more intelligent ones. > > But, one step at a time... > > I'm not quite sure what you mean by entity-less. Something that does not have an identity can be modified in-place for example, like you did >I did propose and partially > implement a notion of numeric registers. I.e., classes like FloatRegister, > UInt32Register, ... where all operations like #(+ / - ...) overwrite the > receiver with the return value. This would enable creating efficient Int32, > Int64, Float facilities but would complicate the coding style and therefore > is not ideal; Yes, this is similar with what I was saying about implementation classes, and no, they don't need to appear in the user code, therefore they don't complicate the coding style. > so I've let it sit for more thought because I've felt there > were other more attractive solutions: > > a) Use annotations and typing > > b) Since the AOS Platform is designed to support multiple languages just use > one of those other languages (when they're available) to write the numeric > methods. > more attractive to whom ? ;-) As a general approach, I think that we should use higher and higher level languages, but that this should not incur any penalties. Whatever I can hand optimize, the machine should be able to do it for me as well. So if I can manually inline source for speed, the machine should find the places where it is safe to do so and do the same transformation. If I can manually optimize the bytecodes, the machine should do that also. And the translation to native code should also generate competitive assembly. > > Yes, but I strongly feel that if a MOP modification that is > philosophically > > more satisfaying also provides opportunities for serious performance > > enhancements, we (the Smalltalk comunity) should go for it. Especially > that, > > given the current circumstances, this might be a matter of survival for > > Smalltalk > > I certainly agree with the former (MOP comments). > > I don't agree with the latter (survival comments). If anything, one of > Smalltalk's biggest problems getting widespread attention/usage vis-a-vis > other up and coming languages because there is such a big gap between its > grammatical, syntactical, conceptual mechanics and those one finds in the > predominant/mainstream body of programming languages. > The survival comment is related to my belief that Smalltalk should be at least as good as Java for any kind of optimization. In reality, the Smalltalk community has lost many good developers to Java, not because Java is better, but for very real work opportunities considerations. 
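The implementation-class switching Florin mentions (the overflow from immediates to LargeIntegers) is easy to see in today's dialects; the snippet below uses the Squeak/VisualWorks spelling of the class and constant names, which vary slightly between implementations:

| x |
x := SmallInteger maxVal.
x class.          "SmallInteger -- an immediate, tagged value"
(x + 1) class.    "LargePositiveInteger -- overflow transparently switches the implementation class"

It is precisely this transparent switch that a packaging-time choice of implementation classes would have to either preserve or deliberately rule out.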
I feel personally offended and challenged if Java is more performant in any area, because Java pisses me off with its static types and primitive types decisions - I think they are bad, premature design-time speed optimizations, not only inconvenient, but unnecessary. I would like to prove this, not only assert it. On the other hand, there is a lot of effort going on to improve Java performance. If we leave this unchallenged, decisions to move from Smalltalk to Java could always be justified with braindead benchmarks. Since I came to North America in 1995, I worked for three consecutive companies that were originally purely Smalltalk shops and they moved to Java. I am at the fourth now and the only reason why our project survives as a Smalltalk project is that it is much too complex (and dynamic) to be migrated - therefore nobody dared to suggest it could be re-written in something else. And the battle is today, for developer mind-share, in the current context. I don't think in 500 years archeologists will look back and say: "wow, Smalltalk was a real gem", therefore we can all relax and keep polishing, because Smalltalk's immortality is guaranteed. > -- Dave S. [www.smallscript.net] Florin
"Florin Mateoc" <[hidden email]> wrote in message
news:9e6f95$l3t$[hidden email]... > ...snip... > > > > Ok. I feel confident that I was on the right track for what you wanted. > > Yes, we are in full dialog ;-) :) ...snip... > > > > BLOCK DEBUGGING SIDEBAR: Eliot's post mentioned something about problems > > with block debugging and stepping in/out. This is not an issue I'm > familiar > > with. The QKS Smalltalk debugger supports one-shot, repeatable, counter > > based, and arbitrary block based breakpoints at the individual opcode > level > > and on method enter/exit. The debugger is supplied with full information > > about contexts and messages by the compiler(s). The debugger has, > therefore, > > always supported step, step-into, step-over, step-out. It also supports > > writing arbitrary block expressions that will be evaluated as > at > > every message-send/return. Those watchpoints can do anything and are > > responsible to return a boolean to indicate whether a dynamic breakpoint > > should occur. Finally, the debugger supports all these breakpoints on a > > per/thread (st-process) basis so you can do things like debug the debugger > > UI itself just by spawning the debugged version on a separate thread. > > -- END SIDEBAR -- > > I just wrote a reply to Eliot. I am not sure that I understand correctly > what Eliot meant, though. > Anyway, you can debug the debugger in VisualWorks as well. I was more thinking about debugging the UI of the debugger itself. But, I'm assuming you're saying that in VW you do things like putting a (persistent) breakpoint on the step function to look at it and then stepping through its code. > > > > > As I mentioned previously, an adaptive JIT does not require special > message > > names to accomplish your goals. What you really want is that > > (no-context/objects-created but has full access to local frame > information) > > ![NCB] (non-capturable-block) [it really needs a better name]. > > > > The performance difference between a special #basicDo:, and a developer > > written version using NRC methods is nominal at best. The current > > SmallScript/v4-AOS JIT generates code that is essentially identical to > what > > a hand coded variant would generate. And, unlike a dedicate primitive, > > NRC approach has the flexibility that any developer can write their own > > hi-perf routines themselves as ordinary methods. > > > > Yes, but SmallScript has departed significantly from the Smalltak model. Hmm. Maybe. But I do not characterize it as a departure since it supports all the smalltalk language and semantics. It is an evolution or extension (i.e., it is smalltalk and behaves like smalltalk but it can do more). > The > special #basicDo: would work within the the existing systems as well, with a > kind of modifications that are "socially acceptable" in the Smalltalk > community: modifications to the compiler, adding a few bytecodes. I don't follow your argument here. You're advocating a new "primitive" method called #basicDo: and saying that is socially acceptable. But on the other hand saying that the new (as you termed them) "primitive" methods for #(basicNRCSlotAt: basicNRCSlotAt:put:) somehow make SmallScript a significant departure from the Smalltalk model? How do you figure that? In either case we're talking about adding some new "primitive" methods and nothing else. > I agree that performance-wise they would be equivalent, but if they can be > done as Smalltalk, why are extensive changes like SmallScript needed ? Something is quite confused here. 
It is true that SmallScript has many features not found in Smalltalk. But, none of those features are relevent to this NRC discussion. In other words, the NRC facilities can be put, today, into any smalltalk without changing the language or using any smallscript syntactic changes. I intentionally designed them that way because there was no reason not to, and I am always eager to see Smalltalk (as a language) get improved. My major point is that with the presence of those "primitive" methods, the need for a "primitive" for #basicDo: is eliminated. More importantly, the NRC methods enable a much broader scope for re-use and application. Any developer can use them to write their own versions of a #basicDo: method [ basicDo: valuable self basicNamedSize+1 to: self basicSlotSize do: [:i| valuable value: (self basicNRCSlotAt: i) ]. ]. > > > > > > > > But things should not stop here ! > > > We should have multi-dimensional variable classes and multi-dimensional > > > iterators as well. And now we begin to really see some more fundamental > > > limitations in how Smalltalk deals with multiplicity. I imagine a > solution > > > based on dynamically created multi-variable classes (the first several > > > dimensions could be predefined) > > > > Hmmm. SmallScript has a full set of multi-dimensional and array slice > > facilities. I don't know if you've seen any of my posts describing one of > > its new message forms #[...](...). > > > > They allow writing code like: > > [ > > anObject[a:b][c][d:] := anExpression[a:b][c][d:]. > > "" or less cryptic/arbitrary > > > > |aString| := 'foobar' copy. > > > > |s| := aString[`o[^r].*']. > > > > ""Where <s> is set to 'ooba'. > > > > "" Or the following > > aString[`o[^r].*'] := 'rontie'. > > > > "" Where <aString> is modified in place to 'frontier'. > > ]. > > > > Nice, and yes, I think that in-place modifications are another important > missing piece in Smalltalk. They are not missing in all Smalltalks. (Q: What are your criteria for determining what Smalltalk supports or not?) NOTE: SmallScript is an evolutionary extension to Smalltalk and its design and from scratch development was begun by me in 1998. But, QKS Smalltalk (which is and has been since 1991 a pure smalltalk) has always allowed in place modifications. It all depends on what the underlying object model for the given smalltalk implementation is. Our object model also allows objects to be resized and restructured very efficiently. This is an integrated part of the garbage collector architecture just like support for auto-pinned and explicitly pinning objects; and managed string/character storage as NULL terminated so it can be passed in FFI calls to external code (like C/C++) without marshalling. QKS Smalltalk has always supported read/write barriers and pre-bind delegation. It has supported asychronous hardware interrupts and timers. It has always been natively multi-threaded. It has always had namespaces. It has always supported real-time package generation and loading with schema migration support and refactoring facilities. So those features could be characterized as having been Smalltalk features since ~1992; they just have not been available in every dialect. Having those features in the virtual machine (implementation) X does not make it "not-smalltalk". What is probably fair to say is that "not all smalltalks" have XYZ feature. Which, as an example, is still true today if you talk about proper/full block closure semantics. 
I.e., There are still a couple major dialects which do not support full block closure semantics. Therefore, one could "fairly" say that Smalltalk does not support closures. But I think most smalltalkers (who know what closures are) would say that Smalltalk does have closure support. > > > > > > > > > > <snip> > > > > > > Yes, of course, but a lot of this (not quite as much as for a static > > > language, but a lot) could be achieved at packaging time (as an > > intermediate > > > step between development time and run time) as a whole program > > optimization, > > > still at the bytecodes level. > > > > I don't see how any of this can be achieved except dynamically at > execution > > time. At packaging time one has no idea of what the clients of these > > packages will want to do. I.e., what aspect-oriented programming > facilities > > (MOP services) they might require. A classic use of the read/write > barriers > > is to provide distributed object proxies or trivially build > > object-relational mappings through instances of arbitrary classes that > were > > never designed with that thought in mind. > > > > But some of these optimization hints can be exactly directed to address > dynamism problem at packaging time: "no dynamic classes" or "no dynamic > methods" Yes, but it is not necessary and provides little or no value in a JIT world. After all, the JIT has to look at the whole of the running system (not just what some subset of the possible set of all active packages "claim"). The term "claim" is critical here, but a "safe" system with some measure of "security" has to perform its own validation independent of the package claims. > Such directives would directly apply for a vast majority of the deployed > applications, because the dynamism is mostly used in development, not at run > time. Not true at all. It is true that explicit developer intended dynamism is not common today (and is technically challenging because it is so poorly supported in most languages and runtime systems). But, the moment you look at component architectures, schema migration issues, versioning, and complex systems composed of components the rules all change. In that (highly prevalent and increasing demanded) kind of environment the actual execution engine and software management facilities *do* have to support dynamic characteristics to get their job done. Amazing though it may seem on the surface of it, the recognition and use of dynamic facilities is a central/fundamental part of the Microsoft.NET architecture. But upon reflection of Microsoft's experience with "DLL Hell" and managing COM and related integration of system elements it is not surprising at all. The point is that there is limited value in providing such information in the package. The primary value of such information would be for *human* reference and that might be misleading if the code actually does perform such features/things but some human erroneously labelled the package to say it didn't. It is a *bad* implementation constraint [that has no benefit] to actually have the IL (intermediate language) code in the package be generated such that it *cannot* be used in a dynamically modified environment. > Even those applications that are truly dynamic would considerably benefit > from applying such constraints on chunks of the application that are static. > And an advantage (that you ignored in this argument) is that the application > thus optimized is still in a portable, bytecode form. No I don't think I missed the argument, quite the opposite. 
I understand the argument but recognize there is a broader and more fundamental view. In a jitted world, bytecode/intermediate-language forms serve a different purpose and role than in an interpretted world. The actual final/static compilation portion of the code generation process is performed by the JIT not the source compiler. The IL form should capture all the necessary contextual and semantic information to enable the JIT to fulfill its role in managing static compilation concerns. It is wrong for the source compiler to handle that area. As a thought experiment, consider the notion of <pointer/int> representation. If a source compiler assumes that an object model and an opcode set represents pointers as ints that are 32-bit little-endian forms, it could generate *wrong* code in many circumstances for running on a platform with 64-bit big-endian forms. There are many, many other issues like this where the final decision about the end-code result should not be determined until execution time. As processors get more and more sophisticated (witness IA64 instruction set) it is increasingly important (to benefit from the new designs) that many decisions get left to the JIT based on its analysis of IL. > Another powerful optimization that is not portable but still applicable at > packaging time is to select (deployment-)platform specific implementation > classes Sure, this is provided for in both the SmallScript's system and in Microsoft.NET's architecture. But the goal is to keep special case version management to a minimum so that it is only used where it adds unique value. In general, platform/deployment/version specific code is a maintenance/installation nightmare (hence the Microsoft term "DLL Hell"). > > These optimization directives are generic, they don't require annotations or > any change to the language, but still they could account for very > significant global optimizations. Your arguments regarding global optimization as a principle are right on. My point is that the right place for such optimization is in the IL consumer portion of the execution/deployment part of a modern sofware platform. I.e., if a sophisticated JIT system is present, it is the IL consumer that should make such decisions. If such a system might not be present (like in embedded or small devices) then the IL form should be useable by a simple jit or a sophisticated jit should pre-process the IL to a suitable IL or native form. But in no event should the source compiler eliminate information that would be valuable to retain in the IL because the source compiler's role in such a IL based JIT system is different from a classic source to native code compiler. The thing to remember is that IL is the language neutral representation of a program. There may be many source languages all of which get mapped to IL. So only the IL consumer can perform adequate optimization of the final integrated/active IL based elements. IL -- Intermediate Language -- Opcodes+ObjectModelInfo. > > > > > > > Yes, we mostly agree here. And I will grant you that there is not much > to > > be > > > gained by just re-categorizing these special selectors. > > > But, like I have some doubts about how Smalltalk treats multiplicity I > > also > > > have serious doubts about how it treats numbers and mathematics in > > general. > > > > Your making an argument for primitive value types. That radical a change > > could put us right into the problem space Java has (and to a lesser > > C# for that matter). > > > > Not really. 
I am making an argument for implementation classes, that would > only be selected at packaging time as part of the optimization process - the > overflow from immediates to LargeIntegers would create different > implementation objects depending on this, you could also select what kind of > LimitedPrecisionReals to use. > Then, an optimizer that would determine when a value can be modified > in-place and when not (and use different methods accordingly) - this would > not mean anything when the values are immediates, but it would make a > difference when they are not > Again, a lot of these things could be achieved without annotations. And for > numerics I seriously doubt that anybody changes the arithmetic classes > dynamically at run time. I don't follow the rationale. You're suggesting a solution based on "annotating" packages as being the way to go, but "annotating" methods/classes/namespaces/projects as not being appropriate. If so, then we are arguing over the question of granularity of control. Keep in mind that in SmallScript and in QKS Smalltalk, packages can have their own "agent" behavior (methods and obviously state). A package contains an arbitrary collection of objects (including [recursively] other packages) and data (or [recursively] entire databases) which also includes dependency information and manifests of applicability (version info, signatures, etc). SmalltalkAgents (~1991 QKS Smalltalk system) was based on the idea of creating packages (agents) that could migrate in real time across environments. Having had experience working with the capabilities of such packages for a long time made it clear to me that an additional (more structured/formal design/deployment) unit was required. As a result, in SmallScript, I added the first-class notion of <Modules> to the language. First, a module is a class, not an arbitrary set of objects and data with some manifest and active code. A module can contain all the things that packages contain (including packages), but its primary purpose is to provide a structured unit for managing the sharing and packaging of design and related declarative elements. Because it has more specific and focused goals it provides a richer architecture for handling development and deployment related activities. > > > > > And perhaps operators should also appear in the language (this would > > > independently invalidate the absolute generality of message sends). At > > least > > > you have implemented multi-methods. But I also think that numbers and > > > similarly abstract entities should be essentially identity-less, as > > opposed > > > to objects representing concrete things, so we should probably have less > > > intelligent objects together with more intelligent ones. > > > But, one step at a time... > > > > I'm not quite sure what you mean by entity-less. > > Something that does not have an identity can be modified in-place for > example, like you did. I think you're describing a value-type. I.e., something for which there is no unique pointer/id which can be shared by multiple entities (and which is [typically] not self describing). While not particularly germane, it is worth noting that: pragmatically, as virtual objects, this is how SmallIntegers, Characters, and WeakId objects are handled in SmallScript/QKS-Smalltalk. > > >I did propose and partially > > implement a notion of numeric registers. I.e., classes like FloatRegister, > > UInt32Register, ... where all operations like #(+ / - ...) overwrite the > > receiver with the return value.
This would enable creating efficient > Int32, > > Int64, Float facilities but would complicate the coding style and > therefore > > is not ideal; > > Yes, this is similar to what I was saying about implementation classes, > and no, they don't need to appear in the user code, therefore they don't > complicate the coding style. > > > so I've let it sit for more thought because I've felt there > > were other more attractive solutions: > > > > a) Use annotations and typing > > > > b) Since the AOS Platform is designed to support multiple languages just > use > > one of those other languages (when they're available) to write the > > methods. > > > > more attractive to whom ? ;-) Me at the least. But also I believe that there is some relatively large body of developers who would share my view. But rather than focus on that question, let me try to refocus this issue. a) Smalltalk assumes a numeric model of multi-precision arithmetic for the classes it assigns to numeric literal declarations. b) As a result, an expression like (10 * a) requires multi-precision arithmetic processing. c) If we want to change those rules to assume, say, int64 processing, or 64/80 bit float processing, then we are looking at a very different numeric model and set of rules. d) In languages with required typing of variables/expressions and numeric operators, we don't have this problem. The 100% presence of static type information enables full disambiguation of this situation. e) But in Smalltalk, we do not (100% of the time) have the information to ascertain statically what is intended. So, either we add optional typing/annotations to source or we externally control the decision by enabling "modes" of compilation for source via compiler flags that tools/humans can use. f) In designing my compilers I've provided for both models by building the mechanics to support both on a common compiler architecture. But I have not actually implemented full static typing analysis and optimization features for numerics. g) The v4 AOS Platform (which is SmallScript's reference platform) is designed to support multiple languages on a common IL (opcode) system and execution engine. The Microsoft.NET platform is designed to support multiple languages on a common IL (opcode) system and execution engine. The Microsoft platform has many different characteristics and market presence factors. Given the wide range of languages that are already well suited to highly optimized static type processing and numerics facilities, it is not unreasonable to consider using those languages to write numeric code. After all, there are *no* complexity, performance-penalty, or interop issues when they are both generating common IL and running on a common (runtime/object-model) execution engine. The tools for such environments, specifically thinking of VisualStudio.NET, are designed to host and support these multiple languages and allow fluid debugging of a mixed language solution. > As a general approach, I think that we should use higher and higher level > languages, but that this should not incur any penalties. Whatever I can hand > optimize, the machine should be able to do it for me as well. So if I can > manually inline source for speed, the machine should find the places where > it is safe to do so and do the same transformation. If I can manually > optimize the bytecodes, the machine should do that also. And the translation > to native code should also generate competitive assembly.
> > > > Yes, but I strongly feel that if a MOP modification that is > > philosophically > > > more satisfaying also provides opportunities for serious performance > > > enhancements, we (the Smalltalk comunity) should go for it. Especially > > that, > > > given the current circumstances, this might be a matter of survival for > > > Smalltalk > > > > I certainly agree with the former (MOP comments). > > > > I don't agree with the latter (survival comments). If anything, one of > > Smalltalk's biggest problems getting widespread attention/usage vis-a-vis > > other up and coming languages because there is such a big gap between its > > grammatical, syntactical, conceptual mechanics and those one finds in the > > predominant/mainstream body of programming languages. > > > > The survival comment is related to my belief that Smalltalk should be at > least as good as Java for any kind of optimization. In reality, the > Smalltalk community has lost many good developers to Java, not because Java > is better, but for very real work opportunities considerations. > I feel personally offended and challenged if Java is more performant in any > area, because Java pisses me off with its static types and primitive types > decisions - I think they are bad, premature design-time speed optimizations, > not only inconvenient, but unnecessary. I would like to prove this, not only > assert it. > > On the other hand, there is a lot of effort going on to improve Java > performance. If we let this unchallenged, the moving from Smalltalk to Java > decisions could always be justified with braindead benchmarks. Since I came > to North America in 1995, I worked for three consecutive companies that were > originally purely Smalltalk shops and they moved to Java. I am at the forth > now and the only reason why our project survives as a Smalltalk project is > that it is much too complex (and dynamic) to be migrated - therefore nobody > dared to suggest it could be re-written in something else. > And the battle is today, for developer mind-share in the current context. I > don't think in 500 years archeologists will look back and say: "wow, > Smalltalk was a real gem", therefore we can all relax and keep polishing, > because Smalltalk's immortality is guaranteed. Well that is a very different view than I take on Smalltalk or Java. I try to be pragmatic about these things and just look at numbers and mindshare. I think Java is "loosely put" irrelevant and not the real target or issue at all. The issue is what languages are in the mainstream with enough development being done to provide healthy, sustained, growth. Of those, which are the most vibrant or successful and why? I look at the many millions of developers using the Microsoft VisualStudio family and their languages. I look at the many millions of users of moderately new languages like Perl, PHP, Python, JavaScript, etc. I look at the 10's or 100's of thousands of users of more seemingly obscure (but relatively new) scripting languages. I look at the level of financial interest and technical interest in these languages and try to identify the characteristics that are breathing life and vitality into them. None of them are Java, and all of them are doing incredibly well and if Smalltalk were in a similar position Java would not "loosely speaking" be relevant. Furthermore, if Smalltalk had "whatever it takes" to play in their leagues, the other characteristics of Smalltalk would enable it to be a genuine (business) competitor for Java's marketspace. 
That's the relevance of "SmallScript" and my efforts in developing it to take Smalltalk to a new level. -- Dave S. [www.smallscript.net] > > > -- Dave S. [www.smallscript.net] > > Florin > > |
In reply to this post by David Simmons
"David Simmons" <[hidden email]> wrote in message
news:KdnN6.18311$%[hidden email]... > "Florin Mateoc" <[hidden email]> wrote in message > news:9e4hcc$c64$[hidden email]... > > > > "David Simmons" <[hidden email]> wrote in message > > news:9ZgN6.17360$%[hidden email]... > > > "Florin Mateoc" <[hidden email]> wrote in message > > > news:3_eN6.8035$[hidden email]... > > > > In article <_QdN6.17210$%[hidden email]>, David > > > Simmons > > > > says... ...snip... > > Ok. I feel confident that I was on the right track for what you wanted. Yes, we are in full dialog ;-) > Optimizing blocks has long been a focus for me and in building the v4 AOS > Platform engine I intentionally put in subroutine support with this very > goal in mind -- and proved its value to myself with the (almost zero-cost) > exception mechanism implementation [I could have made it totally zero cost > if I was willing to make the handler crawl more expensive]. > > BLOCK DEBUGGING SIDEBAR: Eliot's post mentioned something about problems > with block debugging and stepping in/out. This is not an issue I'm familiar > with. The QKS Smalltalk debugger supports one-shot, repeatable, counter > based, and arbitrary block based breakpoints at the individual opcode level > and on method enter/exit. The debugger is supplied with full information > about contexts and messages by the compiler(s). The debugger has, therefore, > always supported step, step-into, step-over, step-out. It also supports > writing arbitrary block expressions that will be evaluated as watchpoints at > every message-send/return. Those watchpoints can do anything and are > responsible to return a boolean to indicate whether a dynamic breakpoint > should occur. Finally, the debugger supports all these breakpoints on a > per/thread (st-process) basis so you can do things like debug the debugger > UI itself just by spawning the debugged version on a separate thread. > -- END SIDEBAR -- I just wrote a reply to Eliot. I am not sure that I understand correctly what Eliot meant, though. Anyway, you can debug the debugger in VisualWorks as well. > > As I mentioned previously, an adaptive JIT does not require special message > names to accomplish your goals. What you really want is that > (no-context/objects-created but has full access to local frame information) > ![NCB] (non-capturable-block) [it really needs a better name]. > > The performance difference between a special #basicDo:, and a developer > written version using NRC methods is nominal at best. The current > SmallScript/v4-AOS JIT generates code that is essentially identical to what > a hand coded variant would generate. And, unlike a dedicate primitive, the > NRC approach has the flexibility that any developer can write their own > hi-perf routines themselves as ordinary methods. > Yes, but SmallScript has departed significantly from the Smalltak model. The special #basicDo: would work within the the existing systems as well, with a kind of modifications that are "socially acceptable" in the Smalltalk community: modifications to the compiler, adding a few bytecodes. I agree that performance-wise they would be equivalent, but if they can be done as Smalltalk, why are extensive changes like SmallScript needed ? > > > > But things should not stop here ! > > We should have multi-dimensional variable classes and multi-dimensional > > iterators as well. And now we begin to really see some more fundamental > > limitations in how Smalltalk deals with multiplicity. 
I imagine a solution > > based on dynamically created multi-variable classes (the first several > > dimensions could be predefined) > > Hmmm. SmallScript has a full set of multi-dimensional and array slice > facilities. I don't know if you've seen any of my posts describing one of > its new message forms #[...](...). > > They allow writing code like: > [ > anObject[a:b][c][d:] := anExpression[a:b][c][d:]. > "" or less cryptic/arbitrary > > |aString| := 'foobar' copy. > > |s| := aString[`o[^r].*']. > > ""Where <s> is set to 'ooba'. > > "" Or the following > aString[`o[^r].*'] := 'rontie'. > > "" Where <aString> is modified in place to 'frontier'. > ]. > Nice, and yes, I think that in-place modifications are another important missing piece in Smalltalk. > > > > > > <snip> > > > > Yes, of course, but a lot of this (not quite as much as for a static > > language, but a lot) could be achieved at packaging time (as an > intermediate > > step between development time and run time) as a whole program > optimization, > > still at the bytecodes level. > > I don't see how any of this can be achieved except dynamically at > time. At packaging time one has no idea of what the clients of these > packages will want to do. I.e., what aspect-oriented programming facilities > (MOP services) they might require. A classic use of the read/write barriers > is to provide distributed object proxies or trivially build > object-relational mappings through instances of arbitrary classes that were > never designed with that thought in mind. > But some of these optimization hints can be exactly directed to address the dynamism problem at packaging time: "no dynamic classes" or "no dynamic methods" Such directives would directly apply for a vast majority of the deployed applications, because the dynamism is mostly used in development, not at run time. Even those applications that are truly dynamic would considerably benefit from applying such constraints on chunks of the application that are static. And an advantage (that you ignored in this argument) is that the application thus optimized is still in a portable, bytecode form. Another powerful optimization that is not portable but still applicable at packaging time is to select (deployment-)platform specific implementation classes These optimization directives are generic, they don't require annotations or any change to the language, but still they could account for very significant global optimizations. > > > > Yes, we mostly agree here. And I will grant you that there is not much to > be > > gained by just re-categorizing these special selectors. > > But, like I have some doubts about how Smalltalk treats multiplicity I > also > > have serious doubts about how it treats numbers and mathematics in > general. > > Your making an argument for primitive value types. That radical a change > could put us right into the problem space Java has (and to a lesser degree > C# for that matter). > Not really. I am making an argument for implementation classes, that would only be selected at packaging time as part of the optimization process - the overflow from immediates to LargeIntegers would create different implementation objects depending on this, you could also select what kind of LimitePrecisionReals to use. 
Then, an optimizer that would determine when a value can be modified in-place and when not (and use different methods accordingly) - this would not mean anything when the values are immediates, but it would make a difference when they are not Again, a lot of these things could be achieved without annotations. And for numerics I seriously doubt that anybody changes the arithmetic classes dynamically at run time. > > And perhaps operators should also appear in the language (this would > > independently invalidate the absolute generality of message sends). At > least > > you have implemented multi-methods. But I also think that numbers and > > similarly abstract entities should be essentially identity-less, as > opposed > > to objects representing concrete things, so we should probably have less > > inteligent objects together with more intelligent ones. > > But, one step at a time... > > I'm not quite sure what you mean by entity-less. Something that does not have an identity can be modified in-place for example, like you did >I did propose and partially > implement a notion of numeric registers. I.e., classes like FloatRegister, > UInt32Register, ... where all operations like #(+ / - ...) overwrite the > receiver with the return value. This would enable creating efficient Int32, > Int64, Float facilities but would complicate the coding style and therefore > is not ideal; Yes, this is similar with what I was saying about implementation classes, and no, they don't need to appear in the user code, therefore they don't complicate the coding style. > so I've let it sit for more thought because I've felt there > were other more attractive solutions: > > a) Use annotations and typing > > b) Since the AOS Platform is designed to support multiple languages just use > one of those other languages (when they're available) to write the numeric > methods. > more attractive to whom ? ;-) As a general approach, I think that we should use higher and higher level languages, but that this should not incur any penalties. Whatever I can hand optimize, the machine should be able to do it for me as well. So if I can manually inline source for speed, the machine should find the places where it is safe to do so and do the same transformation. If I can manually optimize the bytecodes, the machine should do that also. And the translation to native code should also generate competitive assembly. > > Yes, but I strongly feel that if a MOP modification that is > philosophically > > more satisfaying also provides opportunities for serious performance > > enhancements, we (the Smalltalk comunity) should go for it. Especially > that, > > given the current circumstances, this might be a matter of survival for > > Smalltalk > > I certainly agree with the former (MOP comments). > > I don't agree with the latter (survival comments). If anything, one of > Smalltalk's biggest problems getting widespread attention/usage vis-a-vis > other up and coming languages because there is such a big gap between its > grammatical, syntactical, conceptual mechanics and those one finds in the > predominant/mainstream body of programming languages. > The survival comment is related to my belief that Smalltalk should be at least as good as Java for any kind of optimization. In reality, the Smalltalk community has lost many good developers to Java, not because Java is better, but for very real work opportunities considerations. 
I feel personally offended and challenged if Java is more performant in any area, because Java pisses me off with its static types and primitive types decisions - I think they are bad, premature design-time speed optimizations, not only inconvenient, but unnecessary. I would like to prove this, not only assert it. On the other hand, there is a lot of effort going on to improve Java performance. If we leave this unchallenged, the decisions to move from Smalltalk to Java could always be justified with braindead benchmarks. Since I came to North America in 1995, I worked for three consecutive companies that were originally purely Smalltalk shops and they moved to Java. I am at the fourth now and the only reason why our project survives as a Smalltalk project is that it is much too complex (and dynamic) to be migrated - therefore nobody dared to suggest it could be re-written in something else. And the battle is today, for developer mind-share in the current context. I don't think in 500 years archeologists will look back and say: "wow, Smalltalk was a real gem", therefore we can all relax and keep polishing, because Smalltalk's immortality is guaranteed. > -- Dave S. [www.smallscript.net] Florin |
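Picking up the #basicDo: suggestion from earlier in this post, here is a minimal sketch of the fallback definition such a selector could be given in plain Smalltalk (the selector itself is the proposal under discussion, not an existing library method). The point is that a compiler or JIT is free to recognize the selector and inline the loop -- verifying the receiver's size once and skipping the per-element range check -- exactly as most compilers already inline #to:do:, while an unmodified compiler still gets correct behavior from the ordinary method:

    "SequenceableCollection>>basicDo: -- hypothetical selector"
    basicDo: aBlock
        "Fallback: identical to #do:.  An inlining compiler would hoist
         the bounds check out of the loop instead of paying it per element."
        1 to: self size do: [:i | aBlock value: (self at: i)]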
In reply to this post by David Simmons
[hidden email] (David Simmons) wrote (abridged):
> I.e., real speedup will come from providing ![NCB] declarative form. > > aFoo do: ![:v| ...]. > > I just need a syntax that feels good for widespread usage and is > relatively self evident. It sounds like there are two sides to this: non-capturable blocks, and methods which do not capture their block arguments. We can annotate the blocks or we can annotate the methods. For some methods, we can tell by inspecting the implementation that they don't capture their arguments. We can use type-inference to propagate that information around. So if we go the method route, we can potentially optimise without needing (user) annotations at all. Presumably if we go the annotated-block route, there is a risk someone will provide an NCB block to a routine which does, in fact, attempt to capture it. That would be a runtime error. It sounds like the method route is better on both counts, and also compatible with standard Smalltalk. Is it not feasible in practice? Dave Harris, Nottingham, UK | "Weave a circle round him thrice, [hidden email] | And close your eyes with holy dread, | For he on honey dew hath fed http://www.bhresearch.co.uk/ | And drunk the milk of Paradise." |
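For what it is worth, the method-side distinction is easy to see in code. A minimal sketch (the second selector and its instance variable are invented for illustration):

    do: aBlock
        "Does not capture: aBlock is only evaluated, never stored, so it
         cannot outlive this activation."
        1 to: self size do: [:i | aBlock value: (self at: i)]

    whenChangedDo: aBlock
        "Captures: aBlock is stored for later evaluation, so a
         non-capturable block must not be passed here."
        changeActions add: aBlock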
In reply to this post by David Simmons
"David Simmons" <[hidden email]> wrote in message
news:KsBN6.19423$%[hidden email]... > "Florin Mateoc" <[hidden email]> wrote in message > news:9e6f95$l3t$[hidden email]... > > > ...snip... > > > > > > Ok. I feel confident that I was on the right track for what you wanted. > > > > Yes, we are in full dialog ;-) > > :) > More and more I find that we are thinking about exactly the same kind of capabilities and optimizations offered by the ensemble virtualMachine+JIT+packager+smalltalkCompiler (I was going to mention object resizing and restructuring, but you did it before I had the chance ;-) Significant differences remain in where to assign the responsibilities: - on one hand, I would like to modify (or enhance) Smalltalk (as visible to the developer) as little as possible, to keep Smalltalk development simple, and keep the optimizations under the hood. I believe some visible enhancements are needed, but you have promoted a lot of the optimization machinery at source code level, basically forcing the developer to make a lot of early optimization decisions and write non-generic code, which is bad in general. - on the other hand, we have switched positions when it comes to packaging versus JIT time optimizations. You would like to defer them all to the JIT, which has advantages - more adaptive and potentially more robust, as you have described. ROBUSTNESS SIDEBAR: I think accidental dynamic behavior is not desirable and from my point of view, as a developer, I would like to be notified (even with a walkback) that my assumptions have been invalidated. This would be a bug in the sense that I did not understand the system's behavior and I better fix it. -- END SIDEBAR -- While it is true that ideally optimizations are best deferred as much as possible, there are some real resources and time constraints. A packager/type inferencer/optimizer has relatively infinite time and resources to do whole program optimizations. A JIT has to do it on the fly. The whole hot-spot idea relies exactly on the JIT's superficiality, doing some of the more sophisticated optimizations only where it knows it counts most, as a complement (because even the hot-spot has the same kind of run time constraints like the JIT). Well, the packager does not have any of these hard constraints, I think this is a very strong argument for pre-computing the optimizations. Some of them would be simply impossible at run time. ..snip... > But, I'm assuming you're saying that in VW you do things like putting a > (persistent) breakpoint on the step function to look at it and then stepping > through its code. > Yes, that's about it, except Terry's debugger has non-persistent breakpoints as well > > Yes, but SmallScript has departed significantly from the Smalltak model. > > Hmm. Maybe. But I do not characterize it as a departure since it supports > all the smalltalk language and semantics. It is an evolution or extension > (i.e., it is smalltalk and behaves like smalltalk but it can do more). > I very much agree with what Chris Uppal has said about this. Furthermore, I believe that Smalltalk could be "fixed" as well as SmallScript does it but in a much more incremental manner, with most of the changes happening under the hood. ..snip.. > I don't follow your argument here. You're advocating a new "primitive" > method called #basicDo: and saying that is socially acceptable. But on the > other hand saying that the new (as you termed them) "primitive" methods for > #(basicNRCSlotAt: basicNRCSlotAt:put:) somehow make SmallScript a > significant departure from the Smalltalk model? 
> #basicDo: would be exactly the same kind of primitive method like #to:do:, and it would be safe in the sense that it could not be used inadvertently and lead to an undebuggable crash. #basicNRCSlotAt: on the other hand, could. > QKS Smalltalk has always supported read/write barriers and pre-bind > delegation. It has supported asychronous hardware interrupts and timers. It > has always been natively multi-threaded. It has always had namespaces. It > has always supported real-time package generation and loading with schema > migration support and refactoring facilities. So those features could be > characterized as having been Smalltalk features since ~1992; they just have > not been available in every dialect. > > Having those features in the virtual machine (implementation) X does not > make it "not-smalltalk". What is probably fair to say is that "not all > smalltalks" have XYZ feature. > I perfectly agree. Let's put more of these things in the "generic" Smalltalk virtual machine. > > I don't follow the rationale. You're suggesting a solution based on > "annotating" packages as being the way to go, but "annotating" > methods/classes/namespaces/projects as not being appropriate. > > If so, then we are arguing over the question of granularity of control. It is a rationale based on moment in the development cycle, not on granularity. When you package your application, you are done with the development (at least you think you are ;-)), when you write your classes you are still in flux. If you annotate them then, you basically make a premature optimization. If you only annotate them when you are done with the development (at packaging time), you have a lot more work to do, in many more places, and error-prone. A tool should be able to do it for you (which is what the packager/type inferencer would do). > in mind that in SmallScript and in QKS Smalltalk, packages can have their > own "agent" behavior (methods and obviously state). A package contains an > aribitrary collection of objects (including [recursively] other packages) > and data (or [recursively] entire databases) which also includes dependency > information and manifests of applicability (version info, signatures, etc). > > SmalltalkAgents (~1991 QKS Smalltalk system) was based on the idea of > creating packages (agents) that could migrate in real time across > environments. Having had experience working with the capabilities of such > packages for a long time, made clear to me that an additional (more > structured/formal design/deployment) unit was required. > > As a result, in SmallScript, I added the first class notion of <Modules> to > the language. First a module is a class, not an arbitrary set of objects and > data with some manifest and active code. A module can contain all the things > that packages contain (including packages), but its primary purpose is to > provide a structured unit for managing the sharing and packaging of design > and related declarative elements. Becaused it has more specific and focused > goals it provides a richer architecture for handling development and > deployment related activities. > This is very interesting but it is obviously a different meaning for the word package. > > > > > > I'm not quite sure what you mean by entity-less. > > > > Something that does not have an identity can be modified in-place for > > example, like you did. > > I think you're describing a value-type. 
I.e., something for which there is > no unique pointer/id which can be shared by multiple entities (and which is > [typically] not self describing). > > While not particularly germaine, it is worth noting that: pragmatically, as > virtual objects, this is how SmallIntegers, Characters, and WeakId objects > are handled in SmallScript/QKS-Smalltalk. > Not only, I was having in mind the distinction made by somebody in the smalltalk versus lisp thread, that in lisp you have less inteligent objects that do not host methods (and participate in multi-methods). I made the additional step that these objects are (should be) identity-less, in a sense like immediates are identity-less, true, but immediates are just a speed optimization. LargeIntegers are not less numbers than SmallIntegers and should also be identity-less > But rather than focus on that question, let me try to refocus this issue. > > a) Smalltalk assumes a numeric model of multi-precision arithmetic for the > classes it assigns to numeric literal declarations. > > b) As a result, and expression like (10 * a) requires multi-precision > arithmetic processing. > > c) If we want to change those rules to assume, say int64 processing, or > 64/80 bit float processing then we are looking at a very different numeric > model and set of rules. > > d) In languages with required typing of variables/expressions and numeric > operators, we don't have this problem. The 100% presence of static type > information enables full disambiguation of this situation. > > e) But in Smalltalk, we do not (100% of the time) have the information to > ascertain statically what is intended. So, either we add optional > typing/annotations to source or we externally control the decision by > enabling "modes" of compilation for source via compiler flags that > tools/humans can use. > > f) In designing my compilers I've provided for both models by building the > mechanics to support both on a common compiler architecture. But I have > actually implemented full static typing analysis and optimization features > for numerics. > > g) The v4 AOS Platform (which is SmallScripts reference platform) is > designed to support multiple languages on a common IL (opcode) system and > execution engine. The Microsoft.NET platform is designed to support multiple > languages on a common IL (opcode) system and execution engine. The Microsoft > platform has many different characteristics and market presence factors. > > Given the wide range of languages that are already well suited to highly > optimized static type processing and numerics facilities, it is not > unreasonable to consider using those languages to write numeric code. After > all, there is *no* complexity, performane penalty, or interop issues when > they are both generating common IL and running on a common > (runtime/object-model) execution engine. > These arguments may be suited for the AOS platform, obviously they are not for Smalltalk in general. Plus, I think there are serious advantages in using a single language for developing an application. For both human comprehesion and tools. There is never an external module that fits perfectly your needs. How do you tinker with it if it is written in another language ? Do you learn that language first ? How does global optimization work across languages (not on the AOS platform) ? > > Well that is a very different view than I take on Smalltalk or Java. I try > to be pragmatic about these things and just look at numbers and mindshare. 
I > think Java is "loosely put" irrelevant and not the real target or issue at > all. > I used Java as an example, I also (and I think the community should as well) feel challenged by C# and others. > The issue is what languages are in the mainstream with enough development > being done to provide healthy, sustained, growth. Of those, which are the > most vibrant or successful and why? > > I look at the many millions of developers using the Microsoft VisualStudio > family and their languages. I look at the many millions of users of > moderately new languages like Perl, PHP, Python, JavaScript, etc. I look at > the 10's or 100's of thousands of users of more seemingly obscure (but > relatively new) scripting languages. > > I look at the level of financial interest and technical interest in these > languages and try to identify the characteristics that are breathing life > and vitality into them. > > None of them are Java, and all of them are doing incredibly well and if > Smalltalk were in a similar position Java would not "loosely speaking" be > relevant. Furthermore, if Smalltalk had "whatever it takes" to play in > leagues, the other characteristics of Smalltalk would enable it to be a > genuine (business) competitor for Java's marketspace. > > That's the relevence of "SmallScript" and my efforts in developing it to > take Smalltalk to a new level. A good indication for a language viability is what kind of research still goes on on it. Obviously Java, C#, even C++ are still evolving and interesting things are happening. Generative programming in C++ is hot, Stroustrup is considering a new revision of the language. And these are major languages. Squeak, which is still Smalltalk and still in active research, is not about evolving the language. It is true that some have independently evolved their own language, SmallScript being the most advanced as far as I can tell, but this does not mean that Smalltalk has evolved there. Mark van Gulik regularly tells us about his Avail and it definitely looks interesting. But how many users (or developers) do these languages have ? Smalltalk itself needs to evolve or it will die. This would be true even if it were a huge commercial success. I think it is even more true as it is. > > -- Dave S. [www.smallscript.net] > Florin |
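On the identity-less point above, the asymmetry is easy to demonstrate in any current dialect (the result of the last expression may vary by implementation):

    | a b |
    3 + 4 == 7.          "true -- SmallIntegers are immediate/virtual objects,
                          so equal values share one identity"
    a := 100 factorial.
    b := 100 factorial.
    a = b.               "true -- they are the same number"
    a == b               "usually false -- two equal LargeIntegers are distinct
                          heap objects, exposing an implementation detail that a
                          purely value-like number would not"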
In reply to this post by Dave Harris-3
"Dave Harris" <[hidden email]> wrote in message
news:[hidden email]... > [hidden email] (David Simmons) wrote (abridged): > > I.e., real speedup will come from providing ![NCB] declarative form. > > > > aFoo do: ![:v| ...]. > > > > I just need a syntax that feels good for widespread usage and is > > relatively self evident. > > It sounds like there are two sides to this: non-capturable blocks, and > methods which do not capture their block arguments. We can annotate the > blocks or we can annotate the methods. > > For some methods, we can tell by inspecting the implementation that they > don't capture their arguments. We can use type-inference to propagate > that information around. So if we go the method route, we can potentially > optimise without needing (user) annotations at all. > > Presumably if we go the annotated-block route, there is a risk someone > will provide an NCB block to a routine which does, in fact, attempt to > capture it. That would be a runtime error. > > It sounds like the method route is better on both counts, and also > compatible with standard Smalltalk. > Is it not feasible in practice? No, not (fully) as far as I've been able to figure out. But implicit/explicit notation of the same constraints as NCBs require is, already within Smalltalk, in the usage of (^) caret non-local return blocks. Based on current Smalltalk semantics, one can implicitly treat any block with a non-local return (^) in it as being an NCB. However, in general, all such blocks require additional treatment to enable the non-local context unwinding. That treatment is still significantly less expensive than the 60-120 cycles required for a copying or dirty block. -- Why can (^) be an NCB? -- Background: An NCB can be allocated directly on the stack-frame using only one slot (via coloring -- only one such slot is needed per local-ncb-context nesting level). This constrains NCBs to the lifetime of their instantiation context. Under current Smalltalk semantics, a caret (^) return within a block requires a return "from" the home/method context. The execution of such a non-local return is "invalid" if the home context no longer exists. Therefore, the caret (^) non-local return implies the same requirements as an NCB. -- END SIDEBAR -- NCBs are very fast because they only require a stack-frame slot write to allocate a virtual-object pointer to them. Whereas the current non-NCB (non-static) blocks require at least the instantiation of a real block object, and may often require instantiation of a real shared-context object as well. The result is that an NCB solution would save 120-240 cycles for a given block expression. You ask, is this relevent or worthwhile? I would suggest that it is highly worthwhile on a number of levels. First, there is the general non-deterministic benefit of just creating fewer objects during code execution. Second, and more specifically, we can recognize that any message which requires a non-static block has a minimum overhead cost of (loosely speaking) 60-240 cycles per block argument as compared to perhaps 5 or 6 cycles. So given any moderately high level of invocation of such a message, we are gaining significant performance. I attempted to illustrated this in my example of the exception guards that used NCB style block mechanics. Where the loop performance difference was ~18ms vs ~120/250ms. Given the hi-frequency of, and desirability of, using blocks for collection behavior et cetera, there is significant value in providing NCB support either by inference or by explicit syntax/annotation. 
However the most likely form of block usage, where NCB usage would make a difference, is probably in blocks that do not contain a non-local return (^) and therefore one cannot implicitly ascertain (from the callsite) the capturability constraint. Due to Smalltalk's very nature, there is no way to guarantee that an NCB (or ^ non-local-return based block) is not captured; but there are efficient means to guarantee that attempts to use an NCB past its home context's lifetime generate a controlled exception. Chris Uppal's suggestion of _[..._] is the most attractive explicit notational form I've seen yet. It has the virtue that it is consistent with the ANSI suggestion of naming vendor/implementation specific methods/entities with a leading underscore. -- Dave S. [http://www.smallscript.net] > > Dave Harris, Nottingham, UK | "Weave a circle round him thrice, > [hidden email] | And close your eyes with holy dread, > | For he on honey dew hath fed > http://www.bhresearch.co.uk/ | And drunk the milk of Paradise." |
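The (^) analogy can be seen directly in any dialect today. A minimal sketch (DemoHome is a made-up class; the exact error class varies by implementation):

    "A method on DemoHome:"
    escaper
        "Answer a block whose home context (this activation) will have
         returned before anyone evaluates it."
        ^[:x | ^x]

    "Elsewhere, later:"
    | b |
    b := DemoHome new escaper.
    b value: 42
        "Most dialects signal something like 'block cannot return: the home
         context has already returned' here.  A local/NCB block would carry
         exactly this lifetime constraint, but checked on every use of the
         block rather than only when the non-local return fires."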
In reply to this post by Florin Mateoc-3
"Florin Mateoc" <[hidden email]> wrote in message
news:9e925r$gjn$[hidden email]... ...snip... > > More and more I find that we are thinking about exactly the same kind of > capabilities and optimizations offered by the ensemble > virtualMachine+JIT+packager+smalltalkCompiler (I was going to mention object > resizing and restructuring, but you did it before I had the chance ;-) > > Significant differences remain in where to assign the responsibilities: > - on one hand, I would like to modify (or enhance) Smalltalk (as visible > to the developer) as little as possible, to keep Smalltalk development > simple, and keep the optimizations under the hood. I believe some visible > enhancements are needed, but you have promoted a lot of the optimization > machinery at source code level, basically forcing the developer to make a > lot of early optimization decisions and write non-generic code, which is bad > in general. Sure. I can understand that point of view. My goals are to extend the language to provide new features. Therefore, to the extent to which those new features end up (for SmallScript users) superceding existing functionality for one reason or another your comments regarding cross-dialect issues are right on. > - on the other hand, we have switched positions when it comes to > packaging versus JIT time optimizations. You would like to defer them all to > the JIT, which has advantages - more adaptive and potentially more robust, > as you have described. Well sort-of. I'm a big advocate of capturing design and intent and of being able to use that information to generate up-to-date documentation that is close to if not wholly consistent with the implementation (code) itself. In our Smalltalk and related tool/repository facilities I've always tried to provide ways for expressing and tracking such information. I believe in providing the means to declare and access such information and having tools that validate the human assertions of intent. I've just learned that such information should be treated by execution engine mechanisms as nothing more than a hint/suggestion/request. Such suggestions may, based on policy settings, result in the generation of exceptions or warnings when packages/modules are verified/loaded. This is certainly the approach our QKS Smalltalk package system took on a number for a number of attributes/properties. > ROBUSTNESS SIDEBAR: I think accidental dynamic behavior is not desirable > and from my point of view, as a developer, I would like to be notified (even > with a walkback) that my assumptions have been invalidated. This would be a > bug in the sense that I did not understand the system's behavior and I > better fix it. -- END SIDEBAR -- I can only agree with you here. But, the devil is in the details on this one. In other words, I'm not entirely clear how one can accidently cause dynamic behavior to happen. It does seem to me that it is relatively easy to abuse/misuse MOP services and therefore wreak all manner of havoc or poor code. > While it is true that ideally optimizations are best deferred as much as > possible, there are some real resources and time constraints. A > packager/type inferencer/optimizer has relatively infinite time and > resources to do whole program optimizations. I disagree here to one extent or another. One of the large application focus areas for SmallScript is for scripting. This area includes embedding scripts within web-pages, etc. In that arena the compilation time and packaging for the cache system are very time sensitive. > A JIT has to do it on the fly. 
> The whole hot-spot idea relies exactly on the JIT's superficiality, doing > some of the more sophisticated optimizations only where it knows it counts > most, as a complement (because even the hot-spot has the same kind of run > time constraints like the JIT). That is true. But my philosphy is a bit different from that exhibited in Self. I am perfectly comfortable with the realities of providing a certain amount explicitly declared (intent) information for use in optimization. That allows fast and efficient code generation without having to rely 100% on adaptive/heuristic techniques. > Well, the packager does not have any of > these hard constraints, I think this is a very strong argument for > pre-computing the optimizations. Some of them would be simply impossible at > run time. I believe we actually share a basically similar philosophy here. But we are expressing it a little differently. I suspect that this is mostly the fact that SmallScript and the AOS are abstract ideas for you (since you have no way to have worked with it), and concrete working systems for me (because I am working with it). ...snip... > > I don't follow your argument here. You're advocating a new "primitive" > > method called #basicDo: and saying that is socially acceptable. But on the > > other hand saying that the new (as you termed them) "primitive" methods > for > > #(basicNRCSlotAt: basicNRCSlotAt:put:) somehow make SmallScript a > > significant departure from the Smalltalk model? > > > > #basicDo: would be exactly the same kind of primitive method like #to:do:, > and it would be safe in the sense that it could not be used inadvertently > and lead to an undebuggable crash. #basicNRCSlotAt: on the other hand, > could. I see your point. I would just remind you that basic methods are generally low level or MOP related. There are many existing operations one can perform within "classic" Smalltalk that can completely corrupt the system and lead to equivalently un-debuggable crashing. As a general rule, we should not shy away from providing unifying or fundamental capabilities just because they can be misused. > > > QKS Smalltalk has always supported read/write barriers and pre-bind > > delegation. It has supported asychronous hardware interrupts and timers. > It > > has always been natively multi-threaded. It has always had namespaces. It > > has always supported real-time package generation and loading with schema > > migration support and refactoring facilities. So those features could be > > characterized as having been Smalltalk features since ~1992; they just > have > > not been available in every dialect. > > > > Having those features in the virtual machine (implementation) X does not > > make it "not-smalltalk". What is probably fair to say is that "not all > > smalltalks" have XYZ feature. > > > > I perfectly agree. Let's put more of these things in the "generic" > virtual machine. <g> And just which of the many "generic" Smalltalk virtual machines would that be? > > I don't follow the rationale. You're suggesting a solution based on > > "annotating" packages as being the way to go, but "annotating" > > methods/classes/namespaces/projects as not being appropriate. > > > > If so, then we are arguing over the question of granularity of control. > > It is a rationale based on moment in the development cycle, not on > granularity. When you package your application, you are done with the > development (at least you think you are ;-)), when you write your classes > you are still in flux. 
If you annotate them then, you basically make a > premature optimization. If you only annotate them when you are done with > development (at packaging time), you have a lot more work to do, in many > more places, and error-prone. A tool should be able to do it for you (which > is what the packager/type inferencer would do). I don't agree here. The whole development process is fluid and iterative. So one should be able to express/alter design intent at any time for any subset of the body of work (otherwise you're impeding the refactoring process). Throughout the development process the tools and frameworks should utilize such information dynamically to verify and track constraints and declarations for consistency. Packaging time is the point at which everything you've declared (within your solution-space/project) can be finally/fully verified to enable generation of a given version of a distributable element and its manifest. ...snip... > > These arguments may be suited for the AOS platform, obviously they are not > for Smalltalk in general. > Plus, I think there are serious advantages in using a single language for > developing an application. For both human comprehesion and tools. There is > never an external module that fits perfectly your needs. How do you tinker > with it if it is written in another language? You use that other language. Or, you refine classes and methods of that module with the language of your choice. > Do you learn that language first? It is unlikely that you would want to, let alone need to. If it is all built on the same object model then you don't care what language the external code was written in. Furthermore, you may not have source to the module itself. But you will know its published classes, methods, and contracts. And presumably, if it is a framework that is intended for re-use there will be design level documentation -- for which code can never be a proper substitute. > How does global optimization work across languages > (not on the AOS platform)? I don't know how to answer this question since my multi-language (UVM) comments were premised on the concept of a common runtime/execution architecture. Optimization in such an architecture is performed both in the source generation of the IL, and in the compilation of the IL into a given machine/hardware configuration specific body of executable code. I probably should comment that IL is a higher and distinctly more expressive instruction form than what might be implied by typical "bytecode" systems designed for interpreters. That is why, historically, I've tended to refer to our instruction sets as opcodes, and with the greatly enhanced instruction set and related metadata of the v4 AOS Platform and Microsoft.NET they are more appropriately refered to as IL. ...snip... -- Dave S. > > > > > -- Dave S. [www.smallscript.net] > > > > Florin > > |
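As a concrete illustration of refining a module written in another language: using the method-declaration form shown earlier in the thread, a class shipped by such a module can be extended from Smalltalk without seeing its source (VendorMatrix, #rowCount and #row: are invented names):

    method class: VendorMatrix [
        rowSums
            "Added in Smalltalk; the original class may have been written in
             another language targeting the same IL and object model."
            ^(1 to: self rowCount) collect:
                [:r | (self row: r) inject: 0 into: [:sum :each | sum + each]].
    ].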
In reply to this post by David Simmons
[hidden email] (David Simmons) wrote (abridged):
> Given the hi-frequency of, and desirability of, using blocks for > collection behavior et cetera, there is significant value in > providing NCB support either by inference or by explicit > syntax/annotation. Yep. > However the most likely form of block usage, where NCB usage would > make a difference, is probably in blocks that do not contain a > non-local return (^) and therefore one cannot implicitly ascertain > (from the callsite) the capturability constraint. Agreed. I was thinking of inference in the other direction, marking non-capturing selectors. No doubt what follows is redundant and "teaching grandma to suck eggs", but I'll say it anyway just to be clear. No offense intended. It would be a two-phase approach. The first phase is for each method to track the arguments it stores into instance variables itself, directly. This information changes only when the method is recompiled; it's very local. It also tracks which other selectors it passes the arguments with. In the second phase, we mark a selector as not capturing its argument if none of the methods implementing it capture its argument. A method doesn't capture its argument if it neither captures it directly, nor passes it with a selector which captures it. This is a non-local property; recompiling a method can affect how other methods and selectors are marked. However, I should think updating the "capture" marks could be done incrementally and reasonably quickly in a JIT. As a result of this we might find that even though a selector has too many implementations to inline, none of those implementations capture arguments so all can use the NCB optimisation. Thus this is less pessimistic than looking for monomorphic selectors. Also the analysis would not apply only to blocks. It might be useful to help optimise other object creation, too. So this is what we're saying is not feasible. Is it still too pessimistic? Dave Harris, Nottingham, UK | "Weave a circle round him thrice, [hidden email] | And close your eyes with holy dread, | For he on honey dew hath fed http://www.bhresearch.co.uk/ | And drunk the milk of Paradise." |
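The two phases are small enough to sketch in a workspace. Everything below is hand-supplied and purely illustrative -- in a real system the phase-1 facts would be produced by the compiler per method, and the selectors here are invented:

    | facts capturing changed |
    "Phase 1 facts, one entry per selector:
     capturesDirectly -> selectors the block argument is passed on with."
    facts := Dictionary new.
    facts at: #fastDo:        put: (false -> Set new).
    facts at: #logEachWith:   put: (false -> (Set with: #fastDo:)).
    facts at: #whenChangedDo: put: (true  -> Set new).
    facts at: #onEventDo:     put: (false -> (Set with: #whenChangedDo:)).

    "Phase 2: seed with the direct captures, then propagate to a fixed point."
    capturing := Set new.
    facts keysAndValuesDo: [:sel :fact | fact key ifTrue: [capturing add: sel]].
    changed := true.
    [changed] whileTrue:
        [changed := false.
         facts keysAndValuesDo: [:sel :fact |
             ((capturing includes: sel) not and:
                 [(fact value detect: [:s | capturing includes: s] ifNone: [nil]) notNil])
                     ifTrue: [capturing add: sel. changed := true]]].

    capturing
        "--> #whenChangedDo: and #onEventDo:.  #fastDo: and #logEachWith: never
         capture, so their block arguments could be treated as non-capturable
         even though none of the sends were inlined."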
In reply to this post by David Simmons
In article <Px4O6.23006$%[hidden email]>, David Simmons
says... > >"Florin Mateoc" ?[hidden email]> wrote in message >news:9e925r$gjn$[hidden email]... > >> - on the other hand, we have switched positions when it comes to >> packaging versus JIT time optimizations. You would like to defer them all >to >> the JIT, which has advantages - more adaptive and potentially more robust, >> as you have described. > >Well sort-of. I'm a big advocate of capturing design and intent and of being >able to use that information to generate up-to-date documentation that is >close to if not wholly consistent with the implementation (code) itself. > >In our Smalltalk and related tool/repository facilities I've always tried to >provide ways for expressing and tracking such information. I believe in >providing the means to declare and access such information and having tools >that validate the human assertions of intent. I've just learned that such >information should be treated by execution engine mechanisms as nothing more >than a hint/suggestion/request. > >Such suggestions may, based on policy settings, result in the generation of >exceptions or warnings when packages/modules are verified/loaded. This is >certainly the approach our QKS Smalltalk package system took on a number for >a number of attributes/properties. > What can I say ? Great minds think alike (THIS WAS A JOKE (sort of.. (THIS WAS A META-JOKE (with a grain of truth (THIS WAS A META_META JOKE (.... Jokes aside, having worked on some large applications in great need of consistency and documentation (or at least comments), I couldn't agree more. But, as you say (elsewhere), the devil is in the details. There are many subtle ways in which you can invalidate the original design or implementation (not all the constraints are about types), that would be very hard to express or check. And if you don't know that you are invalidating something (or that you are relying on an implementation detail), you don't document it either. Worse, these originally subtle deviations get used by other people. <anecdote> Somebody implemented (before I got there) #= in Set. It so happens that Envy was doing a comparison of the class pool (which was a dictionary, inheriting from Set) at load time between the pool of the class in the image and the one in the newly built class. Since in the base system #= was inheriting from Object, this was originally an identity check. Now, suddenly the comparison returned true where before it was returning false, and the classPool would sometimes (when the sizes were the same) not be refreshed by a reload. Funny thing is, this was unfrequent enough that it went on for years before I found it. Nobody reported it, it had to happen to me for somebody to investigate it. In the meantime, a lot of code had been written that relied on an equality check in sets and I was not allowed to simply remove the offending method (and with a selector like #=, it is not easy to find the call sites). I had to build a small infrastructure to instrument it and slowly collect the usage information over several weeks. </anecdote> >> ROBUSTNESS SIDEBAR: I think accidental dynamic behavior is not desirable >> and from my point of view, as a developer, I would like to be notified >(even >> with a walkback) that my assumptions have been invalidated. This would be >a >> bug in the sense that I did not understand the system's behavior and I >> better fix it. -- END SIDEBAR -- > >I can only agree with you here. But, the devil is in the details on this >one. 
In other words, I'm not entirely clear how one can accidently cause >dynamic behavior to happen. It does seem to me that it is relatively easy to >abuse/misuse MOP services and therefore wreak all manner of havoc or poor >code. > But even MOP services should be used in a predictable manner. Somebody should be in control and be able to tell the limits of dynamic behavior. Those limits always exist, this is (valuable) information and I think we agree that this information should be accessible to someone working on the application (at least through instrumenting and such). This person could and should use it for improving the systrem. Where can such a person feed this information back to the system ? I think (re-)packaging is a very useful point for this purpose. >> While it is true that ideally optimizations are best deferred as much as >> possible, there are some real resources and time constraints. A >> packager/type inferencer/optimizer has relatively infinite time and >> resources to do whole program optimizations. > >I disagree here to one extent or another. One of the large application focus >areas for SmallScript is for scripting. This area includes embedding scripts >within web-pages, etc. In that arena the compilation time and packaging for >the cache system are very time sensitive. > So we have found the basic source for the slight disagreement. It is merely a different focus, with different constraints: mine is large applications (I have been working on some for the last 6 years) >> A JIT has to do it on the fly. >> The whole hot-spot idea relies exactly on the JIT's superficiality, doing >> some of the more sophisticated optimizations only where it knows it counts >> most, as a complement (because even the hot-spot has the same kind of run >> time constraints like the JIT). > >That is true. But my philosphy is a bit different from that exhibited in >Self. I am perfectly comfortable with the realities of providing a certain >amount explicitly declared (intent) information for use in optimization. >That allows fast and efficient code generation without having to rely 100% >on adaptive/heuristic techniques. > >> Well, the packager does not have any of >> these hard constraints, I think this is a very strong argument for >> pre-computing the optimizations. Some of them would be simply impossible >at >> run time. > >I believe we actually share a basically similar philosophy here. But we are >expressing it a little differently. I suspect that this is mostly the fact >that SmallScript and the AOS are abstract ideas for you (since you have no >way to have worked with it), and concrete working systems for me (because I >am working with it). Very true, I am looking forward to seeing it (although, as you say, you have to work with it to really get it) > >...snip... > >> > I don't follow your argument here. You're advocating a new "primitive" >> > method called #basicDo: and saying that is socially acceptable. But on >the >> > other hand saying that the new (as you termed them) "primitive" methods >> for >> > #(basicNRCSlotAt: basicNRCSlotAt:put:) somehow make SmallScript a >> > significant departure from the Smalltalk model? >> > >> >> #basicDo: would be exactly the same kind of primitive method like #to:do:, >> and it would be safe in the sense that it could not be used inadvertently >> and lead to an undebuggable crash. #basicNRCSlotAt: on the other hand, >> could. > >I see your point. I would just remind you that basic methods are generally >low level or MOP related. 
There are many existing operations one can perform >within "classic" Smalltalk that can completely corrupt the system and lead >to equivalently un-debuggable crashing. As a general rule, we should not shy >away from providing unifying or fundamental capabilities just because they >can be misused. > I am all for "Power to the developers". And I know that this may be hard, but don't you think that trying to minimize the number of system-corrupting operations without handicapping the system is a worthy goal ? Because it does not seem to be a priority for you. <snip> >I don't agree here. The whole development process is fluid and iterative. Almost. The fluidity changes during the process; it is more fluid at the beginning and it becomes less and less as you crystallize your ideas. It is true that this is a continuous process and this continuity is not captured by the separation development time versus packaging time. But just like objects have continuously variable ages, discretely separating them into new and old is still a useful paradigm. <snip> >> How does global optimization work across languages >> (not on the AOS platform)? > >I don't know how to answer this question since my multi-language (UVM) >comments were premised on the concept of a common runtime/execution >architecture. Optimization in such an architecture is performed both in the >source generation of the IL, and in the compilation of the IL into a given >machine/hardware configuration specific body of executable code. > >I probably should comment that IL is a higher and distinctly more expressive >instruction form than what might be implied by typical "bytecode" systems >designed for interpreters. That is why, historically, I've tended to refer >to our instruction sets as opcodes, and with the greatly enhanced >instruction set and related metadata of the v4 AOS Platform and >Microsoft.NET they are more appropriately referred to as IL. > I see. And I understand that you work on a concrete system that has its own assumptions. But there will be life after .NET. Although probably successful, it will not be the last word in language development. Don't you want to keep a little distance from it ? > >-- Dave S. > Florin |
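The Set anecdote in miniature (the behaviour shown assumes Set inherits #= from Object, as in the base system described above):

    | old new |
    old := Set withAll: #(a b c).
    new := Set withAll: #(a b c).
    old = new
        "false with the inherited identity-#=, so a load-time comparison of
         class pools sees them as different and refreshes; once someone adds a
         contents-based Set>>#= the answer flips to true and the refresh
         silently stops happening whenever the sizes match."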
In reply to this post by Dave Harris-3
"Dave Harris" <[hidden email]> wrote in message
news:[hidden email]... > [hidden email] (David Simmons) wrote (abridged): ...snip... > I was thinking of inference in the other direction, marking non-capturing > selectors. No doubt what follows is redundant and "teaching grandma to > suck eggs", but I'll say it anyway just to be clear. No offense intended. Don't worry about offense taken for explaining a process. There are many people reading this posts and a clear explanation is helpful no matter who is reading it. > > It would be a two-phase approach. The first phase is for each method to > track the arguments it stores into instance variables itself, directly. > This information changes only when the method is recompiled; it's very > local. It also tracks which other selectors it passes the arguments with. > > In the second phase, we mark a selector as not capturing its argument if > none of the methods implementing it capture its argument. A method > doesn't capture its argument if it neither captures it directly, nor > passes it with a selector which captures it. This is a non-local > property; recompiling a method can affect how other methods and selectors > are marked. However, I should think updating the "capture" marks could be > done incrementally and reasonably quickly in a JIT. > > As a result of this we might find that even though a selector has too > many implementations to inline, none of those implementations capture > arguments so all can use the NCB optimisation. Thus this is less > pessimistic than looking for monomorphic selectors. Also the analysis > would not apply only to blocks. It might be useful to help optimise other > object creation, too. > > So this is what we're saying is not feasible. Is it still too > pessimistic? It is is certainly feasible. It is just a variant of one of the classic mechanisms for tree-shaking to extrude a minimal image or package. AOS Platform VMs have typically tracked this kind of thing; including the use of shared-variables (namespace) to enable real-time refactoring. The JIT facilities track similar information for their optimization and maintenance tasks. But I don't believe it would be particularly fruitful. I'd considered this approach early on when I recognized the optimization opportunity but rejected it for the following reasons. The problem is that if there is just "one" method that captures its arguments (even if that capture is only for processing life of the call and thus doesn't violate the NCB contract) the optimization is blocked. There is a reasably high chance that such a method might place an NCB into a inst/shared (non-local/stack) var temporarily during its execution and remove it at the end of the process (or never read/access it again). There is a reasonable chance that the selector may be used in more than one semantically unrelated contract; where one of the contracts does utilize capturing. The optimizaton is prevented for the entire selector even though non NCB compliant methods may not, by contract/intent, ever be invoked from a location where the developer knew/intended to provide a block with NCB semantics. -- Utilizing smallscript's multi-method binding facilities we can help with the declaration and unit-testing validation of such contractual behavior. I.e., We could choose to define an interface named <ICapturable>. Then any method that captures its valuable would declare the affected arguments as needing to support that interface. 
That would satisfy the expression of intent and ensure that (for such properly annotated methods) there would be forward-invocation "delegation" leading to a does-not-respond-to (DNR) exception. We would only need to ensure that classes such as <Block> implemented the <ICapturable> interface. Or take the negation approach and ensure that classes such as a <NCB/LocalBlock> did not implement (blocked inheritance of) that interface. method behavior=AClass [ someSelector: <ICapturable> valuable ... ]. [ aClass someSelector: _[...]. "" Would fail to bind resulting in a DNR which "" would be readily caught in a unit testing "" scenario. ]. NOTE: If a declarative form is the way to go (which I suspect it is), then I think I also prefer the asymetric declaration of _[...]. --- But, in general, this is the exact same issue one sees with blocks containing a (^) non-local-return and therefore we are not introducing a new "issue" into Smalltalk per-se. Since few developers have noted or observed real problems relating to the (^) contract issues, I see little reason (pragmatically) to expect that a "local" block declaration would be a problem. Note, I think I prefer the term local-block (or slightly less desireable -- inline-block) to the term NCB. -- Dave S. > > Dave Harris, Nottingham, UK | "Weave a circle round him thrice, > [hidden email] | And close your eyes with holy dread, > | For he on honey dew hath fed > http://www.bhresearch.co.uk/ | And drunk the milk of Paradise." |
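For readers without multi-method dispatch, a plain-Smalltalk approximation of the same contract (every name below is invented; the point is only that the failure is immediate and unit-test friendly):

    whenChangedDo: aValuable
        "A capturing method asserts the constraint before storing its argument.
         Ordinary blocks would answer true to #isCapturable; local/NCB blocks
         would answer false, so an escaping local block becomes an immediate,
         debuggable error rather than a dangling context later on."
        aValuable isCapturable
            ifFalse: [^self error: 'non-capturable block passed to a capturing method'].
        changeActions add: aValuable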
"Florin Mateoc" <[hidden email]> wrote in message
news:i4bO6.1935$[hidden email]...
> In article ?Px4O6.23006$%[hidden email]>, David Simmons says...
...snip...
> > > > I don't follow your argument here. You're advocating a new "primitive"
> > > > method called #basicDo: and saying that is socially acceptable. But on
> > > > the other hand saying that the new (as you termed them) "primitive"
> > > > methods for #(basicNRCSlotAt: basicNRCSlotAt:put:) somehow make
> > > > SmallScript a significant departure from the Smalltalk model?
> > >
> > > #basicDo: would be exactly the same kind of primitive method as #to:do:,
> > > and it would be safe in the sense that it could not be used inadvertently
> > > and lead to an undebuggable crash. #basicNRCSlotAt:, on the other hand,
> > > could.
> >
> > I see your point. I would just remind you that basic methods are generally
> > low level or MOP related. There are many existing operations one can
> > perform within "classic" Smalltalk that can completely corrupt the system
> > and lead to equivalently un-debuggable crashing. As a general rule, we
> > should not shy away from providing unifying or fundamental capabilities
> > just because they can be misused.
>
> I am all for "Power to the developers". And I know that this may be hard, but
> don't you think that trying to minimize the number of system-corrupting
> operations without handicapping the system is a worthy goal? Because it does
> not seem to be a priority for you.

Well, hmm. What you suggest regarding my priorities is both true and untrue.

It is true that I have no compunction about providing something I believe to be a key feature/functionality even if it has inherent possibilities for "dangerous" misuse (often that "danger" is tied directly to its "benefit"). But when I do so, I have a number of criteria for how it should be provided/manifested. One of those is to make every effort to remain true to Smalltalk philosophy, including never breaking backwards compatibility. Another is to pursue the goal that simplicity is elegance, and therefore to look for unifying elements that can provide developer building blocks for higher-level functionality.

I firmly believe that MOP facilities come with explicit/implicit warnings that you must understand their contracts or you should not be using them. There are different levels/modes of development and design at play when building classes, components, frameworks, etc. Working with "potentially dangerous" materials should not be prevented, as long as it is clear to the user that they are doing so. For me, by definition, any #basic method fits that model, as do any other operations involving MOP-related interfaces.

This whole area is very closely related to security and sandboxing issues, which, in turn, are closely related to the dynamically scoped binding facilities present in SmallScript (i.e., method access controlled by the calling context in relation to the bound/implementation context).

> <snip>
>
> > I don't agree here. The whole development process is fluid and iterative.
>
> Almost. The fluidity changes during the process; it is more fluid at the
> beginning and it becomes less and less as you crystallize your ideas. It is
> true that this is a continuous process and this continuity is not captured by
> the separation of development time versus packaging time. But just like
> objects have continuously variable ages, discretely separating them into new
> and old is still a useful paradigm.
> <snip>
>
> > > How does global optimization work across languages
> > > (not on the AOS platform)?
> >
> > I don't know how to answer this question, since my multi-language (UVM)
> > comments were premised on the concept of a common runtime/execution
> > architecture. Optimization in such an architecture is performed both in
> > source generation of the IL and in the compilation of the IL into a body of
> > executable code specific to a given machine/hardware configuration.
> >
> > I probably should comment that IL is a higher and distinctly more expressive
> > instruction form than what might be implied by typical "bytecode" systems
> > designed for interpreters. That is why, historically, I've tended to refer
> > to our instruction sets as opcodes; with the greatly enhanced instruction
> > set and related metadata of the v4 AOS Platform and Microsoft.NET, they are
> > more appropriately referred to as IL.
>
> I see. And I understand that you work on a concrete system that has its own
> assumptions. But there will be life after .NET. Although probably successful,
> it will not be the last word in language development.

In its implementation, certainly true. In its general goals and principles there are fundamental ideas which (have so far stood or) will stand the test of time.

> Don't you want to keep a little distance from it?

Hmmm. So, my AOS Platform work and its UVM aspirations were there from day one (~1986), originally with my development of message-C and in the (QKS) AOS Platform's 1991 white paper and architecture notes. Over the years, its languages have included Smalltalk, Prolog, a tiny Basic, and a number of explorations into Apple languages. The idea of a UVM has been around for a long time, and for me at least it was one of the prime areas of research that ultimately led to my involvement in the Smalltalk space. We (QKS) spent years working with Apple on this because they wanted to license QKS products and our AOS VM for hosting a variety of languages.

The AOS Platform architecture is the foundation for SmallScript, and the AOS Platform is unrelated to Microsoft.NET. So the AOS Platform provides a complete layer of "abstraction" for making SmallScript available on other execution platforms. My work with .NET has concentrated on translating AOS Platform IL to the Microsoft.NET platform, which, in turn, is what enables AOS Platform based code (independent of source language) to be retargeted to Microsoft.NET's platform.

Microsoft.NET is not offering some radical/wholly new approach or ideas in this regard. It is an excellent effort at creating a viable multi-paradigm-language UVM, and it is backed and supported by a company that has solid market presence and resources. If anything, my involvement over the last couple of years with Microsoft's .NET has had a positive and catalyzing influence on some aspects of my work.

In other words, I feel very comfortable that there are no relevant Microsoft.NET dependency issues here. I learned many hard business lessons in this regard when working with Apple. But there is a great deal of opportunity and synergy in supporting the Microsoft.NET platform (unlike efforts to support the Java platform, where there are basic philosophical differences in addition to technical ones).

-- Dave S.

> > -- Dave S.
>
> Florin
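The earlier point in this post, that "classic" Smalltalk already contains operations which can completely corrupt the system, can be illustrated with a hedged sketch using ordinary, long-standing reflective messages rather than anything SmallScript-specific. Which instance variable sits in slot 1 of OrderedCollection is dialect-dependent, so the exact failure below is an assumption; the shape of the failure is the point.

    | oc |
    oc := OrderedCollection withAll: #(1 2 3).

    "Clobber a named slot directly, bypassing every invariant the class maintains.
     #instVarAt:put: (like #become: and friends) has always been available, so
     #basic.../MOP methods add no new category of risk to the system."
    oc instVarAt: 1 put: nil.

    "In most dialects the slot just destroyed is the backing store, so this fails
     somewhere inside the collection code, far from the real mistake -- the
     'un-debuggable crash' failure mode under discussion."
    oc add: 4.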
[hidden email] (David Simmons) wrote (abridged):
> The problem is that if there is just "one" method that captures its
> arguments (even if that capture is only for the processing life of the
> call and thus doesn't violate the NCB contract), the optimization
> is blocked.

OK; that's what I meant by "pessimistic". I wonder how common that is in practice.

Dave Harris, Nottingham, UK  | "Weave a circle round him thrice,
[hidden email]               |  And close your eyes with holy dread,
                             |  For he on honey dew hath fed
http://www.bhresearch.co.uk/ |  And drunk the milk of Paradise."