"Chris Fedde" <[hidden email]> wrote in message
news:nqb%5.295$[hidden email]...
> In article <[hidden email]>,
> Just Me <[hidden email]> wrote:
> >Try writing this in your favourite language:
> >
> >     1000 factorial
> >     -> a very large number which causes overflow if
> >        not evaluated in Smalltalk

I don't know what machine or Smalltalk dialect that was run on.

But when I run it on a 650MHz Athlon in QKS Smalltalk on the v4 AOS Platform, "1000 factorial" takes 24ms (.024s).

To try something that more fully tests multi-precision and GC performance, try "10000 factorial", which on the same system takes 1257ms (1.25s).

It actually takes a little less time than reported, because the reporting is based on measuring the actual cpu cycles consumed from start to finish (which are shared among the Win2000 kernel, processes, and their threads). Dividing the cycle count by 650*1000 (the number of cycles per millisecond at 650MHz) gives the time in milliseconds.

    [
        stdout cr << 'TIME: ' << ([ 10000 factorial. ] cyclesToRun // (650*1000)).
    ]
    TIME: 1257 "I.e., 1.257s"

    [
        stdout cr << 'TIME: ' << ([ 100000 fibonacci. ] cyclesToRun // (650*1000)).
    ]
    TIME: 5339 "I.e., 5.339s"

=========================
There are lots of other examples we could try:
=========================

The Smalltalk I work in (SmallScript/QKS Smalltalk) has multi-methods (multi-arg dispatching), concrete mixin types (interfaces), optional typing with type-cases and parametric polymorphism, etc.

Someone in one of the earlier posts said that Smalltalk doesn't have types?

The statement that Smalltalk is not typed is of course absurd. Smalltalk is and always has been (fundamentally) strongly typed. Its variables and function/method-calls are not statically typed; they are 100% dynamically typed, and implementations typically support some level of implicit/inferred (or annotated) static type capabilities/performance.

In simple terms, the objects are 100% typed; variables are by default (implicitly) containers for type <any>.
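To make the "objects are typed, variables are not" point concrete, here is a minimal sketch in Ruby (the language of the quoted fact.rb, chosen only because it is easy to run; it is not SmallScript or QKS Smalltalk). The variable names are made up for illustration.

```ruby
# A variable is just a container: it can hold an object of any class.
slot = 42
kinds = [slot.class]                 # the object knows it is an Integer

slot = "forty-two"
kinds << slot.class                  # same variable, now holding a String

# 1000 factorial: the result is an arbitrary-precision Integer, no overflow.
slot = 1000.downto(1).reduce(:*)
kinds << slot.class
digits = slot.to_s.length

# The typing is dynamic but strong: an operation the object's type does not
# support is rejected at run time instead of being silently coerced.
begin
  "1000" + 1
  mixed = :allowed
rescue TypeError
  mixed = :rejected
end
```

The variable `slot` never needed a declaration or a cast, yet every object it held reported its own class, and the ill-typed addition was refused.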
The dynamic language platforms (VMs) that Smalltalks typically execute on, unlike the Java platform and the .NET architecture, have a more unified object model with a single type, <Object>, at the root of the hierarchy. This allows one to declare a variable that can hold ANY type (i.e., any object, because everything is an object).

The only things that are not (required to be) typed in Smalltalk are variables. Unless otherwise typed, they are generic containers that by default hold type <Object> or, equivalently, <any>.

---
Note: Many of the Smalltalk implementations/dialects currently do not have facilities for supporting <type> annotations, and therefore those versions currently cannot support multimethods. (For those who may be unclear, in a pure dynamic messaging OO language there is NO distinction between (operator) overloading and multimethods.)

However, as some people know, Sun's HotSpot technology was actually a Smalltalk system which did provide optional typing annotation. When Sun purchased Animorphic to obtain the HotSpot technology to make Java as fast as or faster than the Smalltalks of that era, the features necessary to efficiently support Smalltalk and scripting languages (such as Python) seem to have inexplicably disappeared as it was morphed into Java.
---

All objects within Smalltalk know their type all the time, and all function/method invocations are bound based on the type information in a given object. If the Smalltalk dialect supports multi-methods (multi-arg dispatching), then all arguments play a role in the binding process.
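As a rough illustration of what "all arguments play a role in the binding process" means, the sketch below emulates two-argument dispatch in Ruby with a lookup table. `MultiMethod` is a hypothetical helper written for this example; it is not a built-in Ruby facility and says nothing about how SmallScript implements multimethods.

```ruby
# Hypothetical helper: picks a handler from the classes of BOTH arguments.
class MultiMethod
  def initialize
    @table = {}                      # maps [Class, Class] => handler block
  end

  def define(left, right, &body)
    @table[[left, right]] = body
    self
  end

  def call(a, b)
    # Walk both ancestor chains so subclasses match too. The first argument
    # is searched in the outer loop, so its specificity takes precedence.
    a.class.ancestors.each do |ka|
      b.class.ancestors.each do |kb|
        handler = @table[[ka, kb]]
        return handler.call(a, b) if handler
      end
    end
    raise NoMethodError, "no method for (#{a.class}, #{b.class})"
  end
end

combine = MultiMethod.new
combine.define(Integer, Integer) { |a, b| a + b }   # numeric addition
combine.define(String, Integer)  { |s, n| s * n }   # repeat the string
combine.define(Object, Object)   { |a, b| [a, b] }  # generic fallback

sum      = combine.call(2, 3)
repeated = combine.call("ab", 2)
fallback = combine.call(2, "ab")
```

Swapping the argument order changes which handler is chosen, which single-receiver dispatch alone cannot express without manual type tests.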
Inside the various VM platforms that provide the object model and execution architecture, many optimizations are typically performed to give many Smalltalk implementations performance that is competitive with statically compiled languages for most applications. Where there are hotspot areas, Smalltalks today typically make it fairly easy to transparently and efficiently invoke code written in other languages (kind of like a transparent JNI).

The reason Smalltalk is considered to be a pre-eminent "pure" OO language is that everything is an object within the type system, and all objects and message/method-calls are essentially treated uniformly. That means methods are objects, classes are objects, the class of a class... is an object, interfaces are objects, namespaces are objects, threads/processes are objects, etc.

This has the virtue that the metaobject protocol supports complete runtime reflection on everything, and the ability to morph any object to some other object and physical layout while preserving its identity.

Classes, methods, namespaces, interfaces, and objects can be added, removed, or updated within a running application at any time. This allows loading and unloading of packages and modules with far fewer of the traditional challenges of schema migration and versioning, or the problems and barriers of altering and extending a 24x7x365 system.

Methods can be added to or removed from any class, including <Object>. There are typically few if any subclassing limitations. Classes can change any characteristic at any time, even while they have instances in existence. So a class can change its (concrete mixin) interfaces, its namespace, its superclass, its shared variables, etc.

Depending on the dialect, methods are scoped and belong to modules. So methods can be added to any class at any time with full support for runtime-enforced private and protected (scoped) behavior.
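A small Ruby sketch of the same idea, for readers without a Smalltalk image at hand: a method is added to and then removed from a class while an instance already exists. The `Greeter` class is invented for the example, and Ruby's reflection is only an approximation of a Smalltalk metaobject protocol (it cannot, for instance, change a class's superclass).

```ruby
class Greeter
end

g = Greeter.new                        # instance created before the method exists
before = g.respond_to?(:greet)

# Add a method to the live class; the existing instance sees it immediately.
Greeter.class_eval do
  def greet(name)
    "hello, #{name}"
  end
end
after_add = g.greet("world")

# Methods and classes are themselves objects that can be inspected.
method_object  = g.method(:greet)
arity          = method_object.arity
class_of_class = Greeter.class

# Remove the method again while the instance is still alive.
Greeter.send(:remove_method, :greet)
after_remove = g.respond_to?(:greet)
```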
This means that one can define a project/module that replaces methods in <String> without affecting other unrelated modules that may depend on the "replaced" versions of those methods.

It also means that Smalltalk (a dynamically typed language) with optional typing and multimethods can provide runtime type signature/contract enforcement and correctness behavior just like in statically typed languages.

The Smalltalk I work in has facilities for full closure semantics, continuations, before/after/around behavior, read/write barriers, weak object references, intrinsic regular expressions, transparent FFI, interfaces (w/transparent COM integration), synchronized methods, blocking methods, dynamic managed object services (delegation to another (manager) object of all methods and read/write behavior at any time on any object), value and reference type fields (i.e., effectively allowing C++ style structs), full exception handling, and pre-emptive multi-threading. AND Smalltalks today are capable of supporting a very small application or component footprint (especially compared to other jitted virtual machine based languages).

Integrated metaobject-aware repository systems, combined with these dynamic capabilities for modifying a live running system and the lack of required type declarations for variables or casts on expressions, are what enable Smalltalk IDE tools to provide environments where developer productivity tends to surpass that of developers using Java IDE tools. That doesn't make Smalltalk better than Java; it just represents an area where Smalltalk has some better facilities.

However, the overall sum of capabilities enables Smalltalk systems exhibiting these kinds of capabilities to be among the most powerful and advanced language system platforms for aspect oriented programming and for today's increasingly important and demanded capabilities for supporting adaptive systems.
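The "before/after/around behavior" mentioned above is the core mechanism of aspect oriented programming. As a hedged sketch of the concept only, here is around-advice layered onto an existing class in Ruby using `Module#prepend`; the `Account` class and `Audited` module are hypothetical names, and this is not how SmallScript expresses it.

```ruby
class Account
  attr_reader :balance

  def initialize
    @balance = 0
  end

  def deposit(amount)
    @balance += amount
  end
end

# The "aspect": wraps deposit without editing Account's own source.
module Audited
  def log
    @log ||= []
  end

  def deposit(amount)
    log << "before deposit #{amount}"             # before-advice
    result = super                                # run the original method
    log << "after deposit, balance #{result}"     # after-advice
    result
  end
end

Account.prepend(Audited)   # applied to the live class at run time

acct = Account.new
acct.deposit(50)
```

Because the wrapper is attached at run time, it can be added to a running system without recompiling or subclassing the target class.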
What Smalltalk is not (relative to some other languages discussed in this post) is a well understood and popular language today. And, unlike Java VM platforms, Smalltalk VM platforms currently don't provide sandbox-style security.

There are dialect/vendor collaboration and consistency issues that remain to be resolved; advanced features within the metaobject facilities of the language and its virtual machine platform services vary from implementation to implementation. The higher level frameworks (such as UI), while very mature and robust, are often different from one vendor/dialect to another.

Perhaps one of my biggest concerns is that of convergence (portable facilities) for the various advanced metaobject technologies available in Smalltalk. These facilities form an important piece of the foundation for establishing portable frameworks that are comparable to those which appear to have developed for Java. The areas I'm particularly interested in seeing advance are annotations, namespaces, and modules/packages. Many of these capabilities have been available in one Smalltalk or another for almost ten years.

To me, an important (and very fuzzy/unclear) question is whether the vendors/dialects will find a way to work together in establishing consistency, or will exhibit a pattern of competition whereby they attempt to distinguish themselves based on some (but not all) of the kinds of metaobject capabilities I alluded to in this post. But (ignoring competitive business concerns) it is, after all, a very challenging technical problem.

Some of those frameworks are the latest generation of the original Smalltalk ones that defined the terms "frameworks", "object-oriented programming", and "classes", and that represent the invention of the "user interface" with windows and widgets.
Time will tell how this coming 29th-year generation of Smalltalk systems fares, but the nature of today's computing challenges is certainly pointing to an opportunity for Smalltalk to once again assume a position of leadership in the evolution of software technologies.

-- Dave Simmons [www.qks.com / www.smallscript.com]
  "Effectively solving a problem begins with how you express it."

For more information on Smalltalks available today, just search for Smalltalk. Or follow links such as http://www.smalltalk.org/ or http://www.smalltalk.org/versions.html.

>
> Here is one in perl...
>
> cat fact.pl
> use Math::BigInt;
>
> sub fact {
>     my $n = shift;
>     if ($n == 0){
>         return 1;
>     } else {
>         return $n * fact($n-1);
>     }
> }
>
> print fact(Math::BigInt->new($ARGV[0])), "\n";
>
> It's not very fast though:
>
> real 0m10.433s
> user 0m10.104s
> sys 0m0.033s
>
> Ruby seems to be able to do it much faster...
>
> $ cat fact.rb
> def fact(n)
>     if n == 0
>         1
>     else
>         n * fact(n-1)
>     end
> end
>
> print fact(ARGV[0].to_i), "\n"
>
> $ time ruby fact.rb 1000
>
> real 0m0.129s
> user 0m0.086s
> sys 0m0.034s
>
> Ruby is very much like smalltalk with a procedural language veneer and
> many of the 'good things' from Perl.
>
> Both output what I suspect is the correct value...
> ...snip...
> chris
> --
> This space intentionally left blank
Dave
You wrote in message
news:SH_06.26911$[hidden email]...
> ...
> I don't know what machine or Smalltalk dialect that was run on.
>
> But when I run it on 650MHz Athlon in QKS Smalltalk on the v4 AOS Platform.
>
> "1000 factorial" takes 24ms (.024s).

On my rather slower laptop machine (a 333 Celeron) Dolphin 4.0 takes just 13ms to execute 1000 factorial. This is one of the areas where Dolphin is faster than most (if not all) other Smalltalks. I've always thought it could be made a fair bit faster too with a bit of optimization.

> To try something that more fully tests multi-precision and GC performance
> try "10000 factorial", which on the same system takes 1257 ms (1.25s).

D4 takes 1371ms to do that on the same 333; still faster (bearing in mind relative machine speed), but the other factors you mention (i.e. other than LI arithmetic) are starting to come into play, evening out the difference.

I'm glad there is at least one area of VM design and performance where we at least appear to be ahead of AOS, Dave :-).

Regards

Blair

(Both figures are elapsed times measured using the microsecond clock/profiling counter)
"Blair McGlashan" <[hidden email]> wrote in message
news:92coer$6dq57$[hidden email]...
> Dave
>
> You wrote in message
> news:SH_06.26911$[hidden email]...
> > ...
> > I don't know what machine or Smalltalk dialect that was run on.
> >
> > But when I run it on 650MHz Athlon in QKS Smalltalk on the v4 AOS
> > Platform.
> >
> > "1000 factorial" takes 24ms (.024s).
>
> On my rather slower laptop machine (a 333 Celeron) Dolphin 4.0 takes
> just 13ms to execute 1000 factorial. This is one of the areas where Dolphin
> is faster than most (if not all) other Smalltalks. I've always thought it
> could be made a fair bit faster too with a bit of optimization.
>
> > To try something that more fully tests multi-precision and GC performance
> > try "10000 factorial", which on the same system takes 1257 ms (1.25s).
>
> D4 takes 1371ms to do that on the same 333; still faster (bearing in mind
> relative machine speed), but the other factors you mention (i.e. other than
> LI arithmetic) are starting to come into play, evening out the difference.
>
> I'm glad there is at least one area of VM design and performance where we at
> least appear to be ahead of AOS, Dave :-).

Blair, thank you for pointing this out. I really hadn't thought about how fast it ought to be. I've been so busy on other areas I hadn't really gone in and re-verified the heuristic feedback system in the garbage collection. And there have been quite a few changes since the v4 collector was last tuned a year or so ago...

When I read your post, the first thing I did was run it on our v3 system (which is slower than v4) to give me a ballpark reference point for comparison. The only substantive difference between v3 and v4 for the factorial test is in the garbage collector; the multi-precision code is using effectively the same algorithms.

The v3 AOS Platform yielded a result of 8ms for "1000 factorial" on the same hardware, versus v4 at 24ms. So I knew something was up.

After a few hours of mucking about exploring test cases and monitoring the GC patterns, I discovered two things.
a) The multi-precision numerics libraries in v4 were still calling templated C++ object constructors that I created during the bootstrapping process, principally to allow me to create objects at any time independently of the gc design. These were not only slow but also had a number of sanity checks in them.

b) The gc heuristic feedback system was significantly perturbed. That system decides when and how to execute various portions of the collector and, correspondingly, how aggressively to work at keeping memory defragmented (fragmentation results from pinning etc.). In other words, I had added some additional cases for special objects but had not gotten around to retuning the algorithms to account for the impact of those changes.

As I've mentioned in other posts and during presentations, the factorial and fibonacci cases are really good tests of the GC system. I just had not realized how significant they are...

I've been busy preparing the system for alpha testing and I had not planned on performing tuning at this stage. Now that you've prodded me into it, the system is significantly faster across the board.

There is clearly more refinement I can perform in the heuristics system, but here is what a few hours of heuristics tuning achieved:

AOS v4 Platform, 650MHz Athlon
------------------------------
a) The GC system now requires/consumes less memory during various
   baseline tests.

b) "1000 factorial" takes
   4-6ms depending on whether you use the cpu's cycle timer or the
   OS <QueryPerformanceCounter>. The <GetThreadTimes> call revealed
   that it had an accuracy/granularity of some ~10ms, so it was not
   valid here.

c) "10000 factorial" takes
   641ms, 633ms, 630ms (cycle timer, perf-timer, thread-times)

d) "100000 fibonacci" takes
   337ms, 337ms, 340ms (cycle timer, perf-timer, thread-times)

e) "1000000 fibonacci" takes
   3438ms, 3423ms, 3430ms (cycle timer, perf-timer, thread-times)

>
> Regards
>
> Blair
>
> (Both figures are elapsed times measured using the microsecond
> clock/profiling counter)

The updated execution times for "1000 factorial" are about the same as you're getting on a 333MHz Celeron, allowing for the cycle ratio of a 650MHz Athlon vs a 333MHz Celeron and assuming the Athlon caches are helping more than the Celeron's.

The performance of "1000 factorial" has improved by a factor of 4-6X.
The performance of "10000 factorial" has improved by a factor of 2X.
The performance of "100000 fibonacci" has improved by a factor of 15X.

Based on my v3 AOS Platform numbers, this is the kind of performance I would expect. I'm reasonably certain that the numbers could be improved by up to 20% or so with more extensive gc profiling and tuning analysis.

So I'd say that, for now anyway, we've got two Smalltalks demonstrating multi-precision performance that is probably an order of magnitude faster than Ruby, and a number of orders of magnitude faster than the originally posted Perl.

And I think it is probably a fair guess that it is somewhat representative of the performance of gc sensitive code on such systems. In other words, I'm betting the actual numerics calculations are about as fast as they can be. What actually varies is the GC behavior.

It adds confidence that SmallScript should execute competitively well in its scripting language role.

Again, it's hard to compare because we don't know what the original processor was for those Ruby and Perl numbers, but these are good approximations assuming a 200MHz cpu for the Ruby and Perl tests.

Perl (1000 factorial)
----
> It's not very fast though:
>
> real 0m10.433s
> user 0m10.104s
> sys 0m0.033s

Ruby (1000 factorial)
----
> time ruby fact.rb 1000
>
> real 0m0.129s
> user 0m0.086s
> sys 0m0.034s

-- Dave Simmons [www.qks.com / www.smallscript.com]
  "Effectively solving a problem begins with how you express it."
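For anyone wanting to check the improvement factors and the "relative machine speed" comparison in this thread, the arithmetic can be sketched as below. All timings are copied from the posts above; the hash layout and the `cycles` helper are illustrative assumptions, and clock cycles are only a crude way to compare different CPUs.

```ruby
# Timings in milliseconds, taken from the posts in this thread.
before = { "1000 factorial"   => 24,
           "10000 factorial"  => 1257,
           "100000 fibonacci" => 5339 }
after  = { "1000 factorial"   => [4, 6],     # range reported after tuning
           "10000 factorial"  => [641],
           "100000 fibonacci" => [337] }

# Speed-up factor = old time / new time, as a [min, max] pair per test.
speedups = before.map do |name, old_ms|
  factors = after[name].map { |new_ms| (old_ms.to_f / new_ms).round(1) }
  [name, factors.minmax]
end.to_h

# Rough cross-machine normalisation: elapsed ms * MHz * 1000 = clock cycles.
def cycles(ms, mhz)
  ms * mhz * 1000
end

dolphin_cycles   = cycles(13, 333)   # Dolphin 4.0 on the 333MHz Celeron
aos_before       = cycles(24, 650)   # AOS v4 before tuning, 650MHz Athlon
aos_after_approx = cycles(5, 650)    # midpoint of the 4-6ms range
```

On these figures the tuned AOS result and the Dolphin result land within the same few million cycles, which is consistent with the "about the same" remark above.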