Re: Language evolution C->Perl->C++->Java->Python (Is Python the ULTIMATE of languages??)

David Simmons

"Chris Fedde" <[hidden email]> wrote in message
news:nqb%5.295$[hidden email]...
> In article <[hidden email]>,
> Just Me <[hidden email]> wrote:
> >Try writing this in your favourite language:
> >
> > 1000 factorial
> > -> a very large number which causes overflow if
> > not evaluated in Smalltalk
> >

I don't know what machine or Smalltalk dialect that was run on.

But when I run it on a 650MHz Athlon in QKS Smalltalk on the v4 AOS Platform,
"1000 factorial" takes 24ms (.024s).

To try something that more fully tests multi-precision and GC performance
try "10000 factorial", which on the same system takes 1257 ms (1.25s).

It actually takes a little less time than that which is reported, because
the reporting is based on measuring the actual cpu cycles consumed from
start to finish (which are shared among the Win2000 kernel, processes, and
their threads).

[
    "650MHz = 650*1000 cycles per millisecond, so the result is in ms"
    stdout cr << 'TIME: ' <<
        ([
            10000 factorial.
        ] cyclesToRun // (650*1000)).
]

TIME: 1257 "I.e., 1.257s"

[
    stdout cr << 'TIME: ' <<
        ([
            100000 fibonacci.
        ] cyclesToRun // (650*1000)).
]

TIME: 5339 "I.e., 5.339s"
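
The division by (650*1000) in the snippets above turns a raw cycle count
into milliseconds: a 650MHz clock ticks 650*1000 times per millisecond.
Here is a minimal Python sketch of that arithmetic. The name cycles_to_ms is
made up for illustration, and the cycle count is back-computed from the
reported 1257ms, not measured.

```python
def cycles_to_ms(cycles, clock_mhz):
    """Convert a CPU cycle count to whole milliseconds.

    Uses integer division, like the Smalltalk // operator.
    """
    cycles_per_ms = clock_mhz * 1000  # 1 MHz = 1000 cycles per millisecond
    return cycles // cycles_per_ms


# Hypothetical cycle count, chosen to match the reported 1257ms at 650MHz.
example_cycles = 1257 * 650 * 1000
print(cycles_to_ms(example_cycles, 650))
```

Because the counter measures every cycle between start and finish, time
spent in other threads and the kernel is included, which is why the real
figure is somewhat lower than the one reported.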

=========================
There are lots of other
examples we could try:

=========================

The Smalltalk I work in (SmallScript/QKS Smalltalk) has multi-methods
(multi-arg dispatching), concrete mixin types (interfaces), optional typing
with type-cases and parametric polymorphism, etc. Someone in one of the
earlier posts said that Smalltalk doesn't have types?

The statement that Smalltalk is not typed is of course absurd. Smalltalk is
and always has been (fundamentally) strongly typed. Its variables and
function/method-calls are not statically typed; they are 100% dynamically
typed, and implementations typically support some level of implicit/inferred
(or annotated) static type capabilities/performance. In simple terms, the
objects are 100% typed; variables are by default (implicitly) containers for
type <any>.
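
The same split between typed objects and untyped variables can be seen in
Python, which is used here only as an analogy and is not Smalltalk code. In
this small sketch the helper names describe and add_strictly are invented
for illustration.

```python
def describe(value):
    """Return the name of the runtime type carried by the object itself."""
    return type(value).__name__


# The variable `x` has no declared type; it can hold any object.
x = 42
first = describe(x)       # the integer object knows it is an int
x = "forty-two"
second = describe(x)      # same variable, now holding a str object


def add_strictly(a, b):
    """Strong typing: mismatched operand types are rejected at runtime."""
    try:
        return a + b
    except TypeError:
        return "rejected"


print(first, second, add_strictly(1, "1"))
```

The variable never had a type, but every object it held did, and the
runtime refused to silently mix an integer with a string.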

The dynamic language platforms (VMs) that Smalltalks typically execute on,
unlike the Java platform and the .NET architecture, have a more unified object
model with a single type at the root of the hierarchy, <Object>. This allows
one to declare a variable that can hold ANY type (i.e., any object because
everything is an object). The only thing that is not (required to be) typed
in Smalltalk are variables. Unless otherwise typed, they are generic
containers that by default hold type <Object> or equivalently <any>.

---
Note: Many of the Smalltalk implementations/dialects currently do not have
facilities for supporting <type> annotations and therefore those versions
currently cannot support multimethods. (For those who may be unclear, in a
pure dynamic messaging OO language there is NO distinction between
(operator) overloading and multimethods). However, as some people know,
Sun's HotSpot technology was actually a Smalltalk system which did provide
optional typing annotation. When Sun purchased Animorphic to obtain the
HotSpot technology to make Java as fast or faster than Smalltalks of that
era, the features necessary to efficiently support Smalltalk and scripting
languages (such as Python) seem to have inexplicably disappeared as it was
morphed into Java.
---

All objects within Smalltalk have-a (know-their) type all the time and all
function/method invocations are bound based on the type information in a
given object. If the Smalltalk dialect supports multi-methods (multi-arg
dispatching) then all arguments play a role in the binding process. Inside
the various VM platforms that provide the object model and execution
architecture, many optimizations are typically performed to give many
Smalltalk implementations performance that is competitive with statically
compiled languages for most applications. Where there are hotspot areas,
Smalltalks today typically make it fairly easy, even fully transparent, to
efficiently invoke code written in other languages (kind of like a
transparent JNI).
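
To make "all arguments play a role in the binding process" concrete, here is
a toy multi-method dispatcher in Python. It is only a sketch of the idea and
not how SmallScript or any Smalltalk VM implements it. The MultiMethod class
and the combine function are hypothetical names, and the lookup order is a
simple left-to-right walk over each argument's class hierarchy, where real
systems rank candidates more carefully.

```python
from itertools import product


class MultiMethod:
    """Toy multi-method: picks an implementation from the runtime types of
    ALL arguments, not just the first one."""

    def __init__(self):
        self._impls = {}

    def register(self, *types):
        """Decorator that records an implementation for a tuple of types."""
        def decorator(func):
            self._impls[types] = func
            return func
        return decorator

    def __call__(self, *args):
        # Walk every combination of the arguments' classes and superclasses.
        # product() varies the rightmost argument fastest, so the leftmost
        # argument stays at its most specific class for as long as possible.
        mros = [type(arg).__mro__ for arg in args]
        for candidate in product(*mros):
            impl = self._impls.get(candidate)
            if impl is not None:
                return impl(*args)
        raise TypeError("no applicable method")


combine = MultiMethod()


@combine.register(int, int)
def _(a, b):
    return "int+int"


@combine.register(int, str)
def _(a, b):
    return "int+str"


@combine.register(object, object)
def _(a, b):
    return "fallback"


print(combine(1, 2), combine(1, "a"), combine("a", 1))
```

Note that the second argument alone decides between the first two
implementations, which single-receiver dispatch cannot express directly.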

The reason Smalltalk is considered to be a pre-eminent "pure" OO language is
that everything is an object within the type system and all objects and
message/method-calls are essentially treated uniformly. That means methods
are objects, classes are objects, the class of a class... is an object,
interfaces are objects, namespaces are objects, threads/processes are
objects, etc. This has the virtue that the metaobject protocol supports
complete runtime reflection on everything and the ability to morph any object
into some other object and physical layout while preserving its identity.

Classes, methods, namespaces, interfaces, objects can be added, removed,
updated within a running application at any time. This allows loading and
unloading of packages and modules with far fewer of the traditional
challenges of schema migration and versioning or the problems and barriers
of altering and extending a 24x7x365 system. Methods can be added or removed
to/from any class including <Object>. There are typically few if any
subclassing limitations. Classes can change any characteristic at any time,
even while they have instances in existence. So a class can change its
(concrete mixins) interfaces, its namespace, its superclass, its shared
variables, etc.
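
A rough Python analogy of changing a class while instances exist, and of
morphing an object to another class while keeping its identity, is sketched
here. The Account and AuditedAccount classes are invented for illustration.
Python is more limited than what is described above: for example, it does
not allow adding methods to its built-in root class object, so the sketch
sticks to user-defined classes.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance


class AuditedAccount(Account):
    def describe(self):
        return f"audited:{self.balance}"


acct = Account(100)           # instance created before any change below
identity_before = id(acct)


# Add a method to the class while an instance already exists.
def describe(self):
    return f"plain:{self.balance}"


Account.describe = describe
plain = acct.describe()       # the existing instance sees the new method

# Re-class the live object; its identity and state are preserved.
acct.__class__ = AuditedAccount
audited = acct.describe()

print(plain, audited, id(acct) == identity_before)
```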

Depending on the dialect, methods are scoped and belong to modules. So
methods can be added to any class at any time with full support for runtime
enforced private and protected (scoped) behavior. This means that one can
define a project/module that replaces methods in <String> without affecting
other unrelated modules that may depend on the "replaced" versions of those
methods. It also means that Smalltalk (a dynamically typed language) with
optional typing and multimethods can provide runtime type signature/contract
enforcement and correctness behavior just like in statically typed
languages.

The Smalltalk I work in has facilities for full closure semantics,
continuations, before/after/around behavior, read/write barriers,
weak-object-references, intrinsic regular expressions, transparent FFI,
interfaces (w/transparent COM integration), synchronized methods, blocking
methods, dynamic managed object services (delegation to another (manager)
object of all methods and read/write behavior at any time on any object),
value and reference type fields (i.e., effectively allowing C++ style
structs), full exception handling, and pre-emptive multi-threading. AND
Smalltalks today are capable of supporting a very small application or
component footprint (especially compared to other jitted virtual machine
based languages),...

Integrated metaobject aware repository systems combined with these dynamic
capabilities for modifying a live running system, and the lack of required
type declarations for variables or casts on expressions are what enable
Smalltalk IDE tools to provide environments where developer productivity
tends to surpass that of developers using Java IDE tools. That doesn't make
Smalltalk better than Java, it just represents an area where Smalltalk has
some better facilities.

However, the overall sum of capabilities enables Smalltalk systems
exhibiting these kinds of capabilities to be among the most powerful and
advanced language system platforms for aspect oriented programming and
today's increasingly important and demanded capabilities for supporting
adaptive systems.

What Smalltalk is not (relative to some other languages discussed in this
post) is a well understood and popular language today. And, unlike Java VM
Platforms, Smalltalk VM Platforms currently don't provide sandbox style
security. There are dialect/vendor collaboration and consistency issues that
remain to be resolved; advanced features within the metaobject facilities of
the language and its virtual machine platform services vary from
implementation to implementation. The higher level frameworks (such as UI),
while very mature and robust, are often different from one vendor/dialect to
another.

Perhaps one of my biggest concerns is that of convergence (portable
facilities) for the various advanced metaobject technologies available in
Smalltalk. These facilities form an important piece of the foundation for
establishing portable frameworks that are comparable to those which appear
to have developed for Java. One area in particular, for capabilities I'm
interested in, is advancements in: annotations, namespaces, and
modules/packages. Many of these capabilities have been available in one
Smalltalk or another for almost ten years. To me, an important (and very
fuzzy/unclear) question is whether the vendors/dialects will find a way to
work together in establishing consistency or will exhibit a pattern of
competition whereby they attempt to distinguish themselves based on some
(but not all) of the kinds of metaobject capabilities I alluded to in this
post.

But, (ignoring competitive business concerns) it is after all a very
challenging technical problem. Some of those frameworks are the latest
generation of the original Smalltalk ones that defined the terms
"frameworks", "object-oriented programming", and "classes", and that
represent the invention of the "user interface" with windows and widgets.
Time will tell how this coming 29th-year generation of Smalltalk systems
fares, but the nature of today's computing challenges is certainly pointing
to an opportunity for Smalltalk to once again find that this is its time for
reassuming a position of leadership in the evolution of software
technologies.

-- Dave Simmons [www.qks.com / www.smallscript.com]
"Effectively solving a problem begins with how you express it."

For more information on Smalltalks available today, just search for
Smalltalk. Or follow links such as http://www.smalltalk.org/ or
http://www.smalltalk.org/versions.html.

>
> Here is one in perl...
>
> cat fact.pl
> use Math::BigInt;
>
> sub fact {
> my $n = shift;
> if ($n == 0){
> return 1;
> } else {
> return $n * fact($n-1);
> }
> }
>
> print fact(Math::BigInt->new($ARGV[0])), "\n";
>
> It's not very fast though:
>
> real 0m10.433s
> user 0m10.104s
> sys 0m0.033s
>
> Ruby seems to be able to do it much faster...
>
> $ cat fact.rb
> def fact(n)
> if n == 0
> 1
> else
> n * fact(n-1)
> end
> end
>
> print fact(ARGV[0].to_i), "\n"
>
> $ time ruby fact.rb 1000
>
> real 0m0.129s
> user 0m0.086s
> sys 0m0.034s
>
> Ruby is very much like smalltalk with a procedural language veneer and
> many of the 'good things' from Perl.
>
> Both output what I suspect is the correct value...
>

...snip...

> chris
> --
> This space intentionally left blank

Blair McGlashan

Re: Language evolution C->Perl->C++->Java->Python (Is Python the ULTIMATE oflanguages??)

Dave

You wrote in message
news:SH_06.26911$[hidden email]...
> ...
> I don't know what machine or Smalltalk dialect that was run on.
>
> But when I run it on 650MHz Athlon in QKS Smalltalk on the v4 AOS Platform.
>
> "1000 factorial" takes 24ms (.024s).

On my rather slower laptop machine (a 333 Celeron) Dolphin 4.0 takes
just 13ms to execute 1000 factorial. This is one of the areas where Dolphin
is faster than most (if not all) other Smalltalks. I've always thought it
could be made a fair bit faster too with a bit of optimization.

> To try something that more fully tests multi-precision and GC performance
> try "10000 factorial", which on the same system takes 1257 ms (1.25s).

D4 takes 1371ms to do that on the same 333; still faster (bearing in mind
relative machine speed), but the other factors you mention (i.e. other than
LI arithmetic) are starting to come into play, evening out the difference.

I'm glad there is at least one area of VM design and performance where we at
least appear to be ahead of AOS, Dave :-).

Regards

Blair

(Both figures are elapsed times measured using the microsecond
clock/profiling counter)

David Simmons

"Blair McGlashan" <[hidden email]> wrote in message
news:92coer$6dq57$[hidden email]...

> Dave
>
> You wrote in message
> news:SH_06.26911$[hidden email]...
> > ...
> > I don't know what machine or Smalltalk dialect that was run on.
> >
> > But when I run it on 650MHz Athlon in QKS Smalltalk on the v4 AOS Platform.
> >
> > "1000 factorial" takes 24ms (.024s).
>
> On my rather slower laptop machine (a 333 Celeron) Dolphin 4.0 takes
> just 13ms to execute 1000 factorial. This is one of the areas where Dolphin
> is faster than most (if not all) other Smalltalks. I've always thought it
> could be made a fair bit faster too with a bit of optimization.
>
> > To try something that more fully tests multi-precision and GC performance
> > try "10000 factorial", which on the same system takes 1257 ms (1.25s).
>
> D4 takes 1371ms to do that on the same 333; still faster (bearing in mind
> relative machine speed), but the other factors you mention (i.e. other than
> LI arithmetic) are starting to come into play, evening out the difference.
>
> I'm glad there is at least one area of VM design and performance where we at
> least appear to be ahead of AOS, Dave :-).

Blair, thank you for pointing this out. I really hadn't thought about how
fast it ought to be.

I've been so busy on other areas I hadn't really gone in and re-verified the
heuristic feedback system in the garbage collection. And, there have been
quite a few changes since the v4 collector was last tuned a year or so
ago...

When I read your post, the first thing I did was run it on our v3 system
(which is slower than v4) to give me a ballpark reference point for
comparison. The only substantive difference between v3 and v4 for the
factorial test is in the garbage collector; the multi-precision code uses
effectively the same algorithms.

The v3 AOS Platform yielded a result of 8ms for "1000 factorial" on the same
hardware versus v4 at 24ms. So I knew something was up. After a few hours of
mucking about exploring test cases and monitoring the GC patterns I
discovered two things.

a) The multi-precision numerics libraries in v4 were still calling templated
c++ object constructors that I created during the bootstrapping process --
principally to allow me to create objects at any time independently of the
gc design. These were not only slow but had a number of checks in them
for sanity issues.

b) The gc heuristic feedback system was significantly perturbed. That system
decides when and how to execute various portions of the collector, and
correspondingly, how aggressively to work at keeping memory defragmented
(which results from pinning etc). In other words, I had added some
additional cases for special objects but I had not got around to retuning
the algorithms to account for the impact of those changes.

As I've mentioned in other posts and during presentations, the factorial and
fibonacci cases are really good tests of the GC system. I just had not
realized how significant they are...

I've been busy preparing the system for alpha testing and I had not planned
on performing tuning at this stage. Now that you've prodded me into it,
the system is significantly faster across the board. There is clearly more
refinement I can perform in the heuristics system, but here is what a few
hours of heuristics tuning achieved:

AOS v4 Platform, 650MHz Athlon
------------------------------
a) The GC system now requires/consumes less memory during
   various baseline tests.

b) "1000 factorial" takes 4-6ms depending on whether you use the
   cpu's cycle timer or the OS <QueryPerformanceCounter>. The
   <GetThreadTimes> call revealed that it had an accuracy/
   granularity of some ~10ms so it was not valid here.

c) "10000 factorial" takes
   641ms, 633ms, 630ms (cycle timer, perf-timer, thread-times)

d) "100000 fibonacci" takes
   337ms, 337ms, 340ms (cycle timer, perf-timer, thread-times)

e) "1000000 fibonacci" takes
   3438ms, 3423ms, 3430ms (cycle timer, perf-timer, thread-times)
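
The point about timer granularity in (b) applies to any timing harness: a
clock with ~10ms resolution cannot measure a 5ms run. As a rough
illustration in Python (not the AOS timing code), the sketch below times the
same work with two standard-library clocks and prints the resolution each
one reports. The helper names time_ms and work are invented here, and the
numbers will differ from machine to machine.

```python
import math
import time


def time_ms(clock, work):
    """Run `work` once and return elapsed milliseconds according to `clock`."""
    start = clock()
    work()
    return (clock() - start) * 1000.0


def work():
    math.factorial(1000)


timings = {
    # High-resolution elapsed (wall-clock) time.
    "perf_counter": time_ms(time.perf_counter, work),
    # CPU time charged to this process; often coarser on some platforms.
    "process_time": time_ms(time.process_time, work),
}

for name, elapsed in timings.items():
    resolution = time.get_clock_info(name).resolution
    print(f"{name}: {elapsed:.3f} ms (reported resolution {resolution} s)")
```

When the reported resolution is close to or larger than the elapsed time,
the measurement is not meaningful and the work should be repeated in a loop
or timed with a finer clock.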

>
> Regards
>
> Blair
>
> (Both figures are elapsed times measured using the microsecond
> clock/profiling counter)

The updated execution times for "1000 factorial" are about the same as what
you're getting on a 333MHz Celeron, once you account for the clock ratio of
a 650MHz Athlon vs a 333MHz Celeron and assume the Athlon caches are helping
more than the Celeron's.

The performance of "1000 factorial" has improved by a factor of 4-6X.
The performance of "10000 factorial" has improved by a factor of 2X.
The performance of "100000 fibonacci" has improved by a factor of 15X.
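
For anyone who wants to check those factors, here is the arithmetic as a
short Python sketch using only the timings quoted in this thread. The
clock-ratio scaling at the end is a naive assumption that ignores caches,
memory speed and architecture differences.

```python
# Timings in milliseconds as reported in this thread: (before, after tuning).
before_after = {
    "1000 factorial": (24, 6),         # after is 4-6ms; 6 is the slow end
    "10000 factorial": (1257, 641),
    "100000 fibonacci": (5339, 337),
}

speedups = {
    name: before / after for name, (before, after) in before_after.items()
}
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")

# Naive clock-ratio scaling of Blair's 13ms on a 333MHz Celeron to 650MHz.
scaled_ms = 13 * 333 / 650
print(f"scaled Dolphin estimate: {scaled_ms:.1f} ms")
```

The scaled Dolphin figure lands just above the 4-6ms range, which is
consistent with calling the two results about the same.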

Based on my v3 AOS Platform numbers, this is the kind of performance I
would expect. I'm reasonably certain that it could be improved by
up to 20% or so with more extensive gc profiling and tuning analysis.

So I'd say that for now anyway, we've gotten two Smalltalks demonstrating
multi-precision performance that is probably an order of magnitude faster
than Ruby; and a number of orders of magnitude faster than the originally
posted Perl.

And I think it is probably a fair guess that it is somewhat representative
of the performance of gc sensitive code on such systems. In other words, I'm
betting the actual numerics calculations are about as fast as they can be.
The actual variable is the GC behavior. It adds confidence that SmallScript
should execute competitively well in its scripting language role.

Again, it's hard to compare because we don't know what the original processor
was for those Ruby and Perl numbers -- but these are good approximations
assuming a 200MHz cpu for the Ruby and Perl tests.

Perl (1000 factorial)
----
> It's not very fast though:
>
> real 0m10.433s
> user 0m10.104s
> sys 0m0.033s

Ruby (1000 factorial)
----
> time ruby fact.rb 1000
>
> real 0m0.129s
> user 0m0.086s
> sys 0m0.034s

-- Dave Simmons [www.qks.com / www.smallscript.com]
"Effectively solving a problem begins with how you express it."