The cost of sending a message

The cost of sending a message

Fernando Rodríguez
Hi,

This code is from the Dolphin Companion. It's a method of the Driver
class, which also understands the methods firstname, surname and name.
All of them return Strings.

setName
        "Create the name from the surname and firstname.
        This method is called when either of them changes."
        | tmpName |
        tmpName := String new.
        self surname notNil ifTrue: [tmpName := self surname].
        (self firstname notNil and: [self firstname notEmpty])
                ifTrue: [ tmpName := self firstname, ' ', tmpName ].
        self name: tmpName.

This method calls surname and firstname several times, even though
their values haven't changed between calls.

For someone used to VB this is a bit surprising, as using a temp
variable to cache those values is much faster than calling a method
(due to COM overhead).

Is this the usual Smalltalk way? Is the cost of method calling
negligible?

Thanks



Re: The cost of sending a message

Ian Bartholomew-19
Fernando,

> Is the cost of method calling
> negligible?

More or less. You can do little benchmarks to test it yourself. For (a
really crude) example [1], create a class and add the methods

test1
    | a b |
    b := self val.
    ^Time millisecondsToRun: [10000000 timesRepeat: [a := b]]

test2
    | a |
    ^Time millisecondsToRun: [10000000 timesRepeat: [a := self val]]

val
    ^1

On my system test2 takes about twice as long as test1 (~600 ms as against
~300 ms), so there is an extra overhead for the extra message sends, but it's
not particularly large in terms of actual time: roughly 300 ms spread over
10 million sends.  Because message sends are rather common in Smalltalk :-)
most implementations make sure they are very efficient.

> Is this the usual Smalltalk way?

It really depends on what the target method has to do to return a value.
If it's more than a simple accessor then it's probably better to cache the
value in a variable, at either end.  If it's just a simple accessor, a
constant or returning an inst var's value for example, then it really
depends on which form you prefer.  Local variables sometimes make a method
clearer but can also make it more difficult to read.

Kent Beck's book "Smalltalk Best Practice Patterns" (IMHO a must read for
any Smalltalker) has a section on this that's worth a read.

FWIW, the method you quote could now be written (after the inclusion of
ifNil/ifNotNil into a base Dolphin image) as something like the following
(untested) which gets around the problem anyway

    | tmpName |
    tmpName := String new.
    self surname ifNotNil: [:arg | tmpName := arg].
    self firstname ifNotNil: [:arg |
        arg notEmpty ifTrue: [
            tmpName := arg , ' ' , tmpName]].
    self name: tmpName

[1] I know Dolphin does a bit of optimization on certain operations (IIRC
including accessor methods?) so if this is a not particularly valid
experiment then I apologize.

--
Ian

Use the Reply-To address to contact me.
Mail sent to the From address is ignored.



Re: The cost of sending a message

Schwab,Wilhelm K
In reply to this post by Fernando Rodríguez
Fernando,

> This code is from the Dolphin Companion. It's a method of the Driver
> class, which also understands the methods firstname, surname and name.
> All of them return Strings.
>
> setName
> "Create the name from the surname and firstname.
> This method is called when either of them changes."
> | tmpName |
> tmpName := String new.
> self surname notNil ifTrue: [tmpName := self surname].
> (self firstname notNil and: [self firstname notEmpty])
> ifTrue: [ tmpName := self firstname, ' ', tmpName ].
> self name: tmpName.
>
> This method calls surname and firstname several times, even though
> their values haven't changed between calls.

That is a very valid observation, but how often is #setName called?
Would the logical change make any real difference?

A way to find out would be to run code under Ian's profiler.  Machines
have gotten faster over the past several years, to the point that a
millisecond is really quite a long time, so you might want to profile a
loop that does a particular job many times.
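
As a rough illustration of "a loop that does a particular job many
times", here is a workspace sketch that reuses the Time
millisecondsToRun: idiom from Ian's post.  It is plain wall-clock timing,
not Ian's profiler, and it assumes that Driver new answers a usable
instance and that #setName can be sent to it directly; the repeat count
is arbitrary.

```smalltalk
"Workspace sketch, not from the Companion.
Assumes Driver new answers a usable instance."
| driver |
driver := Driver new.
Time millisecondsToRun: [1000000 timesRepeat: [driver setName]]
```

Evaluating this with "Display It", once for each version of #setName,
should show whether the extra self-sends are measurable at all.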

Dan Ingalls et al. were faced with making something run on much slower
hardware [1] than we have now.  The 275 kilobytes (k, not M) of RAM he
needed was almost unthinkable in those days.  They did it in part by
making the common things very efficient, and not paying too much
attention to the rare stuff.  You can see that to this day with
Smalltalk code that runs when a primitive fails; some of it simply
signals an error, other times it does something "tricky" that would be
much harder to code in a primitive than in Smalltalk.

There is also the standard advice:

(1) make it run
(2) make it correct
(3) make it fast

Do them in that order.


[1] IIRC, Alan Kay has commented that modern caching pays little
attention to the lessons learned in the 60's, making the actual
throughput not nearly so large as one would expect looking at the clock
speeds.  Specifics/references would be greatly appreciated.


> For someone used to VB this is a bit surprising, as using a temp
> variable to cache those values is much faster than calling a method
> (due to COM overhead).

It's not just COM overhead; it's Automation overhead in many cases.


> Is this the usual Smalltalk way? Is the cost of method calling
> negligible?

As Ian said, a running image sends a lot of messages, so implementors
try to make it efficient.  However, any cost will eventually add up to
something noticeable.  You are wise to pay attention to the efficiency
of code, but note that you can typically get a lot further by following
the three steps above, in order.  They are partly based on the reality
that many projects are throw-aways and do not require tuning (Smalltalk
will allow you to accomplish them MUCH faster than C* folks), and also
on the long-standing observation that programmers typically have
very poor intuition when it comes to identifying performance
bottlenecks.  Get something running, and IF the performance is not to
your needs or liking, use a profiler to find out why and fix it.

Note that this somewhat cavalier attitude would be riskier in C*, as
design changes are often necessary.  However, by the time you get to
step (3), you should have unit tests that will help you get back to step
(2) after you make the design changes.

Keep in mind that you can always fall back on a C/C++ DLL for situations
that need very high performance.  However, you should resort to that
only when you are certain it is necessary, or there is some other reason
(e.g. existing code for numerical analysis, etc.) to use another
language.  #setName will not be likely to require a DLL.

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]



Re: The cost of sending a message

Chris Uppal-3
In reply to this post by Fernando Rodríguez
Fernando wrote:

> This method calls surname and firstname several times, even though
> their values haven't changed between calls.
>
> For someone used to VB this is a bit surprising, as using a temp
> variable to cache those values is much faster than calling a method
> (due to COM overhead).
>
> Is this the usual Smalltalk way? Is the cost of method calling
> negligible?

(I've read the answers from Ian and Bill, and agree (mostly) with both -- this
is just to present my take on the matter)

First off, yes the cost is usually negligible.  On my 1.3 GHz laptop, using a
"getter" method adds around 21 nanoseconds to the cost of accessing the variable
directly.  That's to say, if you executed 50 million getter methods, then that
would add around 1 second to your total runtime.  The figure for "setters" is
around 5 times higher.  For most purposes numbers like that can be considered
negligible.  (BTW, as Ian mentioned, Dolphin uses an optimised method-call
implementation for getters and setters -- which makes the method-call overhead
lower than usual -- but I don't think that invalidates the measurement, since
it's the cost of getter methods that is under discussion.)

Secondly, yes there is a substantial body of opinion that says that the use of
getter methods is correct, and people who believe that would probably also
write the code as Ted did -- they would think that pulling the values into
local variables was a premature optimisation, and something they wouldn't do
unless/until it was demonstrated that the optimisation was actually useful in
practice (which it almost certainly wouldn't be).

However, there is another reason why you might pull the values into local
variables -- it might make the code clearer.  All the self-sends tend to
clutter up the code and make it less readable.  But then if you think that,
then you'll probably agree with me (/and/ Kent Beck and OA ;-) that
getter/setter methods are /not/ a good idea in general.  I'm in the "strongly
opposed to accessor methods" camp myself (i.e. I think the "substantial body of
opinion", mentioned above, is Just Plain Wrong) -- I think they obfuscate the
code, hide what's going on (an object doesn't have to hide implementation
details from /itself/ for God's sake ! ;-), and anyway the very concept of
"accessor method" is an affront to OO thinking.  I've never seen a
justification offered for their use that made much sense at all to me.  And, as
icing on the cake of a bad idea, they also run more slowly...

(<cough> sorry, rant-mode off)

But anyway, if you want to use (private) accessor methods in your code, then
that's a matter of style.  If you end up using getters, and /also/ using local
variables to "cache" the results of the getters, then I think you would just be
making life unnecessarily difficult for yourself.
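
For comparison, here is roughly what #setName might look like with the
private getters dropped in favour of direct instance variable access.
This is an untested sketch: it assumes Driver has instance variables
called surname, firstname and name, which nothing in this thread
confirms.

```smalltalk
setName
    "Sketch only: assumes instance variables named surname,
    firstname and name exist in Driver."
    | tmpName |
    tmpName := surname ifNil: [String new].
    (firstname notNil and: [firstname notEmpty])
        ifTrue: [tmpName := firstname , ' ' , tmpName].
    name := tmpName
```

Note that assigning to name directly skips whatever else #name: does
(triggering a change notification, say), so the last line would need to
stay as self name: tmpName if that behaviour matters.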

    -- chris



Re: The cost of sending a message

Fernando Rodríguez
On Mon, 17 Jan 2005 09:30:37 -0000, "Chris Uppal"
<[hidden email]> wrote:


>However, there is another reason why you might pull the values into local
>variables -- it might make the code clearer.  All the self-sends tend to
>clutter up the code and make it less readable.  But then if you think that,
>then you'll probably agree with me (/and/ Kent Beck and OA ;-) that
>getter/setter methods are /not/ a good idea in general.  I'm in the "strongly
>opposed to accessor methods" camp myself (i.e. I think the "substantial body of
>opinion", mentioned above, is Just Plain Wrong) -- I think they obfuscate the
>code, hide what's going on (an object doesn't have to hide implementation
>details from /itself/ for God's sake ! ;-), and anyway the very concept of
>"accessor method" is an affront to OO thinking.  I've never seen a
>justification offered for their use that made much sense at all to me.  And, as
>icing on the cake of a bad idea, they also run more slowly...

Thanks, you convinced me... sort of. ;-)

I agree that using an accessor method from within the class
doesn't make any sense, but you seem to go further and consider the
usage of any accessor method fundamentally wrong. Is this correct? If
yes, why? O:-)

Thanks



Re: The cost of sending a message

Schwab,Wilhelm K
Fernando,

> I agree that using an accessor method from within the class
> doesn't make any sense,

That depends.  It makes a lot of sense where lazy initialization is of
benefit.  It can also be useful in debugging.
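
A lazily initialized getter might look like the sketch below.  It
assumes an instance variable called name and a hypothetical private
helper #buildName that answers the combined String; neither comes from
the Companion code.  It relies on ifNil: answering the receiver when the
receiver is not nil.

```smalltalk
name
    "Answer the full name, building it on first access.
    #buildName is a hypothetical helper, not an existing method."
    ^name ifNil: [name := self buildName]
```

With this form every internal reference has to go through self name,
since reading the instance variable directly could see nil.  Anything
that changes the surname or firstname would also have to reset name to
nil so that it gets rebuilt.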

I agree that knee-jerk use of accessors is bad, because it defeats
encapsulation, among other things.  To use a procedural analogy,
everybody was so worried about using goto, they lost sight of the _real_
danger, which was the label.  Accessors can be bad for similar reasons.

Have a good one,

Bill


--
Wilhelm K. Schwab, Ph.D.
[hidden email]



Re: The cost of sending a message

Chris Uppal-3
In reply to this post by Fernando Rodríguez
Fernando,

> > [...] and anyway the very concept of "accessor method" is
> > an affront to OO thinking.
> [...]
> I agree that using an accessor method from within the class
> doesn't make any sense, but you seem to go further and consider the
> usage of any accessor method fundamentally wrong. Is this correct?

First off, let me emphasise that this is all only my personal opinion.

What I meant was that the /concept/ of an "accessor method" is non-OO.  I have
no objection whatever to a method that answers an object's height, even if it
is coded as:

    height
        ^ height.

(except that I would deplore the absence of a comment -- "height" is by no
means an unambiguous word).  On the other hand the method might just as well be
coded:

    height
        ^ self map altitudeAtMapReference: self mapReference.


or:

    height
        ^ initialHeight + (growthRate * age / 100.0).

or:

    height
        ^ top - bottom.

(possibly with a few more self-sends scattered around if you like using private
accessors)

What I do object to is the notion of a method whose /purpose/ is to read (or
write) the value of one of the object's instance variables.  /That/ is a
violation of encapsulation.   The caller should not care at all about how the
receiver represents its state, but only care about its documented behaviour.
If the method #height is part of the defined behaviour of the object, but just
happens to be implemented as a simple read of an instance variable, then that
is Fine By Me.  But in that case the method is only "accidentally" a getter
method, it has the /form/ of a getter method, but that's not how it is intended
to be thought of.

In my experience so far, most people who talk about getter/setter methods, or
accessor methods, /are/ thinking in terms of methods that change the values of
objects' instance variables.  And to me, that is a failure of OO thinking.

A few caveats:

One is that /private/ accessors aren't (IMO) a failure of OO.  I don't like
them myself (as a matter of readability and convenience) but I don't think that
they are actually /evil/.

Another is that sometimes you do need methods whose deliberate purpose is to
read/write an instance variable (and which would change if/when the IV
changed).  This may be necessary in reflective contexts if #instVarAt: (and its
friends) is not appropriate.  Such contexts are rare.

A last is that some people may have/use a different concept of what an
"accessor method" is.  They may mean something more general (or abstract),
like:
    They are quick; callers can assume that they run in fast,
        near-constant time.
    They come (usually) in pairs of a "getter" and a "setter".
    The "getter-style" member of the pair has no visible side-effects.
    The "setter-style" member of the pair is idempotent.
    The "getter" will always return the value set by the last "setter"
        unless something else changes.
    etc...
If that's how someone is thinking about the "accessor methods" then I wouldn't
complain so much[*] about OO.  I would suggest, though, that that's not how
most programmers use the term[**].

One last point is that some of the advice to use "accessor methods" comes from
Java/C++ programmers, and what they mean is "use accessor methods instead of
publicly accessible instance variables" (there's no such option in Smalltalk,
of course).  I agree that they are better than public instvars, but that still
doesn't make them /good/.

HTH.  Please don't read too much into it; it is, as I said, just my personal --
and rather extreme -- opinion.

    -- chris

(
[*] I might still complain a bit -- lots of simple methods that merely get/set
some aspect of an object's state may indicate that the programmer is expecting
work to be done in the caller that the object should be able to do itself.

[**] Although I admit that I put lots of methods into the Dolphin method
category 'accessing' -- and that's how I'm thinking of the word when I do it.
)