accessors vs direct access in Cog


accessors vs direct access in Cog

Stéphane Ducasse
Hi guys

Ideally I would love to be able to use accessors as the abstraction layer they can give us:
avoiding offset-based bytecodes would mean that we could reuse methods a lot more
(in special cases: mixins and others).

Now I have a question: does the JIT or some shortcut (not sure whether this is in the
StackVM) blur the cost of accessors vs. direct accesses?

Has anybody recently run a benchmark of
        self x vs x in Cog
on a real app?

Stef

Re: accessors vs direct access in Cog

Eliot Miranda-2


On Wed, Dec 22, 2010 at 12:12 AM, Stéphane Ducasse <[hidden email]> wrote:

> Hi guys
>
> Ideally I would love to be able to use accessors as the abstraction layer they can give us:
> avoiding offset-based bytecodes would mean that we could reuse methods a lot more
> (in special cases: mixins and others).
>
> Now I have a question: does the JIT or some shortcut (not sure whether this is in the
> StackVM) blur the cost of accessors vs. direct accesses?

Not yet.  But IMO it would be straightforward to add.  The PIC design supports inserting extra code at a send site, which is the logical place to put the access code.
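
To make the gap concrete, one can compare the compiled code of the two forms. A minimal sketch, assuming a Squeak/Pharo image (the selectors xDirect and xViaSend are made up for the example):

Point compile: 'xDirect ^ x'.
Point compile: 'xViaSend ^ self x'.
(Point >> #xDirect) symbolic.   "roughly: pushRcvr: 0; returnTop (a plain slot read, no send)"
(Point >> #xViaSend) symbolic.  "contains a send of #x (a full dispatch the PIC has to resolve)"

The scheme sketched above would let a send site of #x, once the PIC has seen the receiver's class, execute the slot read inline instead of calling the accessor method.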
 


best
Eliot

Re: accessors vs direct access in Cog

Levente Uzonyi-2
In reply to this post by Stéphane Ducasse
On Wed, 22 Dec 2010, Stéphane Ducasse wrote:

> Hi guys
>
> Ideally I would love to be able to use accessors as the abstraction layer they can give us:
> avoiding offset-based bytecodes would mean that we could reuse methods a lot more
> (in special cases: mixins and others).

It's simply a bad idea. If you don't want instance variables, just change
the VM's object representation, but then don't call your system Smalltalk
anymore. ;)

Btw, without instance variables you don't need mixins, because you have traits.

If you only want mixins (instead of stateful traits), then there's at
least one mixin implementation for Squeak out there.

>
> Now I have a question: does the JIT or some shortcut (not sure whether this is in the
> StackVM) blur the cost of accessors vs. direct accesses?

Bytecodes are still 10-12x faster with Cog than sends.


Levente

P.S.: IIRC one of V8's optimizations is to use a common representation (a "hidden class")
for objects that have the same slots (instance variables).



Re: accessors vs direct access in Cog

Igor Stasenko
2010/12/22 Levente Uzonyi <[hidden email]>:

> On Wed, 22 Dec 2010, Stéphane Ducasse wrote:
>
>> Ideally I would love to be able to use accessors as the abstraction layer
>> they can give us: avoiding offset-based bytecodes would mean that we could
>> reuse methods a lot more (in special cases: mixins and others).
>
> It's simply a bad idea. If you don't want instance variables, just change
> the VM's object representation, but then don't call your system Smalltalk
> anymore. ;)

Why? For me Smalltalk is the syntax and "everything is an object"; the
rest is optional.

>> Now I have a question: does the JIT or some shortcut (not sure whether this
>> is in the StackVM) blur the cost of accessors vs. direct accesses?
>
> Bytecodes are still 10-12x faster with Cog than sends.
>
Even those which are optimized by the JIT?
I mean, compare:

| pt |
pt := 1@2.
[ pt x ] bench
 '2.789668866226755e6 per second.'

| pt |
pt := 1@2.
[ pt xx ] bench
 '2.642108378324335e6 per second.'

where Point>>xx is:

xx
        ^ self x

So what do you mean by 10-12 times faster?


--
Best regards,
Igor Stasenko AKA sig.


Re: accessors vs direct access in Cog

Stéphane Ducasse
In reply to this post by Eliot Miranda-2
>> Now I have a question: does the JIT or some shortcut (not sure whether this is in the
>> StackVM) blur the cost of accessors vs. direct accesses?
>
> Not yet.  But IMO it would be straightforward to add.  The PIC design supports inserting extra code at a send site, which is the logical place to put the access code.

I would love that, because it would be a really cool step toward late binding of structure as well as behavior (of course at the cost of weakening encapsulation, but still). I like the design of Newspeak from that perspective.




Re: accessors vs direct access in Cog

Stéphane Ducasse
In reply to this post by Levente Uzonyi-2
>> Ideally I would love to be able to use accessors as the abstraction layer they can give us:
>> avoiding offset-based bytecodes would mean that we could reuse methods a lot more
>> (in special cases: mixins and others).
>
> It's simply a bad idea.

Be scientific and bring real arguments to the table; otherwise this is not fun, and I can just not reply to your email or answer with other preconceived statements.

> If you don't want instance variables, just change the VM's object representation, but then don't call your system Smalltalk anymore. ;)

This has nothing to do with that. You can have instance variables and still have an execution
model that internally uses message sends; this way you can maximize reuse of methods across
objects with different layouts. This is what a good mixin implementation does: it copies down
accessors on each mixin application.
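
A minimal sketch of that copy-down idea (class names and the 'MixinDemo' category are hypothetical; Squeak-style class creation assumed):

Object subclass: #Widget
        instanceVariableNames: 'x'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'MixinDemo'.
Object subclass: #PaddedWidget
        instanceVariableNames: 'padding x'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'MixinDemo'.
"copy-down: compile the same accessor source into each class, so each copy
binds x at that class's own offset (first slot in Widget, second in PaddedWidget)"
{ Smalltalk at: #Widget. Smalltalk at: #PaddedWidget } do: [ :cls |
        cls compile: 'x ^ x'.
        cls compile: 'x: aNumber x := aNumber'.
        cls compile: 'moveRight self x: self x + 1' ].

The source of moveRight is identical in both classes and never mentions an offset; only the copied-down accessors differ per class.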

> Btw, without instance variables you don't need mixins, because you have traits.

Unifying behavior and state is really a way to late-bind decisions and increase reuse.

>> Now I have a question: does the JIT or some shortcut (not sure whether this is in the
>> StackVM) blur the cost of accessors vs. direct accesses?
>
> Bytecodes are still 10-12x faster with Cog than sends.

You mean bytecodes representing instance variable accesses are faster than sends?

> P.S.: IIRC one of V8's optimizations is to use a common representation (a "hidden class") for objects that have the same slots (instance variables).

And? Using accessors vs. direct field access has nothing to do with prototypes, so I do not
understand what you are implying.


Stef

Re: accessors vs direct access in Cog

Steve Wart-2
Is this the old chestnut about direct access versus accessors from a
software engineering standpoint or is it a purely technical
discussion?

If it's the first, I kind of agree with Igor that *always* using
selectors isn't great in Smalltalk, because accessors vs. ivars is the
only mechanism we have to decide whether something is private or
public.

As soon as you make an API public, then you get all kinds of
assumptions to fight with the next time you want to change it.

If it's the second discussion, please accept my apologies for the
interruption :-)

Have a good Christmas,
Steve



Re: accessors vs direct access in Cog

Levente Uzonyi-2
In reply to this post by Igor Stasenko
On Wed, 22 Dec 2010, Igor Stasenko wrote:

> 2010/12/22 Levente Uzonyi <[hidden email]>:
>> It's simply a bad idea. If you don't want instance variables, just change
>> the VM's object representation, but then don't call your system Smalltalk
>> anymore. ;)
>
> Why? For me Smalltalk is the syntax and "everything is an object"; the
> rest is optional.

Aren't instance variables part of the syntax? Or is Self Smalltalk?

>> Bytecodes are still 10-12x faster with Cog than sends.
>
> Even those which are optimized by the JIT?
> I mean, compare:
>
> [ pt x ] bench   '2.789668866226755e6 per second.'
> [ pt xx ] bench  '2.642108378324335e6 per second.'
>
> where Point>>xx is: xx ^ self x
>
> So what do you mean by 10-12 times faster?
>
Your benchmark has several flaws. It uses #bench, which is a message send by
itself and does several other sends, block activations, and so on. Just
evaluate

[] bench.

to see the problem.

Here is the benchmark I based my idea of the 10-12x performance difference on:

0 tinyBenchmarks.
'540940306 bytecodes/sec; 50274171 sends/sec'

It shows a 10.76x difference. You may say that it's inaccurate, so I wrote
another one myself: http://leves.web.elte.hu/squeak/SendBenchmark.st

To run it, evaluate the following:
SendBenchmark run.

My result is:
#(#(109 16) #(105 17) #(105 18) #(108 18) #(106 19)).
To get the difference (may not work in Pharo; #sum adds the pairs element-wise):
#(#(109 16) #(105 17) #(105 18) #(108 18) #(106 19)) sum in: [ :sum |
  sum first / sum second roundTo: 0.01 ].
6.06

So it's 6x faster to use instance variables than accessors.
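
For anyone who wants to reproduce the comparison without SendBenchmark.st, here is a rough sketch along the same lines (the loop count and the selectors sumIvar/sumSend are made up; timeToRun answers milliseconds):

| p |
Point compile: 'sumIvar | s | s := 0. 1 to: 10000000 do: [ :i | s := s + x ]. ^ s'.
Point compile: 'sumSend | s | s := 0. 1 to: 10000000 do: [ :i | s := s + self x ]. ^ s'.
p := 3@4.
{ [ p sumIvar ] timeToRun. [ p sumSend ] timeToRun }

Both loops pay the same #to:do: and #+ overhead; the difference per iteration is a direct ivar read vs. a full send plus the ivar read.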


Levente


Re: accessors vs direct access in Cog

Stéphane Ducasse
>> Why? For me Smalltalk is the syntax and "everything is an object"; the
>> rest is optional.
>
> Aren't instance variables part of the syntax? Or is Self Smalltalk?

?
What if you use the same syntax, and behind the scenes the system makes sure
that you get an optimized message send?
From a method-reuse point of view we would not have offset-based bytecodes anymore.
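
A sketch of that idea (hypothetical compiler behavior, not anything Cog does today):

"source as written by the programmer:"
scaleBy: aNumber
        x := x * aNumber

"what the compiler could emit behind the scenes (every ivar reference
becomes a send, so the bytecode carries selectors instead of slot offsets):"
scaleBy: aNumber
        self x: self x * aNumber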




Re: accessors vs direct access in Cog

Igor Stasenko
In reply to this post by Levente Uzonyi-2
2010/12/22 Levente Uzonyi <[hidden email]>:

> Your benchmark has several flaws. [...] Here is the benchmark I based my
> idea of the 10-12x performance difference on:
>
> 0 tinyBenchmarks.
> '540940306 bytecodes/sec; 50274171 sends/sec'
>
> [...]
>
> So it's 6x faster to use instance variables than accessors.
>
So it is 6x. And taking into account that Cog's code generation is "naive",
this number can be improved even more.

But the reality is that an application will not necessarily run 6x slower,
because accessing ivars is only a percentage of all operation types; that's
why I compared two dirty loops instead of pure ones, like you did. But of
course it depends on the application.



--
Best regards,
Igor Stasenko AKA sig.


Re: accessors vs direct access in Cog

Stéphane Ducasse
In reply to this post by Steve Wart-2


> Is this the old chestnut about direct access versus accessors from a
> software engineering standpoint or is it a purely technical
> discussion?

For me it's more a technical discussion, to see how we could reuse more methods
and have more late-bound access to object structure.

Now the story about encapsulation in Smalltalk is mainly a convention (I do not use message [...])
>
> If it's the first, I kind of agree with Igor that *always* using
> selectors isn't great in Smalltalk, because accessors vs. ivars is the
> only mechanism we have to decide whether something is private or
> public.
>
> As soon as you make an API public, then you get all kinds of
> assumptions to fight with the next time you want to change it.

Yes. This is why I have a love/hate affair with the unification of state and methods.
Ideally I would love to have:
        late-bound state (the unification)
        + protected/public methods (but I could not find a nice model yet....)
You can read "encapsulation in dynamic languages" as a reflection on the topic.

> If it's the second discussion, please accept my apologies for the
> interruption :-)

You are welcome.




Re: accessors vs direct access in Cog

Levente Uzonyi-2
In reply to this post by Stéphane Ducasse
On Wed, 22 Dec 2010, Stéphane Ducasse wrote:

>>> Why? For me Smalltalk is the syntax and "everything is an object"; the
>>> rest is optional.
>>
>> Aren't instance variables part of the syntax? Or is Self Smalltalk?
>
> ?
> What if you use the same syntax, and behind the scenes the system makes sure
> that you get an optimized message send?

That's possible with VM hacking, but I doubt you'd get the same
performance.

> From a method-reuse point of view we would not have offset-based bytecodes anymore.

Some primitives also rely on instance variable indexes.
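
For example, Object>>instVarAt: reaches a slot purely by its numeric offset, bypassing any accessor. A sketch from memory (the primitive number may differ between VM versions):

instVarAt: index
        "Primitive. Answer a fixed variable of the receiver by its index."
        <primitive: 73>
        ^ self primitiveFailed

(3@4) instVarAt: 1  "answers 3: the x slot is found by offset, not by a send"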


Levente
