Trying to load ALienOpenGL into 4.1 alpha...


LawsonEnglish
hi all, was trying to load AlienOpenGL into 4.1 alpha and got the
warning:  This package depends on the following classes: AlienLibrary

Is this deprecated for 4.x or renamed or..?


Thanks, Lawson

Re: Trying to load ALienOpenGL into 4.1 alpha...

Josh Gargus
I don't think that Alien was ever in trunk.  IIRC the guy who wrote AlienOpenGL was a "Pharo guy"... I think that Pharo has Alien in by default (but I don't follow it closely).  To get it to run, you'll have to load the prerequisite Alien code first.  I haven't done this myself, so I have no particular advice for you.

BTW, why do you want to use AlienOpenGL, instead of "Croquet OpenGL"?  Just curious?  I wouldn't use it myself for performance reasons.  For one thing, Alien is slower than the old FFI (Eliot has ideas about how to JIT the marshalling code, but it's not on his short-term list of projects).  More importantly, AlienOpenGL uses Alien very inefficiently, or at least it did the last time I looked at it, I think in December.  Every time you call an OpenGL function, it looks up the function address from scratch instead of caching it somewhere.  I don't remember the precise results of the benchmarks that I did, but the overhead was terrible; it would be unusable for any moderately complex scene.

OTOH, if you're looking for something fun to hack, AlienOpenGL might be just the thing.  It wouldn't be hard to cache the functions so that they're looked up only once; this would have a tremendous impact on performance.

Cheers,
Josh
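
A minimal sketch of the caching idea Josh describes, in plain Squeak Smalltalk. It does not use the real Alien or AlienOpenGL API: slowLookup is a hypothetical stand-in for whatever resolves a function by name, and the counter exists only to show how often the slow path runs.

```smalltalk
"Hypothetical sketch: memoize an expensive by-name lookup in a Dictionary.
 slowLookup is an assumed placeholder, not an Alien method."
| cache lookups slowLookup cachedLookup |
cache := Dictionary new.
lookups := 0.

"Pretend this is the expensive part (resolving a function address)."
slowLookup := [:name | lookups := lookups + 1. name asSymbol].

"Consult the cache first; run the slow lookup only on a miss."
cachedLookup := [:name | cache at: name ifAbsentPut: [slowLookup value: name]].

1000 timesRepeat: [cachedLookup value: 'glGetError'].
lookups   "1: the slow path ran once, the other 999 calls hit the cache"
```

In a real binding the cache would more likely live in an instance or class variable of the library wrapper, so that it survives across calls; the Dictionary here only keeps the sketch self-contained.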




Re: Trying to load ALienOpenGL into 4.1 alpha...

LawsonEnglish
Josh Gargus wrote:
> I don't think that Alien was ever in trunk.  IIRC the guy who wrote AlienOpenGL was a "Pharo guy"... I think that Pharo has Alien in by default (but I don't follow it closely).  To get it to run, you'll have to load the prerequisite Alien code first.  I haven't done this myself, so I have no particular advice for you.
>
> BTW, why do you want to use AlienOpenGL, instead of "Croquet OpenGL"?  Just curious?  I wouldn't use it myself for performance reasons.  For one thing, Alien is slower than the old FFI (Eliot has ideas about how to JIT the marshalling code, but it's not on his short-term list of projects).  More importantly, AlienOpenGL uses Alien very inefficiently, or at least it did the last time I looked at it, I think in December.  Every time you call an OpenGL function, it looks up the function address from scratch instead of caching it somewhere.  I don't remember the precise results of the benchmarks that I did, but the overhead was terrible; it would be unusable for any moderately complex scene.
>
>  

Croquet OpenGL is dependent on all sorts of things. Have you managed to
get Croquet working in a modernish version of Squeak/Pharo?

Also, I was under the impression that Alien FFI was faster than the
standard FFI.



Re: Trying to load ALienOpenGL into 4.1 alpha...

Andreas.Raab
On 3/22/2010 7:27 PM, Lawson English wrote:
> Croquet OpenGL is dependent on all sorts of things. Have you managed to
> get Croquet working in a modernish version of Squeak/Pharo?

I can probably whip one up fairly easily. The actual dependencies are
rather minor - all you need to do is drop the positional argument
variants (we've thrown these out in our own images too) and load the FFI
first.

> Also, I was under the impression that Alien FFI was faster than the
> standard FFI.

Oh, dear. This is hearsay, right? I.e., neither you nor anyone who
claims it has ever run an actual benchmark, have you?

There is an interesting out-of-context quote in the Alien documentation
that brings as one of the arguments for Alien something that I said
about the FFI, namely that the "FFI is slow ..." but unfortunately it
doesn't quote the other half of that statement which is "... when
compared to the Squeak plugin interface". That is undoubtedly true in
the context of a discussion that compares the FFI and the Squeak plugin
interface since the FFI has marshalling overhead that is not incurred by
a regular plugin. That said, the FFI isn't slow per se - in particular
not when compared with doing marshalling inside Squeak (as Alien does).

Put on top that people seem to use Alien in the most naive (i.e., slow)
way, looking up the functions on each call, and I'd say the FFI will beat
Alien in *any* practical performance tests today (and for the
foreseeable future). Doesn't mean Alien can't be improved, but the next
time someone claims that "FFI is slow and Alien is fast" ask for the
benchmark they ran instead of taking the claim at face value :-)

The main reason for using Alien today is callbacks. There is still no
support for callbacks in the FFI so if you need callbacks Alien is your
choice. One of the things that I've got on my TODO list with Eliot is to
improve interoperability between Alien and the FFI. It should be
possible to pass Aliens straight into FFI calls at which point you could
have your cake and eat it, too.

Cheers,
   - Andreas

[ANN] Croquet OpenGL on Squeaksource (Re: Trying to load ALienOpenGL into 4.1 alpha...)

Andreas.Raab
In reply to this post by LawsonEnglish
On 3/22/2010 7:27 PM, Lawson English wrote:
> Croquet OpenGL is dependent on all sorts of things. Have you managed to
> get Croquet working in a modernish version of Squeak/Pharo?

(you will need an updated 4.1 trunk image - I've promoted
Form>>flipVertically to core in the process of making this package)

From http://www.squeaksource.com/CroquetGL:

The OpenGL interface from Croquet for consumption in other contexts.
Supports OpenGL 1.4 plus extensions.

To install, first load the FFI via:

(Installer repository: 'http://source.squeak.org/FFI')
        install: 'FFI-Pools';
        install: 'FFI-Kernel';
        install: 'FFI-Tests'.

then load CroquetGL:

(Installer repository: 'http://www.squeaksource.com/CroquetGL')
        install: '3DTransform';
        install: 'OpenGL-Pools';
        install: 'OpenGL-Core'.

When everything has loaded, try the example:

        OpenGL example.

Important Windows Note:

In order to use Croquet on Windows, you must make sure your VM is set to
support OpenGL instead of D3D by default. To do this, press F2 or go to
the system menu, into the "Display and Sound" section and ensure the
preference "Use OpenGL (instead of D3D)" is ENABLED.

Cheers,
   - Andreas

Alien vs. FFI benchmarks (Re: Trying to load ALienOpenGL into 4.1 alpha...)

Andreas.Raab
In reply to this post by Andreas.Raab
On 3/22/2010 9:21 PM, Andreas Raab wrote:
>> Also, I was under the impression that Alien FFI was faster than the
>> standard FFI.
>
> Oh, dear. This is hearsay, right? I.e., neither you nor anyone who
> claims it have ever ever run an actual benchmark, have you?

I thought it'd be fun to make a little benchmark so here we go. I'm
using the most trivial example, namely the call to glGetError() which
should be done early and often :-) Using the FFI (from CroquetGL) this
looks like this:

        ogl := OpenGL newIn: (0@0 corner: 10@10).
        time := [1 to: 1000000 do:[:i| ogl glGetError]] timeToRun.
        ogl destroy.

This takes about 435 msecs. Now the same with AlienOpenGL in Pharo:

        drawable := OpenGLSurface newIn: (0@0 corner: 10@10).
        ogl := AlienOpenGLLibrary uniqueInstance.
        time := [1 to: 1000000 do:[:i| ogl glGetError]] timeToRun.
        drawable close.

This takes 8372 msecs. Whoopsie. AlienOpenGL is roughly 20x slower for
the same C call. But okay, that may not be a fair comparison
due to the naive use of Alien. Let's be more clever, look up the method
just once and invoke it:

        drawable := OpenGLSurface newIn: (0@0 corner: 10@10).
        ogl := AlienOpenGLLibrary uniqueInstance.
        alienMethod := ogl alienMethodNamed: 'glGetError'.
        time := [1 to: 1000000 do:[:i|
                error := GLEnum new.
                alienMethod primFFICallResult: error.
        ]] timeToRun.
        drawable close.

This still takes 3589 msecs. Whoops, it did it again. Even with the more
elaborate use Alien is still 8x slower than the FFI. That's the cost of
doing marshalling in Squeak and the effect it has on Alien performance
when compared with the FFI.

Cheers,
   - Andreas
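
The ratios quoted in this message follow directly from the three timings. The workspace snippet below redoes the arithmetic with integers only, using the numbers reported above; the variable names are illustrative. The last figure assumes the two Alien runs do the same work apart from the per-call lookup, which the message does not state explicitly.

```smalltalk
| ffi alienNaive alienCached |
ffi := 435.           "msecs for 1,000,000 calls through the FFI"
alienNaive := 8372.   "msecs, AlienOpenGL looking the function up on every call"
alienCached := 3589.  "msecs, Alien with the method looked up once"

"Slowdown relative to the FFI, in tenths."
alienNaive * 10 // ffi.    "192, i.e. about 19.2x"
alienCached * 10 // ffi.   "82, i.e. about 8.2x"

"Share of the naive run that disappears once the lookup is hoisted out."
(alienNaive - alienCached) * 100 // alienNaive.   "57 percent"
```

Since each run makes one million calls, the totals in msecs divided by 1000 give the cost per call in microseconds: about 0.4 for the FFI, 8.4 for the naive Alien use and 3.6 with the lookup done once.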

Re: [ANN] Croquet OpenGL on Squeaksource (Re: Trying to load ALienOpenGL into 4.1 alpha...)

Bert Freudenberg
In reply to this post by Andreas.Raab

Neat :) Works like a charm on my old MacBook Pro, about 400 fps.

- Bert -



EventSensor wait2ms (Re: [ANN] Croquet OpenGL on Squeaksource)

Andreas.Raab
On 3/23/2010 4:46 AM, Bert Freudenberg wrote:
> Neat :) Works like a charm on my old MacBook Pro, about 400 fps.

Yeah, I was wondering why it's so much slower in trunk than in our
internal images or Croquet. First I thought it's the JIT but then I
looked at a profile and found EventSensor>>wait2ms called from several
places in EventSensor and indirectly via Sensor anyButtonPressed .

WTF? This isn't in any of the images I've ever used, does someone know
why we're doing this? Randomly waiting for 2ms and *no comment* as to
what the purpose of that wait might be? I'm in favor of ripping this
out; there should be absolutely no need to wait for 2ms in EventSensor.

Cheers,
   - Andreas

Re: EventSensor wait2ms (Re: [ANN] Croquet OpenGL on Squeaksource)

Igor Stasenko
On 23 March 2010 18:10, Andreas Raab <[hidden email]> wrote:

> On 3/23/2010 4:46 AM, Bert Freudenberg wrote:
>>
>> Neat :) Works like a charm on my old MacBook Pro, about 400 fps.
>
> Yeah, I was wondering why it's so much slower in trunk than in our internal
> images or Croquet. First I thought it's the JIT but then I looked at a
> profile and found EventSensor>>wait2ms called from several places in
> EventSensor and indirectly via Sensor anyButtonPressed .
>
> WTF? This isn't in any of the images I've ever used, does someone know why
> we're doing this? Randomly waiting for 2ms and *no comment* as to what the
> purpose of that wait might be? I'm in favor of ripping this out; there
> should be absolutely no need to wait for 2ms in EventSensor.
>

All senders of wait2ms seem to have the same author and fall within a very small time frame:
JMM 11/7/2005 14:39

So I suppose we could ask the author to comment on this.




--
Best regards,
Igor Stasenko AKA sig.

Re: Trying to load ALienOpenGL into 4.1 alpha...

Eliot Miranda-2
In reply to this post by Andreas.Raab


On Mon, Mar 22, 2010 at 9:21 PM, Andreas Raab <[hidden email]> wrote:
> On 3/22/2010 7:27 PM, Lawson English wrote:
>> Croquet OpenGL is dependent on all sorts of things. Have you managed to
>> get Croquet working in a modernish version of Squeak/Pharo?
>
> I can probably whip one up fairly easily. The actual dependencies are rather minor - all you need to do is drop the positional argument variants (we've thrown these out in our own images too) and load the FFI first.
>
>> Also, I was under the impression that Alien FFI was faster than the
>> standard FFI.
>
> Oh, dear. This is hearsay, right? I.e., neither you nor anyone who claims it have ever ever run an actual benchmark, have you?
>
> There is interesting out-of-context quote in the Alien documentation that brings as one of the arguments for Alien something that I said about the FFI, namely that the "FFI is slow ..." but unfortunately it doesn't quote the other half of that statement which is "... when compared to the Squeak plugin interface". That is undoubtedly true in the context of a discussion that compares the FFI and the Squeak plugin interface since the FFI has marshalling overhead that is not incurred by a regular plugin. That said, the FFI isn't slow per se - in particular not when compared with doing marshalling inside Squeak (as Alien does).
>
> Put on top that people seem to use Alien in the most naive (e.g., slow) way looking up the functions on each call, and I'd say the FFI will beat Alien in *any* practical performance tests today (and for the foreseeable future). Doesn't mean Alien can't be improved, but the next time someone claims that "FFI is slow and Alien is fast" ask for the benchmark they ran instead of taking the claim at face value :-)
>
> The main reason for using Alien today is callbacks. There is still no support for callbacks in the FFI so if you need callbacks Alien is your choice. One of the things that I've got on my TODO list with Eliot is to improve interoperability between Alien and the FFI. It should be possible to pass Aliens straight into FFI calls at which point you could have your cake and eat it, too.

Right.  I won't stand by the "slow" comment anymore.  I've done a much faster version of FFI here that has essentially the same performance as Alien.  The important thing is to use alloca to allocate the outgoing stack frame and marshal to that. The old FFI code marshalled to static memory and then copied to the stack frame.  This makes the old implementation inherently slower /and/ non-reentrant.  Now that this is solved, FFI is essentially as fast as Alien.

The advantage FFI has over Alien call-outs (as opposed to Alien data representation) is more typing, and so a better chance of dealing correctly with RISC calling conventions.  So the clear path is to merge the data management side of Alien into FFI and extend FFI with true callbacks, a la Alien.  We then have the best of both worlds.  That's something I want to get done this year, e.g. as part of the GSoC.






Re: EventSensor wait2ms (Re: [ANN] Croquet OpenGL on Squeaksource)

johnmci
In reply to this post by Igor Stasenko
When the *high priority* task looking at the mouse location in the Browser saw the cursor move between the browser windows, it would change the cursor and then invoke a tight loop to see when the cursor re-entered a browser pane. This would drive 100,000 peekPosition calls, starving the regular morphic loop.  The result was that, at the time, the cursor would *stick* between browser panes if your timing was *just* right.

It would appear that all that code was refactored in the last 5 years. Still, if you feel that looking at the mouse position 100,000 times a second is worthwhile, you could rip it all out and hope there aren't side-effects.  Say someone launches a process, then peeks for mouse up. Wonder who will get the CPU? The task doing the work? Or the hyper-polling spin doing 100,000 peeks a second waiting for the mouse to go up?


"Change Set: EventSensorDelayOnHyperPolling
Date: 7 November 2005
Author: [hidden email]

Attempt to ensure polling for event data does not drive cpu to 100%. Wait 2ms between looks at mouse position or for keyboard events. Usually these don't happen 100's of times per second anyways"!




--
===========================================================================
John M. McIntosh <[hidden email]>   Twitter:  squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================
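
The effect of a 2 ms wait between polls can be illustrated without touching EventSensor. The sketch below is not the change set's code: it polls for a fixed 100 ms window and sleeps 2 ms between checks using the standard Delay class, so the loop body runs a few dozen times rather than spinning flat out. The 100 ms window and the variable names are assumptions made for illustration.

```smalltalk
"Hypothetical sketch: a polling loop that yields the CPU between checks."
| polls deadline |
polls := 0.
deadline := Time millisecondClockValue + 100.

[Time millisecondClockValue < deadline] whileTrue: [
    polls := polls + 1.                 "stand-in for peeking at the mouse"
    (Delay forMilliseconds: 2) wait].   "let other processes run"

polls   "a few dozen at most, instead of many thousands"
```

While the loop is blocked in the Delay, lower- and equal-priority processes get a chance to run, which is the starvation problem described above. The cost is that any caller polling in a tight loop for benchmark purposes is throttled as well.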







Re: EventSensor wait2ms (Re: [ANN] Croquet OpenGL on Squeaksource)

Igor Stasenko
On 23 March 2010 19:50, John M McIntosh <[hidden email]> wrote:
> When the *high priority* task looking at the mouse location in the Browser see the cursor move between the browser windows it would change the cursor. Then invoke a tight loop to see when the cursor reentered a browser pane. This would drive 100,000 peekPosition  starving the regular morphic loop.  The result was that at the time the cursor would *stick* between browser panes if your timing was *just* right.
>
> It would appear that all that code was refactored in the last 5 years. Still if you feel that looking at the mouse position 100,000 a second is worthwhile?
> you could rip it all out and hope there aren't side-effects.  Like someone launch a process, then peek for mouse up. Wonder who will get the CPU? The task doing the work? Or the hyper processing spin 100,000 peeks a second waiting for the mouse to go up?

My 2c.
A tight polling loop is obviously a mistake, an abuse which should
be fixed in the places where it is used, instead of by patching the
implementor.
It is really ridiculous to patch the implementation to suit
the needs of bad design practices. :)
Sure, it is sometimes easier than fixing the issue which is causing it :)


--
Best regards,
Igor Stasenko AKA sig.

Re: Trying to load ALienOpenGL into 4.1 alpha...

Nicolas Cellier
In reply to this post by Eliot Miranda-2
2010/3/23 Eliot Miranda <[hidden email]>:

> Right.  I won't stand by the "slow" comment anymore.  I've done a much
> faster version of FFI here that has essentially the same performance as
> Alien.  The important thing is to use alloca to allocate the outgoing stack
> frame and marshall to that. The old FFI code marshalled to static memory and
> then copied to the stack frame.  This makes the old implementation
> inherrently slower /and/ non-reentrant.  Now this is solved FFI is
> essentially as fast as Alien.

Naive question: is there any potential stack overflow problem with alloca ?

Nicolas


Re: Trying to load ALienOpenGL into 4.1 alpha...

Eliot Miranda-2


On Tue, Mar 23, 2010 at 11:41 AM, Nicolas Cellier <[hidden email]> wrote:
> Naive question: is there any potential stack overflow problem with alloca?

Not over and above what already exists in an FFI.  One can arrange that one alloca's close to what is needed for a given call, not simply some maximum on every call.  So when done like this alloca takes only a small percentage more than one would allocate for a normal call, and certainly less than a factor of two for a call with a large call frame.

In my reimplementation of the FFI, the first time an FFI method is run the alloca is 16 KB plus the size of the struct return, with the call failing if this isn't enough space.  The code then calculates how much of the 16 KB was actually used and caches this in the FFI method's ExternalFunction object, so next time it alloca's only what's required.  Actual overhead is greater than the alloca because one has to allow for the plugin function's invocation and its local variables.  Even if the total overhead for a call were to reach 32 KB, one would have to have a deeply nested series of call-outs and call-backs before one was in danger of overflowing a typical 1 MB per-thread stack.

HTH
Eliot
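
A rough sketch of the sizing policy Eliot describes, written as a Smalltalk workspace snippet. The real logic lives in the VM's FFI plugin, not in image-side code like this; the Dictionary, the block, the function name and the 64-byte figure are all hypothetical, and the struct-return size is left out for brevity.

```smalltalk
"Hypothetical model of 'generous on the first call, exact afterwards'."
| firstCallBytes usedBytes sizeFor first later depth |
firstCallBytes := 16 * 1024.
usedBytes := Dictionary new.   "function name -> bytes actually used"

"Bytes to reserve for a call: the recorded usage if known, else 16 KB."
sizeFor := [:name | usedBytes at: name ifAbsent: [firstCallBytes]].

first := sizeFor value: 'someFunction'.   "16384 on the first call"
usedBytes at: 'someFunction' put: 64.     "pretend the first call used 64 bytes"
later := sizeFor value: 'someFunction'.   "64 on later calls"

"Nesting depth needed to exhaust a 1 MB stack at 32 KB per call."
depth := (1024 * 1024) // (32 * 1024).    "32"
```

The last line puts a number on the closing remark: even at the pessimistic 32 KB per call, it takes on the order of 32 nested call-outs and call-backs to use up a 1 MB stack.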







Re: EventSensor wait2ms (Re: [ANN] Croquet OpenGL on Squeaksource)

johnmci
In reply to this post by Igor Stasenko

On 2010-03-23, at 11:21 AM, Igor Stasenko wrote:

> Sure, it is easier sometimes, than fixing the issue which causing it :)

Ya, and the "proper" fix was to change the browser code to use a UI event model rather than a UI polling model.
But that seems to have taken a few years to happen...

--
===========================================================================
John M. McIntosh <[hidden email]>   Twitter:  squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================
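
The difference between the two models can be sketched with a Semaphore: instead of repeatedly asking whether something has happened, the waiting process blocks until it is told. This is a generic illustration using standard Squeak classes, not the browser or Morphic code under discussion; the #mouseUp symbol is only a placeholder for an event.

```smalltalk
"Hypothetical sketch: wait for an event by blocking, not by polling."
| ready result |
ready := Semaphore new.
result := nil.

"Producer: records the event, then wakes the waiter."
[result := #mouseUp. ready signal] fork.

"Consumer: uses no CPU while blocked here."
ready wait.
result   "#mouseUp"
```

While the consumer is blocked in wait, the scheduler is free to run the process doing the actual work, which is the situation the polling loop made difficult.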








Re: EventSensor wait2ms (Re: [ANN] Croquet OpenGL on Squeaksource)

Andreas.Raab
In reply to this post by johnmci
On 3/23/2010 10:50 AM, John M McIntosh wrote:
> When the *high priority* task looking at the mouse location in the Browser see the cursor move between the browser windows it would change the cursor. Then invoke a tight loop to see when the cursor reentered a browser pane. This would drive 100,000 peekPosition  starving the regular morphic loop.  The result was that at the time the cursor would *stick* between browser panes if your timing was *just* right.

Which high priority task are you referring to? I think we might want to
fix that, it seems completely pointless to run a loop like that.

> It would appear that all that code was refactored in the last 5 years. Still if you feel that looking at the mouse position 100,000 a second is worthwhile?

If I am trying to run a benchmark? Yes, absolutely.

> you could rip it all out and hope there aren't side-effects.  Like someone launch a process, then peek for mouse up. Wonder who will get the CPU? The task doing the work? Or the hyper processing spin 100,000 peeks a second waiting for the mouse to go up?

Good question. The usage of Sensor in Morphic is completely abusive.
Sensor isn't a Morphic entity; only the hand should be used to query
such information. I'll check it out.

Cheers,
   - Andreas

>
>
> "Change Set: EventSensorDelayOnHyperPolling
> Date: 7 November 2005
> Author: [hidden email]
>
> Attempt to ensure polling for event data does not drive cpu to 100%. Wait 2ms between looks at mouse position or for keyboard events. Usually these don't happen 100's of times per second anyways"!
>
>
>
>
> On 2010-03-23, at 9:16 AM, Igor Stasenko wrote:
>
>> On 23 March 2010 18:10, Andreas Raab<[hidden email]>  wrote:
>>> On 3/23/2010 4:46 AM, Bert Freudenberg wrote:
>>>>
>>>> Neat :) Works like a charm on my old MacBook Pro, about 400 fps.
>>>
>>> Yeah, I was wondering why it's so much slower in trunk than in our internal
>>> images or Croquet. First I thought it's the JIT but then I looked at a
>>> profile and found EventSensor>>wait2ms called from several places in
>>> EventSensor and indirectly via Sensor anyButtonPressed .
>>>
>>> WTF? This isn't in any of the images I've ever used, does someone know why
>>> we're doing this? Randomly waiting for 2ms and *no comment* as to what the
>>> purpose of that wait might be? I'm in favor of ripping this out; there
>>> should be absolutely no need to wait for 2ms in EventSensor.
>>>
>>
>> All senders of wait2ms seems having same author , and very small time frame:
>> JMM 11/7/2005 14:39
>>
>> So, i suppose we could ask the author to comment this.
>>
>>> Cheers,
>>>   - Andreas
>>>
>>>
>>
>>
>>
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>>
>
> --
> ===========================================================================
> John M. McIntosh<[hidden email]>    Twitter:  squeaker68882
> Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
> ===========================================================================
>



Re: EventSensor wait2ms (Re: [ANN] Croquet OpenGL on Squeaksource)

johnmci

On 2010-03-23, at 2:30 PM, Andreas Raab wrote:

> On 3/23/2010 10:50 AM, John M McIntosh wrote:
>> When the *high priority* task looking at the mouse location in the Browser see the cursor move between the browser windows it would change the cursor. Then invoke a tight loop to see when the cursor reentered a browser pane. This would drive 100,000 peekPosition  starving the regular morphic loop.  The result was that at the time the cursor would *stick* between browser panes if your timing was *just* right.
>
> Which high priority task are you referring to? I think we might want to fix that, it seems completely pointless to run a loop like that.

In checking with a 3.10.2 image this morning, I found that logic has been rewritten.  I think you'll need to review where wait2ms is used, and work backward
looking at the callers.

I'll note wait2ms doesn't exist in Pharo since it was eradicated with the EventSensor replacement.
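
For anyone following along, working backward from the callers can be started from a workspace. This is only a sketch: the second expression assumes wait2ms amounts to a plain 2 ms Delay, which is a guess from the change set description ("Wait 2ms between looks at mouse position") and not the actual method body.

```smalltalk
"Open a browser on every method that sends #wait2ms."
SystemNavigation default browseAllCallsOn: #wait2ms.

"Rough cost of 100 polls if each one waits 2 ms.
 ASSUMPTION: the wait is equivalent to (Delay forMilliseconds: 2) wait."
[100 timesRepeat: [(Delay forMilliseconds: 2) wait]] timeToRun.
```

If that assumption holds, the second expression should come back at around 200 ms or more, which would cap any loop that polls the Sensor at roughly 500 iterations per second.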


===========================================================================
John M. McIntosh <[hidden email]>   Twitter:  squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================

Re: Alien vs. FFI benchmarks (Re: Trying to load ALienOpenGL into 4.1 alpha...)

LawsonEnglish
In reply to this post by Andreas.Raab
Andreas Raab wrote:

> [...]
> This still takes 3589 msecs. Whoops, it did it again. Even with the
> more elaborate use Alien is still 8x slower than the FFI. That's the
> cost of doing marshalling in Squeak and the effect it has on Alien
> performance when compared with the FFI.
>
> Cheers,
>   - Andreas
>
>
Thanks so much for all of this, Andreas. BTW, what is the overhead
of a named primitive vs. an unnamed primitive vs. an FFI call?

My tests (perhaps not relevant):

[1 to: 1000000 do: [:i | i + 2]] timeToRun.      "18 ms"

[1 to: 1000000 do: [:i | i + 2.0]] timeToRun.    "106 ms"

[1 to: 1000000 do: [:i | 2.0 + 2.0]] timeToRun.  "121 ms"

[1 to: 1000000 do: [:i | FFITestLibrary ffiTestFloats: i with: 2.0]] timeToRun.    "2742 ms"

[1 to: 1000000 do: [:i | FFITestLibrary ffiTestFloats: 2.0 with: 2.0]] timeToRun.  "2556 ms"



where would a


NPTestLibrary npTestFloats: 2.0 with: 2.0


rank?

Would it ever be worth it to create an OpenGL plugin wrapper rather than
use the FFI calls?



Lawson


Re: Alien vs. FFI benchmarks (Re: Trying to load ALienOpenGL into 4.1 alpha...)

Josh Gargus

On Mar 24, 2010, at 8:43 PM, Lawson English wrote:
>
> Would it ever be worth it to create a OpenGL plugin wrapper rather than use the FFI calls?


Your application would have to be *very* efficient (much more efficient than, say, Croquet's scene-graph):

[1000000 timesRepeat: [gl glGetError]] timeToRun  627
[1000000 timesRepeat: [Object new; new; new; new]] timeToRun  747
[1000000 timesRepeat: [OrderedCollection new add: 1]] timeToRun  1078

We're talking less than a microsecond per FFI call.  Your whole application would have to be *heavily* optimized before this became the #1 bottleneck.
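
The per-call figure follows directly from the timings above: timeToRun answers milliseconds, and each loop ran 1,000,000 times. A small workspace sketch of that arithmetic (the variable names are just for illustration):

```smalltalk
| iterations timings |
iterations := 1000000.
"Milliseconds for one million iterations, as measured above."
timings := {
    'gl glGetError' -> 627.
    'Object new; new; new; new' -> 747.
    'OrderedCollection new add: 1' -> 1078 }.
timings do: [:each |
    Transcript
        show: each key;
        show: ': ';
        show: (each value * 1000 / iterations) asFloat printString;
        show: ' microseconds per iteration';
        cr]
```

That comes out to roughly 0.63, 0.75 and 1.08 microseconds per iteration. Those numbers include the cost of the timesRepeat: loop itself, so the call alone costs a little less.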

Cheers,
Josh



>
>
>
> Lawson
>



Re: Alien vs. FFI benchmarks (Re: Trying to load ALienOpenGL into 4.1 alpha...)

Igor Stasenko
In reply to this post by Andreas.Raab
On 23 March 2010 09:09, Andreas Raab <[hidden email]> wrote:

> On 3/22/2010 9:21 PM, Andreas Raab wrote:
>>>
>>> Also, I was under the impression that Alien FFI was faster than the
>>> standard FFI.
>>
>> Oh, dear. This is hearsay, right? I.e., neither you nor anyone who
>> claims it have ever ever run an actual benchmark, have you?
>
> I thought it'd be fun to make a little benchmark so here we go. I'm using
> the most trivial example, namely the call to glGetError() which should be
> done early and often :-) Using the FFI (from CroquetGL) this looks like
> here:
>
>        ogl := OpenGL newIn: (0@0 corner: 10@10).
>        time := [1 to: 1000000 do:[:i| ogl glGetError]] timeToRun.
>        ogl destroy.
>
> This takes about 435 msecs. Now the same with AlienOpenGL in Pharo:
>
>        drawable := OpenGLSurface newIn: (0@0 corner: 10@10).
>        ogl := AlienOpenGLLibrary uniqueInstance.
>        time := [1 to: 1000000 do:[:i| ogl glGetError]] timeToRun.
>        drawable close.
>
> This takes 8372 msecs. Whoopsie. That's a factor of 20x AlienOpenGL is
> slower for the same C call. But okay, that may not be a fair comparison due
> to the naive use of Alien. Let's be more clever, look up the method just
> once and invoke it:
>
>        drawable := OpenGLSurface newIn: (0@0 corner: 10@10).
>        ogl := AlienOpenGLLibrary uniqueInstance.
>        alienMethod := ogl alienMethodNamed: 'glGetError'.
>        time := [1 to: 1000000 do:[:i|
>                error := GLEnum new.
>                alienMethod primFFICallResult: error.
>        ]] timeToRun.
>        drawable close.
>
I don't know how to load this code. Could you move
error := GLEnum new. out of the block and run it again?

> This still takes 3589 msecs. Whoops, it did it again. Even with the more
> elaborate use Alien is still 8x slower than the FFI. That's the cost of
> doing marshalling in Squeak and the effect it has on Alien performance when
> compared with the FFI.
>
It is to be expected that marshalling/converting types takes most of the time,
while the rest should be the same, because it is straightforward: push the
arguments, make the call and return the result.
So, in some cases (not all), one could prepare the arguments and the return
value holder once and then reuse them across calls,
avoiding allocating and converting them each time.
For instance, when passing a string (char*),
one may use a CString object (a null-terminated String) and pass it
to Alien without conversion,
while the FFI allocates a null-terminated string on the heap, copies the String
contents into that buffer and then uses it as the argument to the
function.
So it's hard to say which way is better.
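
The variant asked for above, with the result holder allocated once outside the timed block, might look like the sketch below. It reuses the names from Andreas's snippet and has not been run; it also assumes that reusing a single GLEnum instance across calls is safe, which nothing in this thread confirms.

```smalltalk
| drawable ogl alienMethod error time |
drawable := OpenGLSurface newIn: (0@0 corner: 10@10).
ogl := AlienOpenGLLibrary uniqueInstance.
alienMethod := ogl alienMethodNamed: 'glGetError'.
error := GLEnum new.    "allocated once, reused for every call (assumption: safe to reuse)"
time := [1 to: 1000000 do: [:i |
    alienMethod primFFICallResult: error]] timeToRun.
drawable close.
time
```

Comparing this time against the 3589 msecs reported above would show how much of the remaining gap is allocation rather than marshalling.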

> Cheers,
>  - Andreas

--
Best regards,
Igor Stasenko AKA sig.
