BitBlt speed WinXP

Herbert König
Hello,

Using BitBlt for speed, I draw some 15,000 white rectangles on a Morph.
Each is 4x4 pixels.

On my slow computer I get 480ms for the following method:

testDrawWhiteRects
        "just try if the white rectangles are drawn correctly"

        | bb |
        bb := BitBlt new.
        bb setDestForm: Display.
        bb width: dotSize ; height: dotSize .
        bb fillColor: Color white; combinationRule: Form paint.
        rectangleCoords do: [:koords| bb destX: koords x; destY: koords y; copyBits.]

If I comment out the actual transfer (the copyBits) the result is only
25ms.

I use a Display depth of 32. It makes no difference whether I switch
between big and little endian, and my Windows desktop also uses a
32-bit display depth.

Tried with 3.7 and 3.9 VM. Also nearly no difference between OpenGL
and Direct3D.

On a fast computer (1.86GHz Pentium M) results aren't much better (10
and 270ms).

Is this to be expected, or am I doing something wrong? We have
Balloon3D, which promises real-time movement, and I guess it ultimately
relies on BitBlt. So I expected much better performance.
 
Thanks for any pointers,

Herbert


Re: BitBlt speed WinXP

Bert Freudenberg

On Feb 18, 2007, at 15:51 , Herbert König wrote:

> Hello,
>
> using BitBlt for speed I draw some 15.000 white rectangles on a Morph.
> Each is 4x4 pixels.
>
> On my slow computer I get 480ms for the following method:
>
> testDrawWhiteRects
>         "just try if the white rectangles are drawn correctly"
>
>         | bb |
>         bb := BitBlt new.
>         bb setDestForm: Display.
>         bb width: dotSize ; height: dotSize .
>         bb fillColor: Color white; combinationRule: Form paint.
>         rectangleCoords do: [:koords| bb destX: koords x; destY: koords y; copyBits.]

Wouldn't "Form over" be sufficient?

> If I comment out the actual transfer (the copyBits) the result is only
> 25ms.
>
> I use a Display depth of 32, no difference if I change between big and
> little endian and my Win also uses 32 bits display depth.

Try not drawing to Display but to a temporary form, and then blt this  
one.
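
A minimal workspace sketch of that suggestion, not the poster's actual
code: dotSize and rectangleCoords stand in for the instance variables of
the original method, and the 156 x 100 grid is an assumption taken from
the network size mentioned later in the thread.

```smalltalk
"Draw every rectangle into an off-screen Form, then copy it to the
 Display once. dotSize and rectangleCoords are illustrative stand-ins."
| dotSize rectangleCoords buffer bb |
dotSize := 4.
rectangleCoords := OrderedCollection new.
0 to: 99 do: [:row |
	0 to: 155 do: [:col |
		rectangleCoords add: (col * dotSize) @ (row * dotSize)]].

"Same depth as the Display, so the final copy needs no pixel conversion."
buffer := Form extent: (156 * dotSize) @ (100 * dotSize) depth: Display depth.

bb := BitBlt new.
bb setDestForm: buffer.
bb width: dotSize; height: dotSize.
bb fillColor: Color white; combinationRule: Form over.
rectangleCoords do: [:koords |
	bb destX: koords x; destY: koords y; copyBits].

"One transfer to the screen instead of one per rectangle."
buffer displayOn: Display at: 0 @ 0.
```

The many small fills still happen, but only the last line touches the
Display.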

> Tried with 3.7 and 3.9 VM. Also nearly no difference between OpenGL
> and Direct3D.

BitBlt has nothing to do with OpenGL or D3D.

> On a fast computer (1.86GHz Pentium M) results aren't much better (10
> and 270ms).
>
> Is this to be expected, or am I doing something wrong? We have
> Balloon3D which promises real time movement and I guess it finally
> relies on BitBlt. So I expected much better performance.

Balloon3D has a software rasterizer as well as a hardware-accelerated
renderer. But drawing 15,000 separate rectangles would not exactly be
speedy even in C and OpenGL. You would have to shape your drawing
algorithm around the way the hardware wants it.

What are you actually trying to achieve? It's probably much more
efficient to update a single form and then enlarge it by a factor of 4
using WarpBlt.
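
One way this could look, as a sketch under assumptions: one pixel per
value in a small Form, scaled up with Form>>magnifyBy:, which as far as
I know uses WarpBlt internally. The 156 x 100 size and the cell at
10 @ 20 are made up for illustration.

```smalltalk
"One pixel per cell, enlarged by 4 in a single operation."
| small big |
small := Form extent: 156 @ 100 depth: 32.
small fillColor: Color blue.

"Update only the cells that changed; this one cell is hypothetical."
small colorAt: 10 @ 20 put: Color white.

"Scale the whole form at once instead of drawing 4x4 rectangles."
big := small magnifyBy: 4.
big displayOn: Display at: 0 @ 0.
```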

Also, the Kedama Plugin has primitives for drawing large numbers of  
particles, if that is what you're after.

- Bert -



Re[2]: BitBlt speed WinXP

Herbert König
Hello Bert,

thank you, your tips are welcome as always.

Maybe it's clearer if I state up front what I want to achieve.

I have a neural network (self organizing feature map) and I want to
watch it learn.

Right now it has 100 nodes and 156 inputs, which determines the
size. I start out with a neighbourhood of 32, which means that 65 of
the nodes get trained for each of my 800+ samples. That was behind my
Float array question last time.

If everything goes well, after every training of any node I want to
redraw the line of 156 coefficients which are between 0 and 1.

I planned to have a white Form and a blue Form, and use
copyBitsTranslucent: coefficientMappedTo255.

Or I might use 255 Forms of the right colours and blit them to the
appropriate coordinates.

>>         rectangleCoords do: [:koords| bb destX: koords x; destY: koords y; copyBits.]

BF> Wouldn't "Form over" be sufficient?

How should I understand this? The speed is about the same. Is there
some place to read up on what the different modes actually do?

BF> Try not drawing to Display but to a temporary form, and then blt this
BF> one.

Thanks, I'll try. Do you mean that one big blt is faster than many
small blts, and that a blt to a physical display is slower than to a
Form? I haven't tried it yet because it involves an extra step.

BF> BitBlt has nothing to do with OpenGL or D3D.

I just didn't know what was eating the cycles, so I tried everything
that could be tried easily before asking here.

BF> What are you actually trying to achieve? It's probably much more  
BF> efficient to update a single form and then enlarge it by 4 using  
BF> warpblt.

See above and I'll try warpblt too.

BF> Also, the Kedama Plugin has primitives for drawing large numbers of
BF> particles, if that is what you're after.

I'll look again but I think that's different from what I want. Squeak
is big, so give me a decade or two :-))

BTW, others here seem to be into neural nets too. I don't mind
sharing what I have and what I have learned.

Thanks,

Herbert


Re: Re[2]: BitBlt speed WinXP

Bert Freudenberg

On Feb 18, 2007, at 20:48 , Herbert König wrote:

> Hello Bert,
>
> thank you, your tips are welcome as always.
>
> maybe its clearer if I state what I want to achieve in the beginning.
>
> I have a neural network (self organizing feature map) and I want to
> watch it learn.
>
> Right now it has 100 nodes and 156 inputs, that's determining the
> size. I start out with a neighbourhood of 32 which means that 65 of
> the nodes will get trained for each of my 800+ samples. That was my
> Float array question last time.
>
> If everything goes well, after every training of any node I want to
> redraw the line of 156 coefficients which are between 0 and 1.
>
> I planned to have a white Form and a blue Form, and use
> copyBitsTranslucent: coefficientMappedTo255..
>
> Or I might use 255 Forms of the right colours and blit them to the
> appropriate coordinates.
>>>         rectangleCoords do: [:koords| bb destX: koords x; destY: koords y; copyBits.]
>
> BF> Wouldn't "Form over" be sufficient?
>
> How to understand this? Speed is about the same. Is there some place
> to read up on what the different modes actually do?

The Blue Book for example. I usually look at the BitBltPlugin  
sources ...
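
As a rough orientation: the combination rules are small integers
answered by class-side methods on Form. "Form over" simply stores the
source (or fill colour) into the destination, while "Form paint" treats
zero-valued source pixels as transparent. For a solid white fill nothing
is transparent, so both should give the same picture. A quick way to
look at the values in a workspace:

```smalltalk
"Print the integer codes behind two of the combination rules."
Transcript
	show: 'over = ', Form over printString; cr;
	show: 'paint = ', Form paint printString; cr.
```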

> BF> Try not drawing to Display but to a temporary form, and then blt this
> BF> one.
>
> Thanks I'll try. Do you mean that one big blt is faster than many
> small blt*s

Yes. The way to optimize drawing performance is to let bitblt do most  
of the work in as few calls as possible.

> and that a blt to a physical display is slower than to a
> Form?

Yes, because it is also copied to the OS window.

> I didn't yet because it involved an extra step.
>
> BF> BitBlt has nothing to do with OpenGL or D3D.
>
> Just didn't know what ate the cycles so I tried everything that could
> be tried easily before asking here.

Well, the best way is to measure.
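
A sketch of how such a measurement might be set up in a workspace. The
form size, the loop bounds and the coordinates are made up for
illustration; only the BitBlt setup follows the method from the first
message, with an off-screen Form as the destination.

```smalltalk
"Time 15,000 small fills into an off-screen Form."
| form bb ms |
form := Form extent: 640 @ 480 depth: 32.
bb := BitBlt new.
bb setDestForm: form.
bb width: 4; height: 4.
bb fillColor: Color white; combinationRule: Form over.

ms := Time millisecondsToRun: [
	1 to: 15000 do: [:i |
		bb destX: (i \\ 160) * 4; destY: (i // 160) * 4; copyBits]].
Transcript show: 'off-screen fills: ', ms printString, ' ms'; cr.

"For a breakdown by method, run the same loop under the profiler."
MessageTally spyOn: [
	1 to: 15000 do: [:i |
		bb destX: (i \\ 160) * 4; destY: (i // 160) * 4; copyBits]].
```

Running the same loop once with Display and once with the off-screen
Form as the destination would show how much of the time is spent getting
pixels onto the screen.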

> BF> What are you actually trying to achieve? It's probably much more
> BF> efficient to update a single form and then enlarge it by 4 using
> BF> warpblt.
>
> See above and I'll try warpblt too.
>
> BF> Also, the Kedama Plugin has primitives for drawing large numbers of
> BF> particles, if that is what you're after.
>
> I'll look again but I think that's different from what I want. Squeak
> is big, so give me a decade or two :-))
>
> BTW, others here seem to be into neural nets too, I don't mind to
> share what I have and what I have learned.

Profile your performance. And, for a real speedup, try to rework the  
algorithm to not work on individual elements, or at least keep the  
per-element cost to a minimum.

- Bert -