Many thanks to Ian for tackling the work to integrate the faster bitblt code into the unix tree; cmake is fun, for certain definitions of 'fun'.
On 15-08-2013, at 8:06 AM, Ian Piumarta <[hidden email]> wrote:
> I'd love to know the results if anyone measures the difference in performance between regular and optimised BitBlt.
Ben did a *lot* of performance measuring when developing the code; I mean *lots*. He built a full test harness to run all the test cases we could come up with wrt combination rule, masking, shifting, width, depth, color maps, etc etc - mostly generated by instrumenting the system and logging all those values. He even threw in 'fuzz testing' to probe the limits and make sure things wouldn't explode.
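For anyone curious what that style of testing looks like in outline, here is a minimal sketch in Python. It is not Ben's harness: every name in it (RULES, blit_reference, blit_fast, fuzz) is hypothetical, the four rules are a tiny subset of the real combination rules, and blit_fast is only a stand-in for an optimised path. The idea is simply to throw randomly chosen rule, depth, offset and width values at both versions and log any case where they disagree.

```python
import random

# Hypothetical subset of combination rules; the real BitBlt has many more.
RULES = {
    "and": lambda s, d: s & d,
    "or": lambda s, d: s | d,
    "xor": lambda s, d: s ^ d,
    "store": lambda s, d: s,
}

def blit_reference(src, dst, rule, x, width):
    """Slow but obviously correct: combine one pixel at a time."""
    op = RULES[rule]
    out = list(dst)
    for i in range(width):
        out[x + i] = op(src[i], dst[x + i])
    return out

def blit_fast(src, dst, rule, x, width):
    """Stand-in for an optimised path: handles the whole span at once."""
    op = RULES[rule]
    out = list(dst)
    out[x:x + width] = [op(s, d) for s, d in zip(src[:width], dst[x:x + width])]
    return out

def fuzz(trials=1000, seed=0):
    """Compare both versions on random inputs; return the failing cases."""
    rng = random.Random(seed)  # fixed seed so any failure can be replayed
    failures = []
    for _ in range(trials):
        depth = rng.choice([1, 2, 4, 8, 16, 32])
        mask = (1 << depth) - 1
        size = rng.randint(1, 64)
        width = rng.randint(0, size)      # includes the zero-width edge case
        x = rng.randint(0, size - width)  # keep the span inside the destination
        rule = rng.choice(sorted(RULES))
        src = [rng.randint(0, mask) for _ in range(width)]
        dst = [rng.randint(0, mask) for _ in range(size)]
        expected = blit_reference(src, dst, rule, x, width)
        if blit_fast(src, dst, rule, x, width) != expected:
            failures.append((rule, depth, x, width))
    return failures
```

Calling fuzz() returns a list of (rule, depth, x, width) tuples for any mismatches, so an empty list means the two versions agreed on every trial.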
In general we saw improvements typically in the 2-3x area but some cases hit 10x. The framework of code is extensible if needed, so any important new cases could be added in the future. It's very much ARM v6k architecture based, specific to the Pi. The general principles would likely benefit any machine, though to be honest a modern fully leaded desktop machine has such huge caches, wide memory buses and high performance that I'd be startled to see much effect. For iOS & Android machines though, I suspect there is some advantage.
tim
--
tim Rowledge;
[hidden email];
http://www.rowledge.org/tim
Last one out, turn off the computer!