As some of you know, I spent some time looking into how to make blts as
fast as possible. The motivation was that the LookEnhancements in 3.9
put a heavy load on BitBlt during window resizing and other common
operations that people expect to be very fluid.

The result was a promising architecture, but I ran out of steam after
getting it to a 'first demo' stage. It runs from within Squeak, at
depths 1 and 32, with combination rules 3 and 25 (the most common), but
not as a drop-in replacement. The prospect of getting it to work with
the many modes of operation that Squeak's BitBlt supports has scared me
off for now, but it should be possible.

The demonstrated performance gain for the supported modes ranges from
1.3x to 14x. Most of the performance opportunities are in color
conversion; Squeak's BitBlt is fairly close to optimal when no
conversion is needed. Color conversion is also the one thing I never
completely figured out, and I was worried that my highly specific color
conversion routines would not offer the flexibility that Squeak's
mechanism offers.
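
To give a rough idea of what a depth-specific conversion routine looks
like, here is a minimal sketch of a 1-bit to 32-bit blt for rules 3 and
25. It is not the code from the project. The names (blt_1_to_32,
RULE_OVER, RULE_PAINT) are hypothetical, and it assumes that source
pixels are packed most-significant-bit first in 32-bit words, that a
two-entry table maps each source pixel to a 32-bit color, and that rule
25 treats a mapped value of 0 as transparent.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; an illustrative sketch, not the project's code. */
enum { RULE_OVER = 3, RULE_PAINT = 25 };

/* Expand `width` 1-bit source pixels into 32-bit destination pixels.
 * Assumption: source pixels are packed MSB-first in 32-bit words.
 * map[0] and map[1] hold the 32-bit color for each source pixel value. */
static void blt_1_to_32(const uint32_t *src, uint32_t *dst, size_t width,
                        const uint32_t map[2], int rule)
{
    for (size_t x = 0; x < width; x++) {
        uint32_t word = src[x / 32];
        uint32_t bit = (word >> (31 - (x % 32))) & 1u;
        uint32_t color = map[bit];

        /* Assumption: under rule 25 a mapped value of 0 is transparent,
         * so the destination pixel is left untouched. */
        if (rule == RULE_PAINT && color == 0)
            continue;

        dst[x] = color;  /* rule 3 always stores the mapped color */
    }
}
```

Because the depths and the rule are fixed, the inner loop has nothing to
dispatch on, which is where a specialized routine can save time. The
cost is the one mentioned above: every depth pair and rule needs its own
routine, whereas a general mechanism handles them all with one path.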

I wanted to make this available in case anyone is interested. I also
want to thank all the people on IRC who put up with my questions.

A brief description and the source code files are located here:
http://minnow.cc.gatech.edu/squeak/5845

Eddie