I couldn't sleep very well last night, but that gave me some time to think about how to optimize collegro.
Currently, the only collision tests that do not run in constant time are, of course, the bit map tests.
To answer your question of why not bit pack all the bits into an unsigned char, at least for right now...
1 - Really complicated code. I may be able to test 8 bits at once, but I would probably need several extra operations and tests to align the bits, which would make the speed increase on certain tests marginal. Meanwhile, the code would come close to unreadable.
2 - There are a lot of functions that need to grab one bit at a time. For those functions, storing each bit separately is actually faster.
3 - The horizontal / vertical flipping flags would be difficult and slower to implement using bit fields. I would have to rebuild and store the horizontally flipped, vertically flipped, and horizontally/vertically flipped bit masks in memory all at the same time, at the very least.
4 - Optimizations like that come last, after you get the code working. =P
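To illustrate points 2 and 3, here is a rough sketch of why one byte per bit keeps single-bit reads and flipping cheap: a flip is just a coordinate remap at lookup time, so no extra flipped copies of the mask are needed. The names ByteMask, FLIP_H, FLIP_V and bytemask_get are hypothetical and only for illustration, not collegro's actual API.

```c
#include <stddef.h>

/* Hypothetical flip flags, for illustration only. */
enum { FLIP_H = 1, FLIP_V = 2 };

/* Hypothetical mask layout: one unsigned char (0 or 1) per pixel,
   stored row by row, w * h entries in total. */
typedef struct {
    int w, h;
    const unsigned char *bits;
} ByteMask;

/* Read one mask bit at (x, y), applying the flip flags by remapping
   the coordinates instead of rebuilding the mask.
   The caller must pass 0 <= x < w and 0 <= y < h. */
static int bytemask_get(const ByteMask *m, int x, int y, int flags)
{
    if (flags & FLIP_H) x = m->w - 1 - x;
    if (flags & FLIP_V) y = m->h - 1 - y;
    return m->bits[(size_t)y * (size_t)m->w + (size_t)x];
}
```

With packed bits, the same horizontal flip would mean reversing the bit order inside every word, which is where the pre-built flipped masks would come in.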
But here are some ideas that could give some significant speed increases...
1) Using MMX, SSE, SSE2, etc. to test multiple sets of data at once.
My only question would be how portable can the code be? I don't want to force the user to set flags and stuff for different systems.
2) Instead of packing into unsigned chars, pack into unsigned ints. Why test 8 bits at once when we can test 32 or 64 at once, using the data size the processor likes?
The only drawback... very complicated code.
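To give an idea of what idea 2 involves, here is a minimal sketch of testing one packed row against another, 32 pixels per AND. It assumes a made-up layout (bit 0 of word 0 is the leftmost pixel, and row b sits `offset` pixels to the right of row a); the function name packed_rows_collide is hypothetical too. The shifting and the spill into the next word are exactly the alignment work that makes this harder to read.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if any set pixel of row b overlaps a set pixel of row a
   when b starts `offset` pixels to the right of a's first pixel.
   Assumed layout: 32 pixels per word, least significant bit first. */
int packed_rows_collide(const uint32_t *a, size_t a_words,
                        const uint32_t *b, size_t b_words,
                        size_t offset)
{
    size_t word_off = offset / 32;              /* whole words to skip in a */
    unsigned bit_off = (unsigned)(offset % 32); /* leftover shift in bits   */

    for (size_t k = 0; k < b_words; ++k) {
        size_t w = k + word_off;
        if (w >= a_words)
            break;                              /* b runs past the end of a */

        /* Part of b[k] that lands in a[w]. */
        if (a[w] & (b[k] << bit_off))
            return 1;

        /* Part of b[k] that spills into a[w + 1]; skipped when aligned,
           since shifting a 32-bit value by 32 is undefined in C. */
        if (bit_off != 0 && w + 1 < a_words &&
            (a[w + 1] & (b[k] >> (32 - bit_off))))
            return 1;
    }
    return 0;
}
```

A full mask test would run this once per overlapping row, so the per-pixel loop shrinks by up to a factor of 32, at the cost of the alignment logic above.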
Whoa, that must be the hugest function I've seen to perform a bounding box collision... You can do it in 2 lines (4 tests and 4 sums), as I did in the other collision thread.
It's long, but it's not slow; it's just a lot of if/else statements. =P I was mainly concerned with writing very understandable code when I wrote it, and I already planned on compressing it into less code for the next release anyway. =)
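For reference, a compact box test along the lines of the "4 tests and 4 sums" mentioned above might look something like the sketch below. The Box struct and the boxes_overlap name are made up for illustration and are not collegro's actual code or the version from the other thread. It assumes x, y is the top-left corner and treats boxes that only touch along an edge as not colliding.

```c
/* Hypothetical axis-aligned box: top-left corner plus size. */
typedef struct {
    int x, y, w, h;
} Box;

/* Four comparisons and four additions: the boxes overlap only if
   they overlap on the x axis and on the y axis at the same time. */
int boxes_overlap(const Box *a, const Box *b)
{
    return a->x < b->x + b->w && b->x < a->x + a->w &&
           a->y < b->y + b->h && b->y < a->y + a->h;
}
```

Since && short-circuits, most non-colliding pairs bail out after the first comparison or two, so this should stay constant time just like the longer if/else version.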