|Resolution Independence for 2D Tile-Based|
Hi! Loving Allegro 5. I'm working on my new ASCII roguelike. I've run into some interesting quirks with Allegro that I'm curious about, so I thought I'd post here. A penny for your thoughts.
So I decided to write my first Allegro 5 game with resolution independence, and it was taking a long time to render. It was moving through an array of 2,000 screen-location objects and rendering in, I don't know, 20 ms. I then implemented a "dirty bit" for each character, so it checked the dirty bit and only re-rendered a tile (TTF letter) when it had changed. That got rendering time down to less than 3 ms per call to render(), sometimes 2 ms, which was pretty good.
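For context, the dirty-bit pass looks roughly like this (a minimal sketch with hypothetical names; the real version makes an al_draw_glyph()/al_draw_text() call where the comment is):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cell: the glyph to draw plus a dirty flag. */
typedef struct {
    char glyph;
    bool dirty;
} Cell;

/* Walk the grid and redraw only cells whose dirty flag is set.
   Returns how many cells were actually redrawn, so it's easy to
   check that clean cells really are being skipped. */
size_t render_dirty(Cell *cells, size_t count)
{
    size_t redrawn = 0;
    for (size_t i = 0; i < count; i++) {
        if (cells[i].dirty) {
            /* al_draw_glyph(font, color, x, y, cells[i].glyph); */
            cells[i].dirty = false;
            redrawn++;
        }
    }
    return redrawn;
}
```

The scan itself is cheap; as described below, the cost turned out to be elsewhere.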
But I checked without using the buffer and I was getting times of 0.2 ms per update. I can do better, I thought. So I changed the pipeline into an STL list, so it didn't scan over the whole data set at all (not even to look at dirty bits). But surprisingly, the render time did not go down. It was stuck firmly at about 2 ms per call. If I went from a 32 to a 64 point size (it's a font) it would go up to 8 or 10 ms. Maybe because I was using a bitmap with 4x the data. Makes sense, I thought.
So I had the idea: ok, maybe it's the size of the scaled bitmaps? I tried drawing smaller scaled bitmaps -- given that larger bitmaps were slower, surely drawing only the part of the scaled bitmap that changed would be faster. I went from:
al_draw_scaled_bitmap(buffer, 0, 0, bufWidth, bufHeight, scaleX, scaleY, scaleW, scaleH, 0);
to:
al_draw_scaled_bitmap(buffer, sx, sy, sw, sh, dx, dy, dw, dh, 0);
Amazingly -- it took exactly the same time. 2 to 3 ms per render.
I was flabbergasted. Eventually all the info I could find (and I did search around on here for an hour or so before posting this) pointed to the calls to al_set_target_bitmap() and al_set_target_backbuffer() as the real culprits. I couldn't imagine why. But I realized that I could never stand to have a rendering time of 2 or 3 ms in an ASCII roguelike. That represents an FPS of about 400! Unacceptable. Don't scoff -- I have a liquid-cooled 5960X running this, and it was taking 230 ms per call for a 128-point font (i.e. sufficient to have zero distortion in fullscreen). A lesser processor would choke and die -- but even the times I was getting at 64 point seemed unreasonable for what is just an ASCII terminal simulation.
So I had an idea, but it's time-consuming to implement, and I wanted to ask here whether it's an advisable thing to do, or the best thing to do.
As it happens, my on-screen characters have a border around them, i.e. they don't actually touch, as long as the resolution is high enough. Here's my plan: I will pre-render each glyph into a high-resolution (say 144x256) bitmap and map these bitmaps to ASCII codes. I will then use them as a tilemap, bypassing the TTF/font part of Allegro entirely.
Then, instead of calling al_draw_text() on a per-character basis, I would just draw these bitmaps -- scaled -- directly to the screen: no intermediate bitmap buffers to scale, and no bitmaps or backbuffers to target. It would take ~10 MB of memory, I think, to store a font at this resolution (256 pt), but the payoff should be a 0.2 ms rendering time with crisp, clear 9x16 ASCII at any screen resolution. I might be able to get by with 128 pt.
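The bookkeeping for this is just arithmetic: given an ASCII code, find the glyph's sub-rectangle inside the big pre-rendered bitmap. A sketch, assuming a 16x16 grid of 144x256 cells (layout and names are mine, not settled):

```c
/* Hypothetical atlas layout: 256 glyphs pre-rendered into one big
   bitmap, 16 columns by 16 rows, each cell 144x256 pixels.  Given an
   ASCII code, compute the source position to pass to
   al_draw_scaled_bitmap(atlas, sx, sy, GLYPH_W, GLYPH_H,
                         dx, dy, dw, dh, 0);                       */
#define GLYPH_W    144
#define GLYPH_H    256
#define ATLAS_COLS 16

void glyph_src_rect(unsigned char code, int *sx, int *sy)
{
    *sx = (code % ATLAS_COLS) * GLYPH_W;   /* column within the atlas */
    *sy = (code / ATLAS_COLS) * GLYPH_H;   /* row within the atlas    */
}
```

Since every draw then samples the same texture, there's no per-character target switching at all.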
Right now, it's not a huge deal to have 64-point glyphs and ~10 ms rendering times with the scaled bitmaps. I just think I can do better. Is my idea sound? I'd hate to spend all day coding this only to end up with a slow rendering time. Thanks for your advice!
First thought whenever I hear "fixed time/framerate issues":
1) vsync -- also check your driver control panel to make sure "Force vsync" isn't on.
2) Make sure it's not related to locking a bitmap. Locking a bitmap copies it from VRAM to system RAM so it's faster to work on, and unlocking it sends it back over the bus. The bus is a big speed factor.
How... many of those are you calling? Every one of them is a texture change, which can easily become a bottleneck if you're doing thousands or tens of thousands. People tend to pack all their tiles into a couple of "texture atlas" megatextures so the number of texture switches shrinks by orders of magnitude.
But perhaps I'm not fully grasping your situation.
Edit: Have you done actual profiling, or are you guessing where the slowdowns are? It'd probably be useful to see some code snippets and/or some raw profiling numbers: number of calls, time spent in each call, time spent in total.
Also, if you're on Linux, a handy utility to add onto the list is strace.
strace -c ./my_program
strace logs all system calls to the kernel. Adding -c gives you a summary of how long they took and how many times each was called, instead of the normal mode, which simply dumps a text line for every call to the console. For example, if memory allocation is a problem, it'll show lots of time spent in mmap (which malloc uses under the hood). If you were accidentally recreating a buffer every frame, it would show up.
If I recall correctly, the Allegro TTF addon isn't all that optimized for rendering huge characters. Your caching approach could help, although the TTF addon attempts to cache as well. As a quick check, you could try a bitmap font instead of a TTF font and see if that helps anything.
What I do for my games is implement everything at a virtual resolution of 640x480, and then use Allegro's transformations to upscale to the screen resolution, with a few black rectangles drawn over the edges if letterboxing is needed. This gives me fine performance at the cost of a somewhat lower on-screen resolution. I don't mind the latter since I want the retro look anyway.
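The mapping itself is just a uniform scale plus centering offsets. A sketch of the arithmetic (the function name is mine; in real code the results would feed al_build_transform()/al_use_transform()):

```c
/* Compute a uniform scale and letterbox offsets that map a virtual
   resolution onto an arbitrary screen while preserving aspect ratio. */
typedef struct { float scale, off_x, off_y; } Fit;

Fit fit_virtual(int virt_w, int virt_h, int screen_w, int screen_h)
{
    float sx = (float)screen_w / virt_w;
    float sy = (float)screen_h / virt_h;
    Fit f;
    f.scale = sx < sy ? sx : sy;   /* uniform scale: take the smaller */
    /* Center the scaled image; the leftover margins are the bars. */
    f.off_x = (screen_w - virt_w * f.scale) / 2.0f;
    f.off_y = (screen_h - virt_h * f.scale) / 2.0f;
    return f;
}
```

For 640x480 on a 1920x1080 screen this gives a 2.25x scale with 240-pixel pillarbox bars on each side, which is where the black rectangles go.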
As an aside, 400 FPS is IMHO fine. Even at 30 FPS most people will experience smooth enough animation.