I must be doing something terribly wrong.
Similar code in C# (using XNA) requires 40k-50k sprites drawn per frame to drop it to 60FPS,
but using ALLEGRO with Direct3D is only able to render 9000 sprites per frame before falling below 60FPS. Can someone please help me figure out what I am doing wrong?
Main.cs only calls RendererTest();
Even if I call the draws directly from RendererTest (the commented section),
I still get ~9000 draws per frame.
I am fairly certain that I am using DirectX. When I run it using OpenGL, the number of sprites I can draw per frame drops even further.
The profiler says that the majority of time is being spent in al_created_d3d_bitmap. I can't figure out why that would be the case, because I initialize everything before the main loop, so by the time I start the profiler it shouldn't even be called.
RendererTest.cs
Renderer.h
std::sort(renderList.begin(),renderList.end());
This might be part of your problem: you are sorting the renderList every frame, and sorting is quite an expensive operation. (Sort only on frames when something new was added?)
I believe XNA does not use the CPU to sort, but the GPU. (Depth-buffering)
Also you might want to use al_draw_textf for your formatting instead of creating and appending to strings every frame. I think it performs better, but I'm not sure.
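The "sort only when something new was added" idea from this post could look roughly like the sketch below. RenderItem, RenderList and the depth key are hypothetical names (the original Renderer.h is not shown in the thread), so treat this as an illustration of the dirty-flag pattern rather than a patch for the posted code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical draw-list entry; the real element type of renderList is not shown.
struct RenderItem {
    int depth;     // sort key, lower values draw first
    int spriteId;  // stand-in for whatever identifies the sprite
    bool operator<(const RenderItem& other) const { return depth < other.depth; }
};

class RenderList {
public:
    void add(const RenderItem& item) {
        items_.push_back(item);
        dirty_ = true;  // the order may now be wrong
    }

    // Returns the items in depth order, sorting only if the list changed
    // since the previous call.
    const std::vector<RenderItem>& sorted() {
        if (dirty_) {
            std::stable_sort(items_.begin(), items_.end());
            dirty_ = false;
            ++sortCount_;
        }
        return items_;
    }

    // Number of sorts actually performed (for measurement only).
    int sortCount() const { return sortCount_; }

private:
    std::vector<RenderItem> items_;
    bool dirty_ = false;
    int sortCount_ = 0;
};
```

Calling sorted() once per frame then costs a single branch on frames where nothing was added. If sprites can change depth while they are in the list, that code path would need to set the dirty flag as well.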
Use al_hold_bitmap_drawing and atlases, if you're not already.
That's actually the biggest problem with allegro for me, I have terrible fps too, even when just drawing the same image over and over in a simple test.
Concerning sorting: no, sorting shouldn't be that slow. I also use isometric sorting in my game, and it hardly takes up 5% of my program's time; most of the time is actually taken by Allegro's draw functions.
However, your words about fps in XNA reminded me that I never compared Allegro's fps on my PC to the performance of anything else. I might try SDL and compare results in the future.
It'd be very helpful if a self-contained example was posted. I certainly would like to know how al_created_d3d_bitmap shows up in a profiler (and what is the identity of that function, because no such function actually exists).
Similar code in C# (using XNA) requires 40k-50k sprites drawn per frame to drop it to 60FPS,
but using ALLEGRO with Direct3D is only able to render 9000 sprites per frame before falling below 60FPS. Can someone please help me figure out what I am doing wrong?
Actually, that sounds about right if you're using mid-range hardware.
Allegro has a lot of overhead going on in the background when you perform rendering calls, which is handled by the CPU, so if you need to exceed Allegro's limitations you need to get a little more creative. For instance, prepping everything for render using a single large texture in an array of vertices and texture coordinates, then rendering that entire thing in a single call to al_draw_prim().
One other thing you could do is look into calling al_hold_bitmap_drawing() at the start and end of your rendering blocks and making sure that all sprites using the same source bitmap (not necessarily the same sub-bitmap) are grouped together in their drawing calls, as the fewer times you have to switch the active texture on the GPU, the better.
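To illustrate the grouping part of that advice, here is a small sketch that reorders a draw list so sprites sharing a source bitmap are adjacent, and counts how many texture switches each ordering would cause. The Sprite struct and its textureId field are made up for the example (textureId stands in for the parent ALLEGRO_BITMAP); in a real Allegro program the draw loop over the grouped list would sit between al_hold_bitmap_drawing(true) and al_hold_bitmap_drawing(false).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sprite record; textureId stands in for the parent bitmap.
struct Sprite {
    int textureId;
    float x, y;
};

// Counts how many times the bound texture would change if the sprites
// were drawn in the given order (the first bind counts as one switch).
std::size_t countTextureSwitches(const std::vector<Sprite>& sprites) {
    std::size_t switches = 0;
    for (std::size_t i = 0; i < sprites.size(); ++i) {
        if (i == 0 || sprites[i].textureId != sprites[i - 1].textureId) {
            ++switches;
        }
    }
    return switches;
}

// Reorders sprites so that those sharing a texture are adjacent.
// stable_sort keeps the relative order of sprites within one texture.
void groupByTexture(std::vector<Sprite>& sprites) {
    std::stable_sort(sprites.begin(), sprites.end(),
                     [](const Sprite& a, const Sprite& b) {
                         return a.textureId < b.textureId;
                     });
}
```

Note that regrouping changes the draw order across textures, so it only fits cases where overlapping sprites from different bitmaps do not depend on that order (or where depth is handled some other way).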
But I have to wonder, what the heck you could possibly be doing that requires using over 9000 sprites per frame...
@Kris Asick:
Do you have a link or can you elaborate on the part about using al_draw_prim()?
Also, I don't particularly require 9000 sprites and certainly not 50k per frame, but I wanted to see what the max was without any game logic so I could benchmark how badly various bits of game logic are slowing the game down... I just figured that if I started with the ability to draw 50k, it would give me more wiggle room for AI and pathfinding than if I started at 9k... does that make sense or am I thinking about it wrong?
@Trent Gamblin:
The tile atlas doesn't apply in this case because it is a single sprite, but the al_hold_bitmap_drawing certainly did... 50k sprites per frame before hitting 60fps, so the atlases will certainly come in handy when handling actual sprites/rendering.
@SiegeLord:
Sorry, I think it was al_d3d_create_bitmap. I don't think it is part of the public API; it must be part of Allegro's internals... but I know that the profiler listed it as taking 50-80% of the time used by the RunRenderTest function (the main function in the code below).
Self contained source:
Requires rewriting LoadSprite to load a 32x32 png
Requires rewriting LoadFont to load a font
There are a few things to mention.
1. You forgot al_hold_bitmap_drawing in your code. Even if you're only drawing one sprite, it helps if you're drawing it many times at once. Wrap the loop with a pair of hold drawing calls, the first with true and the second with false, and your drawing times should decrease substantially.
2. clock() returns total processor time on Linux/Unix, and total elapsed time on Windows. It's not a good way to measure time. Use an ALLEGRO_TIMER to tick once a second if you want to count fps and ups.
See https://www.allegro.cc/forums/thread/599033/791757#target for details.
And this old thread by GullRaDriel is gold if you're really interested in timing :
https://www.allegro.cc/forums/thread/585140
3. You forgot al_init_font_addon.
void main()
Only int main() or int main(int argc, char *argv[]) is legal C/C++.
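For point 2, the post recommends an ALLEGRO_TIMER. As a library-independent illustration of the same idea (count frames against a monotonic wall clock instead of clock()), here is a sketch using std::chrono::steady_clock from the standard library. This is a swapped-in technique, not what the post prescribes, and FpsCounter and nowSeconds are hypothetical names.

```cpp
#include <cassert>
#include <chrono>

// Monotonic wall-clock time in seconds; unlike clock(), this measures
// elapsed real time on every platform.
double nowSeconds() {
    using Clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(Clock::now().time_since_epoch()).count();
}

class FpsCounter {
public:
    explicit FpsCounter(double startSeconds) : windowStart_(startSeconds) {}

    // Call once per rendered frame with the current time. Returns true
    // when at least one second has elapsed and fps() holds a new reading.
    bool frame(double now) {
        assert(now >= windowStart_);
        ++frames_;
        const double elapsed = now - windowStart_;
        if (elapsed < 1.0) {
            return false;
        }
        fps_ = frames_ / elapsed;
        frames_ = 0;
        windowStart_ = now;
        return true;
    }

    double fps() const { return fps_; }

private:
    double windowStart_;
    int frames_ = 0;
    double fps_ = 0.0;
};
```

In a render loop you would construct the counter with nowSeconds(), call frame(nowSeconds()) after each flip, and only rebuild the on-screen FPS text when it returns true.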
Do you have a link or can you elaborate on the part about using al_draw_prim()?
Unfortunately, no. I'm speaking from experience: al_draw_prim() has a huge overhead cost per call, but it can send massive amounts of data to the video card in that single call. Used correctly, it can be very powerful and can perform better than multiple al_draw_bitmap() calls, but to use it you need to manually assign all of your texture coordinates, triangle coordinates, vertex colours, etc., and put them all into a massive array you can send to the function.
Beyond that, just read the manual.
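As a rough illustration of the "massive array" described above, the sketch below turns a list of sprites into a triangle list with six vertices per sprite. Vertex here is a simplified stand-in for ALLEGRO_VERTEX (colour is omitted) and SpriteQuad is a hypothetical type; the texture coordinates are assumed to be in pixels of a shared atlas. With Allegro, the equivalent array of ALLEGRO_VERTEX would then be submitted in one al_draw_prim() call using ALLEGRO_PRIM_TRIANGLE_LIST; check the primitives addon manual for the exact details.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for ALLEGRO_VERTEX: position plus texture coordinates.
struct Vertex {
    float x, y, z;
    float u, v;
};

// Hypothetical per-sprite input: where to draw and where to sample from.
struct SpriteQuad {
    float x, y;        // top-left corner on screen
    float w, h;        // size in pixels
    float srcX, srcY;  // top-left corner inside the atlas texture
};

// Builds one triangle list covering every sprite: two triangles
// (six vertices) per quad, all sharing the same atlas texture.
std::vector<Vertex> buildTriangleList(const std::vector<SpriteQuad>& sprites) {
    std::vector<Vertex> out;
    out.reserve(sprites.size() * 6);
    for (const SpriteQuad& s : sprites) {
        const Vertex tl{s.x,       s.y,       0.0f, s.srcX,       s.srcY};
        const Vertex tr{s.x + s.w, s.y,       0.0f, s.srcX + s.w, s.srcY};
        const Vertex br{s.x + s.w, s.y + s.h, 0.0f, s.srcX + s.w, s.srcY + s.h};
        const Vertex bl{s.x,       s.y + s.h, 0.0f, s.srcX,       s.srcY + s.h};
        out.insert(out.end(), {tl, tr, br, tl, br, bl});
    }
    assert(out.size() == sprites.size() * 6);
    return out;
}
```

The point of the layout is that the per-call overhead is paid once for the whole batch instead of once per sprite, which only works when all sprites in the batch sample from the same texture.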
Also, I don't particularly require 9000 sprites and certainly not 50k per frame, but I wanted to see what the max was without any game logic so I could benchmark how badly various bits of game logic are slowing the game down... I just figured that if I started with the ability to draw 50k, it would give me more wiggle room for AI and pathfinding than if I started at 9k... does that make sense or am I thinking about it wrong?
You're thinking about it wrong. You'd be surprised how much game logic you can process without affecting the framerate on today's computers. Heck, I was running particle engines at 70 FPS using Allegro 4 back in 2000 on original Pentium CPUs.
Plus, if your game logic DOES slow the game down, there are almost certainly going to be alternative approaches. For instance, if you have a massive tile-based world and various objects in that world can update themselves at random, instead of scanning every single tile every game tick and performing those random calculations, you could just scan a handful of random tiles and make it more likely the random tile event will happen when a tile gets scanned. Or, if you have tile entities that need to be updated constantly, instead of scanning the entire map for changes, you make a list of each tile that needs to be updated every game tick and update that list as such tiles are placed or removed.
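The second suggestion (keep a list of tiles that need updating instead of scanning the whole map) could be sketched as below. TileWorld, markActive and markInactive are hypothetical names, and the "update" is just a counter increment standing in for real tile logic.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

class TileWorld {
public:
    TileWorld(int width, int height)
        : width_(width),
          tiles_(static_cast<std::size_t>(width) * static_cast<std::size_t>(height), 0) {}

    // Register or unregister a tile as needing an update every tick.
    void markActive(int x, int y) { active_.insert(index(x, y)); }
    void markInactive(int x, int y) { active_.erase(index(x, y)); }

    // Updates only the registered tiles and returns how many were touched,
    // so the cost scales with the active set rather than the map size.
    std::size_t update() {
        for (std::size_t i : active_) {
            ++tiles_[i];  // placeholder for the real per-tile logic
        }
        return active_.size();
    }

    int value(int x, int y) const { return tiles_[index(x, y)]; }

private:
    std::size_t index(int x, int y) const {
        assert(x >= 0 && x < width_ && y >= 0);
        const std::size_t i =
            static_cast<std::size_t>(y) * static_cast<std::size_t>(width_) +
            static_cast<std::size_t>(x);
        assert(i < tiles_.size());
        return i;
    }

    int width_;
    std::vector<int> tiles_;
    std::unordered_set<std::size_t> active_;
};
```

On a 1000x1000 map with a few hundred active tiles, each tick touches those few hundred entries instead of a million.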
So I profiled this on Windows, and didn't get anything sensible (I used the VerySleepy profiler). I checked, and there are no calls to _al_create_d3d_bitmap inside the draw loop, so if your profiler shows that as the main function, your profiler is broken.
Anyway, I tested a few other things: I tried al_hold_bitmap_drawing as others have already suggested, and that made things 5x faster. Then I tried a light-weight replacement based on al_draw_prim in my FastDraw library and got it nearly 2x faster still (nearly 10x faster relative to the original). I am not aware of other techniques that will support faster drawing than the approach taken in that library, given sprites that change every frame. If you can guarantee that you won't change your sprites often (e.g. it's a level tilemap) then Allegro 5.1 provides a feature to take it even further, potentially adding a 4x to 5x speed improvement on top of FastDraw's performance (i.e. perhaps as much as 50x faster than your current code).
SiegeLord, I quickly looked at fast_draw.c, is it essentially the same thing as using al_draw_prim manually? I often do batch sprites up and draw them with al_draw_prim, would there be any difference besides ease of use?
It's just for ease of use.
Ok. I can see that being useful. I'm used to doing it manually. I'll try to remember it exists.