My 3D engine relies on using al_draw_indexed_prim to draw the 3D models in my scene, but I have found that there is a severe performance bottleneck with this method. While my actual production code is immensely more complex, this simple example should demonstrate how I typically (and probably not ideally) use the pipeline:
In this example, I am drawing 10,000 "Models". Each model is just a 3-vertex triangle and is drawn through al_draw_indexed_prim. (You'll have to hard crash the running program yourself)
In my real code, each model is more complex, is projected in 3D, and has textures.
Now, in the example program above, I am drawing 10,000 models with 3 vertexes each, so, 30,000 vertexes each loop. It's my understanding that a cheap modern graphics card can push ~500 million vertexes in a second, so what gives? Why is there already such a substantial lag at 30,000 vertexes? What should I be doing instead?
Here are some pretty screenshots:
example program from above (with transforms on)
[screenshot 609131]
example from production (already laggy):
[screenshot 609130]
I didn't have to do anything to crash it.
Program received signal SIGSEGV, Segmentation fault.
0x000000004009fee7 in ?? ()
(gdb) bt
#0  0x000000004009fee7 in ?? ()
#1  0x00007ffff256f53b in ?? () from /usr/lib64/libnvidia-glcore.so.331.20
#2  0x00007ffff25731ef in ?? () from /usr/lib64/libnvidia-glcore.so.331.20
#3  0x00007ffff21c2012 in ?? () from /usr/lib64/libnvidia-glcore.so.331.20
#4  0x00007ffff78c0cfa in _al_draw_prim_indexed_opengl () from /usr/local/lib64/liballegro_primitives.so.5.1
#5  0x00007ffff78c3a63 in al_draw_indexed_prim () from /usr/local/lib64/liballegro_primitives.so.5.1
#6  0x0000000000401265 in Model::draw (this=0x7fffefb48010) at t.cpp:53
#7  0x0000000000401010 in main (argc=1, argv=0x7fffffffe0c8) at t.cpp:89
(gdb)
[EDIT]
Well if I use the allegro debug libs I get this instead.
t: /home/prog/allegro-5.1.8/addons/primitives/primitives.c:109: al_draw_indexed_prim: Assertion `indices' failed.
[EDIT 2]
If I put in a line
int indices[] = { 0, 1, 2 };
and change line 53 to
al_draw_indexed_prim(&vertex[0], NULL, NULL, indices, 3, ALLEGRO_PRIM_TRIANGLE_LIST);
it doesn't crash but only shows one triangle with a round thingee.
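The failed `indices' assertion above suggests that a null index array reached al_draw_indexed_prim. A minimal sketch of a guard that could run before the draw call is shown here; indices_valid is a hypothetical helper for illustration, not part of Allegro.

```cpp
#include <cstddef>

// Hypothetical helper (not an Allegro function): returns true when
// `indices` is non-null and every entry refers to a vertex inside an
// array that holds `num_vertices` vertices.
bool indices_valid(const int* indices, std::size_t num_indices, std::size_t num_vertices) {
    if (indices == nullptr) {
        return false;
    }
    for (std::size_t i = 0; i < num_indices; ++i) {
        if (indices[i] < 0 || static_cast<std::size_t>(indices[i]) >= num_vertices) {
            return false;
        }
    }
    return true;
}
```

Calling this in a debug build just before drawing would catch the missing array in release libraries too, where the assertion is compiled out and the driver crashes instead.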
Oh so weird, how did I miss that... and how did my program not crash as a result??? OK, I'll edit my first post.
You should be using VBOs or something, you'll never get anywhere near 500 mil verts without batching them and keeping them on the GPU. It's kind of like al_hold_bitmap_drawing only worse.
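To put rough numbers on this, here is a small sketch of the arithmetic. The 60 frames per second target is an assumption for illustration and does not come from the original post; the function names are made up for the example.

```cpp
#include <cstdint>

// Vertices submitted per second for a given scene and frame rate.
constexpr std::int64_t vertices_per_second(std::int64_t models,
                                           std::int64_t verts_per_model,
                                           std::int64_t fps) {
    return models * verts_per_model * fps;
}

// Draw calls issued per second when every model gets its own call.
constexpr std::int64_t draw_calls_per_second(std::int64_t models, std::int64_t fps) {
    return models * fps;
}
```

With 10,000 models of 3 vertices at an assumed 60 FPS, that is 1,800,000 vertices per second, far below the 500 million figure, but also 600,000 separate draw calls per second. The vertex count is small; the number of calls is what stands out.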
it doesn't crash but only shows one triangle with a round thingee.
You have to uncomment the transform lines with the random_float calls
Each al_draw_prim call has some overhead, so if you want to get better performance, you want to minimize the number of those calls. Additionally, as Trent said, if you use vertex buffers, then things will go even faster. There's a vertex buffer API in the primitives addon. There's basically no reason not to use it (even for dynamically updated data, vertex buffers tend to be faster than al_draw_prim).
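One way to cut the number of calls is to merge many small models into a single vertex array and a single index array, then draw the whole thing once. The sketch below shows only the merging step and uses stand-in types so it compiles without Allegro; Vertex, Model, Batch and build_batch are hypothetical names, and real code would use ALLEGRO_VERTEX (or a custom vertex declaration) instead.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for ALLEGRO_VERTEX so the sketch compiles without Allegro.
struct Vertex {
    float x, y, z;
};

// Hypothetical model type: its own vertices plus indices into them.
struct Model {
    std::vector<Vertex> vertices;
    std::vector<int> indices;
};

// One combined vertex array and one combined index array.
struct Batch {
    std::vector<Vertex> vertices;
    std::vector<int> indices;
};

// Append each model's vertices to the batch and rebase its indices so
// they still point at the right vertices inside the combined array.
Batch build_batch(const std::vector<Model>& models) {
    Batch batch;
    for (const Model& m : models) {
        const int base = static_cast<int>(batch.vertices.size());
        batch.vertices.insert(batch.vertices.end(), m.vertices.begin(), m.vertices.end());
        for (int i : m.indices) {
            batch.indices.push_back(base + i);
        }
    }
    return batch;
}
```

The combined arrays could then go to one al_draw_indexed_prim call, or into a vertex buffer, instead of one call per model. Note that a single call shares one transform, so per-model transforms would have to be applied to the vertices before merging.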
I can only use a vertex buffer in OpenGL mode on Vista. No matter how I create it in D3D, it returns 0. And even in OpenGL mode, performance isn't much better than al_draw_prim, batched or not.
That's interesting. Do you have a reasonably modern GPU?
But either way, it's good that it's not worse in OpenGL, otherwise I'd have to eat my earlier post.
It's a crappy integrated ATI Radeon X1270 on my Gateway Vista laptop.
this is ex_vertex_buffer.c on my computer.
As is (Direct3D):
[screenshot 609136]
with al_set_new_display_flags(ALLEGRO_OPENGL);
[screenshot 609135]
[screenshot 609137]
... don't know what these numbers represent, but there they are.
Wow, a second person with non-functioning D3D vertex buffers. Is your GPU pretty bad too?
Either way, those numbers are frames per second (FPS) in scientific notation. The top circle is al_draw_prim and the rest are vertex buffers. Dynamic means you're changing the buffer every frame, and static means you aren't. It looks like vertex buffers are 30% to 50% faster on your system.
Incidentally,
al_set_new_display_flags(ALLEGRO_OPENGL);
I prefer to use allegro5.cfg to switch backends for the examples.
That one was an Intel HD Graphics 4000; it's the onboard graphics chip on my laptop. It's not a bad chip.
I also have an external dock with a "much more powerful" AMD Radeon HD 7670M, but looking at the numbers I don't see much of a change:
D3D:
[screenshot 609140]
with OpenGL:
[screenshot 609138]
[screenshot 609139]
I've added some more logging to the D3D side of things in the primitives addon. Could either of you get the latest 5.1 branch, compile in Debug mode and check out the allegro.log?
I prefer to use allegro5.cfg to switch backends for the examples.
Is allegro5.cfg documented anywhere? What the various options are, what the syntax is, so on...
I've added some more logging to the D3D side of things in the primitives addon. Could either of you get the latest 5.1 branch, compile in Debug mode and check out the allegro.log?
Here's my allegro.log file.
The relevant parts (I think) are:
d3d_primitives W C:\mingw\LIBS\A5GIT\allegro\addons\primitives\prim_directx.cpp:60 is_legacy_card [ 0.27891] Your GPU is considered legacy! Some of the features of the primitives addon will be slower/disabled.
d3d_primitives W C:\mingw\LIBS\A5GIT\allegro\addons\primitives\prim_directx.cpp:1022 _al_create_vertex_buffer_directx [ 0.27896] Cannot create vertex buffer for a legacy card.
d3d_primitives W C:\mingw\LIBS\A5GIT\allegro\addons\primitives\prim_directx.cpp:1022 _al_create_vertex_buffer_directx [ 0.27898] Cannot create vertex buffer for a legacy card.
d3d_primitives W C:\mingw\LIBS\A5GIT\allegro\addons\primitives\prim_directx.cpp:1022 _al_create_vertex_buffer_directx [ 0.27900] Cannot create vertex buffer for a legacy card.
d3d_primitives W C:\mingw\LIBS\A5GIT\allegro\addons\primitives\prim_directx.cpp:1022 _al_create_vertex_buffer_directx [ 0.27902] Cannot create vertex buffer for a legacy card.
d3d_primitives W C:\mingw\LIBS\A5GIT\allegro\addons\primitives\prim_directx.cpp:1022 _al_create_vertex_buffer_directx [ 0.27903] Cannot create vertex buffer for a legacy card.
d3d_primitives W C:\mingw\LIBS\A5GIT\allegro\addons\primitives\prim_directx.cpp:1022 _al_create_vertex_buffer_directx [ 0.27905] Cannot create vertex buffer for a legacy card.
d3d_primitives W C:\mingw\LIBS\A5GIT\allegro\addons\primitives\prim_directx.cpp:1069 _al_create_index_buffer_directx [ 0.28117] Cannot create index buffer for a legacy card.
I've had the legacy card check go wrong on me before with shaders. If you return false from that function does it work?
To check, I ran it on my desktop with an ATI Radeon 4600 series graphics card, and it created the vertex buffers in both dynamic and static mode. I get about 1100 FPS with a dynamic buffer and about 2300 with a static one. (Drawing dynamically with no buffer gets about 1300 FPS, which seems backwards.) However, allegro.log still says my card is legacy, which it shouldn't be. It's a fairly decent GPU from about 5 years ago. The allegro.log file is attached again.
Is allegro5.cfg documented anywhere? What the various options are, what the syntax is, so on...
There's a prototypical copy inside the root directory of the source distributions. Here it is online: https://github.com/liballeg/allegro5/blob/5.1/allegro5.cfg . I'm adding documentation about this next to al_get_system_config, but I think we can do something better with this... perhaps generate it automatically somehow.
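For the backend switch mentioned earlier in the thread, a minimal sketch of an allegro5.cfg placed next to the executable might look like the fragment below. The section and key names are an assumption based on the sample file linked above, so check that file for the authoritative names and values.

```ini
[graphics]
# Assumed values: default, opengl, or direct3d (Windows only).
driver=opengl
```

This would stand in for calling al_set_new_display_flags(ALLEGRO_OPENGL) in the example source, so the examples can be switched between backends without recompiling.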