[5.0rc2] OpenGL/Deferred drawing CPU use
Bullet

I might be misusing deferred drawing here, but when I use it with OpenGL the CPU usage goes to about 50%; without deferred drawing it only uses a few percent. It works fine with DirectX.

#include <allegro5/allegro.h>

int main()
{
    if(!al_init())
        return 1;

    if(!al_install_keyboard())
        return 1;

    al_set_new_display_option(ALLEGRO_VSYNC, 1, ALLEGRO_SUGGEST);
    al_set_new_display_flags(ALLEGRO_OPENGL);

    ALLEGRO_DISPLAY* display = al_create_display(200, 200);

    if(!display)
        return 1;

    ALLEGRO_EVENT_QUEUE* queue = al_create_event_queue();

    if(!queue)
    {
        al_destroy_display(display);
        return 1;
    }

    al_register_event_source(queue, al_get_keyboard_event_source());
    al_register_event_source(queue, al_get_display_event_source(display));

    ALLEGRO_BITMAP* image = al_create_bitmap(20, 20);

    if(!image)
    {
        al_destroy_event_queue(queue);
        al_destroy_display(display);
        return 1;
    }

    al_set_target_bitmap(image);
    al_clear_to_color(al_map_rgb(255, 255, 255));
    al_set_target_bitmap(al_get_backbuffer(display));

    bool IsRunning = true;
    while(IsRunning)
    {
        ALLEGRO_EVENT Event;

        while(al_get_next_event(queue, &Event))
        {
            if(Event.type == ALLEGRO_EVENT_DISPLAY_CLOSE)
            {
                IsRunning = false;
            }
            else if(Event.type == ALLEGRO_EVENT_KEY_UP &&
                    Event.keyboard.keycode == ALLEGRO_KEY_ESCAPE)
            {
                IsRunning = false;
            }
        }

        al_clear_to_color(al_map_rgb(0, 0, 0));

        al_hold_bitmap_drawing(true);
        for(unsigned i = 0; i < 9; i++)
        {
            float X = i * 22.0f;

            for(unsigned j = 0; j < 9; j++)
            {
                float Y = j * 22.0f;
                al_draw_bitmap(image, X, Y, 0);
            }
        }
        al_hold_bitmap_drawing(false);

        al_flip_display();
    }

    al_destroy_bitmap(image);
    al_destroy_event_queue(queue);
    al_destroy_display(display);

    return 0;
}

Elias

I can't seem to reproduce this here. This is the oprofile output of your example with held drawing:

13268    60.7120  libnvidia-glcore.so.260.19.06 /usr/lib/nvidia-current/libnvidia-glcore.so.260.19.06
1082      4.9510  libGL.so.260.19.06       /usr/lib/nvidia-current/libGL.so.260.19.06
1054      4.8229  liballegro-debug.so.5.1.0 draw_quad
575       2.6311  ld-2.12.1.so             __tls_get_addr
437       1.9996  liballegro-debug.so.5.1.0 tls_get
345       1.5787  liballegro-debug.so.5.1.0 _draw_tinted_rotated_scaled_bitmap_region
300       1.3727  liballegro-debug.so.5.1.0 al_compose_transform
259       1.1851  liballegro-debug.so.5.1.0 al_identity_transform
250       1.1440  libc-2.12.1.so           memcpy
230       1.0524  liballegro-debug.so.5.1.0 al_rotate_transform
230       1.0524  liballegro-debug.so.5.1.0 al_transform_coordinates

This is without:

9394     59.3618  libnvidia-glcore.so.260.19.06 /usr/lib/nvidia-current/libnvidia-glcore.so.260.19.06
749       4.7330  libGL.so.260.19.06       /usr/lib/nvidia-current/libGL.so.260.19.06
458       2.8942  liballegro-debug.so.5.1.0 draw_quad
454       2.8689  libc-2.12.1.so           fgetc
405       2.5592  libX11.so.6.3.0          /usr/lib/libX11.so.6.3.0
229       1.4471  ld-2.12.1.so             __tls_get_addr
209       1.3207  libc-2.12.1.so           __GI___strcmp_ssse3
169       1.0679  liballegro-debug.so.5.1.0 _draw_tinted_rotated_scaled_bitmap_region
160       1.0111  libc-2.12.1.so           memcpy

One difference you can see is that with held drawing, we do transformations in software, so for each bitmap there are calls to the transformation functions - so they show up near the top. Without held drawing, transformations are done on the GPU, so they don't show up.
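To make that difference concrete, here is a minimal sketch of what "transformations in software" can look like when quads are batched. It is not Allegro's actual code: the names Affine2D, QuadBatch, batch_push_quad and batch_flush are hypothetical, and the flush only counts simulated submissions instead of talking to a GPU. The point is that every held bitmap costs four CPU-side vertex transforms, which is why the transform functions appear in the first profile.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2D affine transform (not Allegro's ALLEGRO_TRANSFORM):
 *   x' = a*x + c*y + tx
 *   y' = b*x + d*y + ty */
typedef struct { float a, b, c, d, tx, ty; } Affine2D;
typedef struct { float x, y; } Vertex;

enum { BATCH_MAX_VERTS = 4 * 128 };   /* room for 128 quads */

typedef struct {
    Vertex verts[BATCH_MAX_VERTS];
    size_t count;     /* vertices currently queued */
    size_t flushes;   /* number of simulated submissions */
} QuadBatch;

/* Pure translation, as used by a plain "draw bitmap at (x, y)" call. */
static Affine2D affine_translate(float tx, float ty)
{
    Affine2D t = { 1.0f, 0.0f, 0.0f, 1.0f, tx, ty };
    return t;
}

/* Transform one point on the CPU. */
static Vertex affine_apply(const Affine2D *t, float x, float y)
{
    Vertex v;
    v.x = t->a * x + t->c * y + t->tx;
    v.y = t->b * x + t->d * y + t->ty;
    return v;
}

/* Stand-in for the single draw call issued when drawing is released. */
static void batch_flush(QuadBatch *b)
{
    if (b->count > 0) {
        b->flushes++;
        b->count = 0;
    }
}

/* Queue one w-by-h quad: its four corners are transformed in software
 * and appended to the batch; nothing is submitted yet. */
static void batch_push_quad(QuadBatch *b, const Affine2D *t, float w, float h)
{
    const float corners[4][2] = { {0, 0}, {w, 0}, {w, h}, {0, h} };
    size_t i;

    if (b->count + 4 > BATCH_MAX_VERTS)
        batch_flush(b);               /* batch full: submit early */
    assert(b->count + 4 <= BATCH_MAX_VERTS);

    for (i = 0; i < 4; i++)
        b->verts[b->count++] = affine_apply(t, corners[i][0], corners[i][1]);
}
```

With the 9x9 grid from the example program, this would queue 81 quads (324 transformed vertices) and submit them in one flush, instead of 81 separate draws that leave the transform to the GPU.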

However I can't see this causing 50% CPU... in my case it makes no difference on the end result. It also would show up with D3D.

It would be interesting to see profiling output, but I don't think you can get that in Windows. Maybe there's something in allegro.log which can give a hint, though, so if you can compile it with the debug version of Allegro and attach the allegro.log file, it might help find the problem.

Bullet

Log attached.

Elias

Hm, nothing out of the ordinary. How many cores do you have? I assume 50% CPU means half of them are spin-locking. Maybe some threading issue with our wgl implementation. Not sure how to look into it without being able to reproduce.

Can someone else reproduce this?

Bullet

Yes, the CPU has two cores.

EDIT: Neither core is completely saturated; one is at about 75% and the other at about 25%.

Elias

That could just mean that the kernel is switching load between the cores.

I wonder if it has something to do with tls_get (which shows up as 2% in my profiling results with held drawing) - I don't see why the OpenGL and D3D drivers would call it a different number of times though. Let's wait and see if someone else can reproduce it. I'll also try it on my netbook.

Thread #605850. Printed from Allegro.cc