MacOSX 10.9 al_draw_scaled_bitmap - very slow
zdo.str
Member #15,582
April 2014

I'm using Mac OS X 10.9 on a late-2011 MacBook Pro (Intel Core i7 2.2 GHz, 16 GB RAM, AMD Radeon HD 6750M 512 MB), so I don't think the problem described below is a hardware performance problem.

Every frame (30 FPS) I draw a 2D terrain map. All bitmaps are video bitmaps, 32x32 pixels each. The map is about 200x200 cells, so each frame draws 200x200 images of 32x32 pixels. This is very slow and causes 100% CPU usage.

There is a profiler screenshot in the attachment: al_profile.png

'hestur' in that screenshot is the name of my application. Before the calls to al_draw_scaled_bitmap, the code recursively walks a tree of nodes and draws each of them, but walking the tree costs far less than the drawing itself.

This is how drawing is performed:

//---- texture.h file

struct texture {
        int retain_count;
        char *path;

        ALLEGRO_BITMAP *bitmap;
        struct size size;

        struct list_head list;
};

//---- drawing routine (draw bitmap at specified center and with specified size)
//---- this code runs for each cell of the 200x200 2D map, every frame.

                struct texture *t = n1->tex;

                int x = -n1->draw_size.w * n1->center.x;
                int y = -n1->draw_size.h * n1->center.y;
                int w = n1->draw_size.w;
                int h = n1->draw_size.h;

                //xmsg("%d\n", al_get_bitmap_flags(t->bitmap));
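                // The last argument takes draw flags (e.g. ALLEGRO_FLIP_HORIZONTAL),
                // not bitmap flags such as ALLEGRO_VIDEO_BITMAP, so pass 0 here.
                al_draw_scaled_bitmap(t->bitmap,
                                0, 0, t->size.w, t->size.h,
                                x, y, w, h,
                                0);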

If I uncomment xmsg(...), the printed flags value is 1056, which means the texture is an OpenGL object stored in video memory, not in main memory.
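
For reference, the value 1056 can be decoded by hand. This is a minimal sketch in plain C, assuming the numeric flag values below match the installed Allegro 5 bitmap header (they should be checked against it). The FLAG_* names and is_video_bitmap() are hypothetical helpers, not part of Allegro.

```c
#include <stdbool.h>

/* Assumed numeric values of Allegro 5 bitmap flags.
 * Verify them against <allegro5/bitmap.h> before relying on them. */
enum {
    FLAG_MEMORY_BITMAP   = 0x0001, /* assumed: ALLEGRO_MEMORY_BITMAP */
    FLAG_INTERNAL_OPENGL = 0x0020, /* assumed: _ALLEGRO_INTERNAL_OPENGL (internal flag) */
    FLAG_VIDEO_BITMAP    = 0x0400  /* assumed: ALLEGRO_VIDEO_BITMAP */
};

/* Returns true when the flags describe a video (GPU-side) bitmap:
 * the video bit is set and the memory bit is clear. */
static bool is_video_bitmap(int flags)
{
    return (flags & FLAG_VIDEO_BITMAP) != 0
        && (flags & FLAG_MEMORY_BITMAP) == 0;
}
```

Under these assumptions, 1056 is 0x0400 + 0x0020, so is_video_bitmap(1056) is true, which agrees with the reading above.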

Can anyone help me solve this issue and determine where the bottleneck is?

P.S.
There was a similar thread, https://www.allegro.cc/forums/thread/612832 , but it has no solution.

///// UPDATE after 30 min

I did some research: I wrote two applications, one using SDL2 and one using Allegro.
Both perform the same task: drawing a 200x200 map with a 50x50 pixel texture.

Source for SDL:

#include <SDL2/SDL.h>
#include <math.h>

int main()
{
        SDL_Init(SDL_INIT_EVERYTHING);
        SDL_Window *wnd = SDL_CreateWindow(
                "My SDL Game",
                SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED,
                800, 600, SDL_WINDOW_SHOWN);

        SDL_Renderer *ren = SDL_CreateRenderer(wnd, -1,
                SDL_RENDERER_ACCELERATED | SDL_RENDERER_PRESENTVSYNC);

        SDL_Surface *bmp = SDL_LoadBMP("test.bmp");
        if (bmp == NULL){
                return 1;
        }

        SDL_Texture *tex = SDL_CreateTextureFromSurface(ren, bmp);
        SDL_FreeSurface(bmp);

        int f = 0;

        SDL_Event e;
        int quit = 0;
        while (!quit){
                ++f;

                while (SDL_PollEvent(&e)){
                        if (e.type == SDL_QUIT)
                                quit = 1;
                        if (e.type == SDL_KEYDOWN)
                                quit = 1;
                        if (e.type == SDL_MOUSEBUTTONDOWN)
                                quit = 1;
                }

                SDL_RenderClear(ren);

                int w = 32;
                int h = 32;

                for (int x = 0; x < 400; ++x) {
                        for (int y = 0; y < 400; ++y) {
                                SDL_Rect dst;
                                dst.x = x*w;
                                dst.y = y*h + sin(f / 30.0) * 50;
                                dst.w = w;
                                dst.h = h;
                                SDL_RenderCopy(ren, tex, NULL, &dst);
                        }
                }

                SDL_RenderPresent(ren);
        }

        SDL_DestroyTexture(tex);
        SDL_DestroyRenderer(ren);
        SDL_DestroyWindow(wnd);
        SDL_Quit();
}

Source for allegro:

#include <allegro5/allegro.h>
#include <allegro5/allegro_image.h>
#include <math.h>

int main()
{
        al_init();
        al_set_blender(ALLEGRO_ADD, ALLEGRO_ALPHA, ALLEGRO_INVERSE_ALPHA);
        al_init_image_addon();
        al_install_keyboard();
        al_install_mouse();

        al_set_new_display_flags(ALLEGRO_WINDOWED);
        al_set_new_display_option(ALLEGRO_VSYNC, 1, ALLEGRO_SUGGEST);

        ALLEGRO_DISPLAY *display = al_create_display(800, 600);

        ALLEGRO_EVENT_QUEUE *event_queue = al_create_event_queue();
        al_register_event_source(event_queue, al_get_display_event_source(display));
        al_register_event_source(event_queue, al_get_keyboard_event_source());
        al_register_event_source(event_queue, al_get_mouse_event_source());

        al_set_new_bitmap_flags(ALLEGRO_VIDEO_BITMAP);

        ALLEGRO_EVENT ev;
        ALLEGRO_TIMEOUT timeout;
        al_init_timeout(&timeout, 1.0 / 30.0);

        ALLEGRO_BITMAP *bmp = al_load_bitmap("test.bmp");

        int f = 0;

        while (1) {
                ++f;

                al_wait_for_event_until(event_queue, &ev, &timeout);
                al_clear_to_color(al_map_rgb(0, 255, 0));

        //      al_hold_bitmap_drawing(1);
                for (int x = 0; x < 200; ++x) {
                        for (int y = 0; y < 200; ++y) {
                                al_draw_scaled_bitmap(bmp, 0, 0, 50, 50,
                                        x*32, y*32 + sin(f/20.0)*15, 32, 32,
                                        0);
                        }
                }
        //      al_hold_bitmap_drawing(0);

                al_flip_display();
        }

        return 0;
}

CPU usage on the configuration described above:
SDL: 30% CPU and very fast.
Allegro: 100% CPU and very slow.

I don't know where the bottleneck in Allegro is, but I think it is in determining which triangles should be drawn and which can be thrown away.

Can somebody explain why Allegro uses so much CPU?

Kris Asick
Member #1,424
July 2001

A few things of note:

1. al_set_blender() functions on the current target bitmap, so you don't want to call it until after you've set your target bitmap for drawing.

2. You should stick to powers of 2 for ALL primary texture sizes for sake of compatibility. (16, 32, 64, 128, 256, etc.) Only use non-power of 2 sizes with sub-textures.

3. The way you've coded your timing routine, instead of aiming to run at 30 FPS all the time, it runs at 30 FPS only when NOTHING is happening. Any keyboard key you press and any mouse movement you make will cause the framerate to spike, plus anything that happens following your event timeout will reduce you below the target 30 FPS, since you're essentially just adding a 0.0333 second delay to every frame.

4. 100% CPU usage can be caused by a number of factors, including GPU driver optimizations and power utilization settings for laptops. You can't go by CPU usage alone to determine that there's a problem or that the system's being overtaxed. My current project runs close to 20% CPU usage if my graphics drivers are set to perform "Threaded Optimization", whereas if I turn that feature of my GPU off, the CPU usage of my project is so low it doesn't register. ;)

5. Uncomment your calls to al_hold_bitmap_drawing(); you're using them right and they WILL help your framerate! (There are very few instances where you don't want to use this command!)

Now, with all that said, here's something to keep in mind about doing 2D graphics: You can only get away with rendering about 10,000 sprites at a time on mid-range hardware using Allegro without killing the framerate. You're not only trying to render 40,000 sprites at a time but you're also delaying your program code by 0.0333 seconds every time you finish rendering those sprites. So if it takes 0.0333 seconds to render everything, your framerate won't be 30 FPS, it'll be 15 FPS!
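
The arithmetic behind that last point can be sketched as a small helper. effective_fps() is a hypothetical name, and the model assumes that rendering and the fixed wait happen strictly one after the other in each frame.

```c
/* Frames per second when every frame costs render_s seconds of drawing
 * followed by wait_s seconds of fixed waiting.
 * Returns 0 if the total frame time is not positive. */
static double effective_fps(double render_s, double wait_s)
{
    double total = render_s + wait_s;
    return total > 0.0 ? 1.0 / total : 0.0;
}
```

With render_s and wait_s both set to 1.0/30.0 the result is 15, and with a render time close to zero it approaches 30.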

If you need to render more than 10,000 sprites per frame, you need to look into using Allegro's primitives system, which can allow you to draw many thousands of objects stored in an array with a single call to al_draw_prim(). (Don't try calling al_draw_prim() itself thousands of times though, you'll kill your framerate even worse!)
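
A rough sketch of the array-building half of that approach, in plain C so it can be tried without Allegro. TileVertex is a hypothetical stand-in for ALLEGRO_VERTEX (the real struct also carries a color), and push_tile() and build_tile_batch() are illustrative names. It assumes texture coordinates are given in pixels and that every tile samples the whole texture.

```c
#include <stddef.h>

/* Hypothetical stand-in for ALLEGRO_VERTEX, without the color member. */
typedef struct {
    float x, y, z;  /* destination position on screen */
    float u, v;     /* texture coordinates, in pixels */
} TileVertex;

enum { VERTS_PER_TILE = 6 };  /* two triangles per tile */

/* Writes two triangles covering a dw x dh rectangle at (dx, dy) that
 * sample the full tex_w x tex_h texture. Returns the next free slot. */
static TileVertex *push_tile(TileVertex *out, float dx, float dy,
                             float dw, float dh, float tex_w, float tex_h)
{
    const TileVertex corner[4] = {
        { dx,      dy,      0.0f, 0.0f,  0.0f  },  /* top-left */
        { dx + dw, dy,      0.0f, tex_w, 0.0f  },  /* top-right */
        { dx + dw, dy + dh, 0.0f, tex_w, tex_h },  /* bottom-right */
        { dx,      dy + dh, 0.0f, 0.0f,  tex_h },  /* bottom-left */
    };
    out[0] = corner[0]; out[1] = corner[1]; out[2] = corner[2];
    out[3] = corner[0]; out[4] = corner[2]; out[5] = corner[3];
    return out + VERTS_PER_TILE;
}

/* Fills out with cols * rows square tiles, row by row, and returns the
 * number of vertices written. The caller must provide room for
 * cols * rows * VERTS_PER_TILE vertices. */
static size_t build_tile_batch(TileVertex *out, int cols, int rows,
                               float tile_size, float tex_size)
{
    TileVertex *p = out;
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            p = push_tile(p, x * tile_size, y * tile_size,
                          tile_size, tile_size, tex_size, tex_size);
        }
    }
    return (size_t)(p - out);
}
```

In a real program the array would hold ALLEGRO_VERTEX entries with a color filled in, and the whole batch would go to the GPU with one al_draw_prim() call using ALLEGRO_PRIM_TRIANGLE_LIST and the tile bitmap as the texture. That requires the primitives addon, and the exact setup is left out of this sketch.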

I don't know enough about SDL to know why it works so much faster, but my best guess looking at the code is that it has very little overhead and thus can run into lots of potential problems depending on whatever else you're doing in the frame to adjust how rendering occurs. Allegro does a lot of stuff behind the scenes so that it's much harder to write code that doesn't work at all or gets screwed up by other calls.

But yeah, try not to exceed 10,000 textured sprites per frame when making a 2D game. (20,000 if you intend to force 30 FPS instead of allowing 60 FPS.) Otherwise, you're gambling on how well optimized the end-user's system is and what it can handle.

--- Kris Asick (Gemini)
--- http://www.pixelships.com

Edgar Reynaldo
Major Reynaldo
May 2007

Please put your code in <code>code goes here...</code> tags. It makes it much easier to read.

You're not comparing SDL and Allegro in a useful way yet. You need to make the maps the same size, with the same tile size, and render them at the same position. The high CPU is probably because you're not holding drawing. If you don't hold drawing, then each and every draw call will use the CPU to upload the data to the GPU, which comes with overhead and slows the whole process down.

Also, you never want to render an entire tilemap, but only the visible portion. You would never render a row of 200 32-pixel-wide tiles. Even at 1280x800 (my laptop res) that is only 40x25 = 1000 32x32 tiles. Not 40,000 like in your example.
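
One way to find the visible portion is to compute, per axis, which tile indices overlap the screen. This is a minimal sketch: TileRange, clamp_int() and visible_tiles() are hypothetical names, and it assumes square pixel units, a positive tile size, and a camera position given as the map pixel shown at the screen's left (or top) edge.

```c
/* Half-open range of tile indices: first is inclusive, last is exclusive. */
typedef struct {
    int first;
    int last;
} TileRange;

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Tiles along one axis that overlap the pixel span [cam, cam + view).
 * tile is the tile size in pixels (must be > 0) and map_tiles is the
 * number of tiles in the map along this axis. */
static TileRange visible_tiles(int cam, int view, int tile, int map_tiles)
{
    TileRange r;
    r.first = clamp_int(cam / tile, 0, map_tiles);
    /* Round up so a partially visible tile at the far edge is included. */
    r.last = clamp_int((cam + view + tile - 1) / tile, 0, map_tiles);
    return r;
}
```

For a 200x200 map of 32-pixel tiles on a 1280x800 screen with the camera at the origin, this gives 40 columns and 25 rows, which is the 1000 tiles mentioned above. The drawing loops would then run from first to last on each axis instead of over the whole map.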

I tried to make a better test program, and here's the code :

#include <allegro5/allegro.h>
#include <allegro5/allegro_image.h>
#include <cstring>
#include <cstdio>
#include <cstdlib>   // atoi
#include <cmath>

int SCREEN_WIDTH = 640;
int SCREEN_HEIGHT = 640;
int TILE_WIDTH = 32;
int TILE_HEIGHT = 32;
int MAP_NUM_TILES_WIDE = 20;
int MAP_NUM_TILES_TALL = 20;

int FPS = 60;

double start_time = 0.0;
double frame_time = 0.0;
double total_frame_time = 0.0;

double timer_rate = 1.0/FPS;

int main(int argc , char** argv) {

    if (argc != 4 && argc != 1) {
        printf("Usage : \n");
        printf("MapRenderTest.exe\n");
        printf("MapRenderTest.exe SCREEN_WIDTH , SCREEN_HEIGHT , TARGET_FPS\n");
        return 0;
    }
    if (argc == 4) {
        SCREEN_WIDTH = atoi(argv[1]);
        if (SCREEN_WIDTH < TILE_WIDTH) {SCREEN_WIDTH = TILE_WIDTH;}
        SCREEN_HEIGHT = atoi(argv[2]);
        if (SCREEN_HEIGHT < TILE_HEIGHT) {SCREEN_HEIGHT = TILE_HEIGHT;}
        FPS = atoi(argv[3]);
        if (FPS < 1) {FPS = 1;}
        timer_rate = 1.0/FPS;   // keep the render timer in step with TARGET_FPS
    }

    printf("Num tiles to render = %d\n" , MAP_NUM_TILES_WIDE*MAP_NUM_TILES_TALL);

    al_init();
//  al_set_blender(ALLEGRO_ADD, ALLEGRO_ALPHA, ALLEGRO_INVERSE_ALPHA);
    al_init_image_addon();
    al_install_keyboard();
    al_install_mouse();

    // Argument order is (option, value, importance); value 2 asks for vsync off.
    al_set_new_display_option(ALLEGRO_VSYNC , 2 , ALLEGRO_SUGGEST);
    al_set_new_display_flags(ALLEGRO_WINDOWED);

    ALLEGRO_DISPLAY* display = al_create_display(SCREEN_WIDTH , SCREEN_HEIGHT);
    if (!display) {
        printf("Could not create display.\n");
        return 1;
    }
    printf("Allegro reports vsync is %d\n" , al_get_display_option(display , ALLEGRO_VSYNC));

    ALLEGRO_TIMER* render_timer = al_create_timer(timer_rate);
    ALLEGRO_TIMER* sec_timer = al_create_timer(1.0);

    ALLEGRO_EVENT_QUEUE* queue = al_create_event_queue();
    al_register_event_source(queue, al_get_display_event_source(display));
    al_register_event_source(queue, al_get_keyboard_event_source());
    al_register_event_source(queue, al_get_mouse_event_source());
    al_register_event_source(queue , al_get_timer_event_source(render_timer));
    al_register_event_source(queue , al_get_timer_event_source(sec_timer));

    al_set_new_bitmap_flags(ALLEGRO_VIDEO_BITMAP);
    ALLEGRO_BITMAP *bmp = al_load_bitmap("test.bmp");
    if (!bmp) {
        printf("Could not load test.bmp.\n");
        return 1;
    }

    TILE_WIDTH = al_get_bitmap_width(bmp);
    TILE_HEIGHT = al_get_bitmap_height(bmp);
    MAP_NUM_TILES_WIDE = (int)ceil((double)SCREEN_WIDTH/TILE_WIDTH);
    MAP_NUM_TILES_TALL = (int)ceil((double)SCREEN_HEIGHT/TILE_HEIGHT);

    int num_frames = 0;
    int total_frames = 0;

    bool quit = false;
    bool redraw = true;
    bool wait_for_timer = true;

    al_start_timer(render_timer);
    al_start_timer(sec_timer);

    bool key_up_held = false;
    bool key_down_held = false;

    while (!quit) {

        do {
            ALLEGRO_EVENT ev;

            if (wait_for_timer) {
                al_wait_for_event(queue , &ev);
            }
            else {
                if (!al_get_next_event(queue , &ev)) {break;}
            }
            if (ev.type == ALLEGRO_EVENT_KEY_DOWN && ev.keyboard.keycode == ALLEGRO_KEY_ESCAPE) {
                quit = true;
            }
            if (ev.type == ALLEGRO_EVENT_KEY_DOWN) {
                if (ev.keyboard.keycode == ALLEGRO_KEY_ESCAPE) {quit = true;}
                if (ev.keyboard.keycode == ALLEGRO_KEY_G) {
                    wait_for_timer = !wait_for_timer;
                    if (wait_for_timer) {
                        al_start_timer(render_timer);
                    }
                    else {
                        al_stop_timer(render_timer);
                    }
                }
                if (ev.keyboard.keycode == ALLEGRO_KEY_UP) {key_up_held = true; }
                if (ev.keyboard.keycode == ALLEGRO_KEY_DOWN) {key_down_held = true;}
            }
            if (ev.type == ALLEGRO_EVENT_KEY_UP) {
                if (ev.keyboard.keycode == ALLEGRO_KEY_UP) {key_up_held = false;}
                if (ev.keyboard.keycode == ALLEGRO_KEY_DOWN) {key_down_held = false;}
            }
            if (ev.type == ALLEGRO_EVENT_DISPLAY_CLOSE) {quit = true;}
            if (ev.type == ALLEGRO_EVENT_TIMER) {
                if (ev.timer.source == render_timer) {
                    if (wait_for_timer) {
                        redraw = true;
                    }
                }
                if (ev.timer.source == sec_timer) {
                    printf("Real FPS = %d , Average frametime = %f\n" , num_frames , total_frame_time/total_frames);
                    num_frames = 0;
                }
            }

        } while (!al_is_event_queue_empty(queue));

//      al_clear_to_color(al_map_rgb(0, 255, 0));

        if (redraw || !wait_for_timer) {
            redraw = false;   // wait for the next render_timer tick before drawing again
            if (key_up_held) {
                ++FPS;
                al_set_timer_speed(render_timer , 1.0/FPS);
                printf("Target FPS = %d\n" , FPS);
                total_frame_time = 0.0;
                total_frames = 0;
            }
            if (key_down_held) {
                --FPS;
                if (FPS < 1) {FPS = 1;}
                al_set_timer_speed(render_timer , 1.0/FPS);
                printf("Target FPS = %d\n" , FPS);
                total_frame_time = 0.0;
                total_frames = 0;
            }
            start_time = al_get_time();
            al_hold_bitmap_drawing(true);
            for (int y = 0; y < MAP_NUM_TILES_TALL ; ++y) {
                int desty = y*TILE_HEIGHT;
                for (int x = 0; x < MAP_NUM_TILES_WIDE ; ++x) {
                    int destx = x*TILE_WIDTH;
                    al_draw_bitmap(bmp, destx , desty , 0);
                }
            }
            al_hold_bitmap_drawing(false);
            al_flip_display();
            frame_time = al_get_time() - start_time;
            total_frame_time += frame_time;
            ++total_frames;
            ++num_frames;

//          printf("Frame took %f seconds.\n" , frame_time);

        }
    }
    return 0;
}

Here's a screenshot of task manager recording the program running at 1024x768 with a 32x32 tile bitmap.

[Screenshot 608517: task manager CPU usage while the program runs]

As you can see baseline cpu usage is about 10%, 60FPS is about 20% and full bore is about 40% cpu, which means it takes about 30% cpu on one core to run the program at 1024x768 without any pauses between screen draws. Full bore it was running at about 200+ FPS.

You should try my program on OSX and see what kind of numbers you get. When running the program, you need a "test.bmp" in the same directory, of any size less than the screen resolution.

Usage :
MapRenderTest.exe
MapRenderTest.exe SCREEN_WIDTH SCREEN_HEIGHT TARGET_FPS

You can adjust the target frames per second by using the up and down keys. The g key lets you toggle whether the drawing waits for the render timer or not.

The console will output the real FPS and the average time it takes to draw a frame.

Oddly enough, if you minimize the program FPS goes up to 600 and CPU goes way up. But that's just because I don't pause on lost focus.
