Tips for optimizing image loading.

My problem is that loading a file into video memory takes too long. I've timed it with an Allegro timer directly before and after the al_load_bitmap() call, and a 350x350 BMP can take over 25 ms. It's a visible stutter. I tried other file formats with similar results.

I'm trying to avoid loading screens, not waste resources, and use detailed art. Are there any ways I can optimize the loading of images so they don't interrupt the main game loop? Would threading work or would I still need to dedicate a single large chunk of time to loading the image that would stall the main loop?


You're limited by the bus speed between the CPU and graphics card. No matter how many threads you have, you still only have one bus, so they will not help.

You can, however, load only a portion of the image to video memory each frame. That will remove the stutter, but it might look a bit odd.


I wonder if you could use an interlaced image file for this. Probably doesn't play well with graphics memory. (I.e. you would get more stutter from access to gfx memory than disk access.)

Peter Wang

"You're limited by the bus speed between the CPU and graphics card."

For a 350x350 bmp taking over 25 ms? I doubt it. More likely it's the hard disk seek time.


Oh, yes, definitely that. Sorry, I wasn't paying precise attention. Load over multiple frames, then load to the GPU over multiple frames, if necessary.
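Spreading the copy over multiple frames can be sketched roughly like this. It is plain C++ rather than Allegro: ChunkedUpload and its members are hypothetical names, and the destination vector only stands in for a video bitmap. In Allegro 5 the per-frame step might instead draw one strip of a memory bitmap onto the video bitmap, but that mapping is an assumption and is not tested here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies an already-decoded image into a destination
// buffer a few rows at a time, so no single frame pays for the whole copy.
class ChunkedUpload {
public:
    ChunkedUpload(const std::vector<std::uint8_t>& src,
                  std::vector<std::uint8_t>& dst,
                  std::size_t rowBytes, std::size_t totalRows)
        : src_(src), dst_(dst), rowBytes_(rowBytes),
          totalRows_(totalRows), nextRow_(0) {
        // Both buffers must hold the full image.
        assert(src_.size() >= rowBytes_ * totalRows_);
        assert(dst_.size() >= rowBytes_ * totalRows_);
    }

    // True once every row has been copied.
    bool done() const { return nextRow_ >= totalRows_; }

    // Copy at most maxRows rows; call once per frame. Returns rows copied.
    std::size_t step(std::size_t maxRows) {
        const std::size_t rows = std::min(maxRows, totalRows_ - nextRow_);
        if (rows > 0) {
            const std::size_t offset = nextRow_ * rowBytes_;
            std::memcpy(dst_.data() + offset, src_.data() + offset,
                        rows * rowBytes_);
            nextRow_ += rows;
        }
        return rows;
    }

private:
    const std::vector<std::uint8_t>& src_;  // decoded pixels in system memory
    std::vector<std::uint8_t>& dst_;        // stand-in for the video bitmap
    std::size_t rowBytes_;                  // bytes per image row
    std::size_t totalRows_;
    std::size_t nextRow_;                   // first row not yet copied
};
```

In the main loop you would call step() once per frame and only start drawing the bitmap after done() returns true. With a 350-row image and 32 rows per frame, the copy finishes in 11 frames. The rows-per-frame value is something to tune by measurement, not a recommendation.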


Sounds like loading over several frames would do the trick. Everything is loaded off screen before it is needed and not used/drawn until the character is a certain distance away so it shouldn't look weird.

Are there any resources on how to accomplish loading a portion of the image to video memory? Didn't see anything relevant in the allegro manual and my google-fu isn't turning anything up.

As for whether it's hard disk seek time or a bus limitation, I would guess it's not seek time. I tried running the program and loading from an SSD, and it performed the same as a regular HD, maybe a bit better. Also, the larger the file, the greater the delay. If it were seek time, the load time wouldn't scale so strongly with file size.


I think threads could be applicable here, depending on how many images you need to load. As long as you start loading the bitmaps a little bit before you need them, you would be able to have one thread loading all the bitmaps off the disk while the game keeps chugging along with its logic and drawing. This way it would not block while waiting to get the data off the disk. I've not used threads much though, so I'm not sure if this really is the optimal solution or if there are things to watch out for with Allegro and threads. Of course threading it would be much more difficult too, so try what the others said first.
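Here is a rough sketch of the single loader thread idea, using only the C++ standard library. BackgroundLoader, LoadedFile, request() and poll() are hypothetical names, and the worker only reads raw file bytes. Decoding and the upload to video memory are left to the main thread, on the assumption that video bitmaps should be created on the thread that owns the display.

```cpp
#include <cassert>
#include <condition_variable>
#include <fstream>
#include <iterator>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Result handed back to the main loop.
struct LoadedFile {
    std::string path;
    std::vector<char> bytes;  // raw file contents; empty if the read failed
};

// Hypothetical helper: one worker thread reads files while the game runs.
class BackgroundLoader {
public:
    BackgroundLoader() : worker_([this] { run(); }) {}

    ~BackgroundLoader() {
        { std::lock_guard<std::mutex> lock(m_); quit_ = true; }
        cv_.notify_one();
        worker_.join();  // the worker finishes any pending requests first
    }

    // Queue a file for loading; returns immediately.
    void request(std::string path) {
        assert(!path.empty());
        { std::lock_guard<std::mutex> lock(m_); pending_.push(std::move(path)); }
        cv_.notify_one();
    }

    // Non-blocking: call once per frame. Returns true if a file was ready.
    bool poll(LoadedFile& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (finished_.empty()) return false;
        out = std::move(finished_.front());
        finished_.pop();
        return true;
    }

private:
    void run() {
        for (;;) {
            LoadedFile file;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return quit_ || !pending_.empty(); });
                if (pending_.empty()) return;  // quit requested, queue drained
                file.path = std::move(pending_.front());
                pending_.pop();
            }
            // The slow disk read happens here, outside the lock.
            std::ifstream in(file.path, std::ios::binary);
            file.bytes.assign(std::istreambuf_iterator<char>(in),
                              std::istreambuf_iterator<char>());
            std::lock_guard<std::mutex> lock(m_);
            finished_.push(std::move(file));
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> pending_;
    std::queue<LoadedFile> finished_;
    bool quit_ = false;
    std::thread worker_;  // declared last so the other members exist first
};
```

The main loop would call request() a little before a bitmap is needed and poll() once per frame. Nothing in the main loop blocks on the disk, so the only main-thread cost left is turning the bytes into a bitmap.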


You can use threads for loading the images from the hard disk into memory, but you will only gain performance if the images are stored on separate drives or on one drive with multiple heads (otherwise you will likely lose performance). Having one thread that does all the loading, however, is a good idea.

Multi-threading loads to the video card, however, is a bad idea, as I've already described.

Also, could you post some code? A 25 ms load time does not seem to fit with the input and hardware you're specifying.

#include <stdio.h>
#include <allegro5/allegro.h>
#include <allegro5/allegro_image.h>
#include <iostream>
#include <string.h>
#include <string>
#include <Windows.h>
#include <math.h>
#include <sstream>
#include <fstream>
using namespace std;

int main(int argc, char **argv)
{
    al_init(); // initializing allegro
    al_init_image_addon();

    ALLEGRO_DISPLAY *display = NULL; // display
    display = al_create_display(640, 480); // creating

    ALLEGRO_PATH *path = al_get_standard_path(ALLEGRO_RESOURCES_PATH); // declaring file path
    ALLEGRO_BITMAP *dummy1; // declaring variable
    ALLEGRO_TIMER *timer = NULL; // declaring timer
    timer = al_create_timer(1.0 / 1000); // 1 mS per count // setting timer

    al_start_timer(timer); // starting timer
    al_rest(1.0);

    cout << al_get_timer_count(timer) << '\n';
    dummy1 = al_load_bitmap("file_name.bmp");
    cout << al_get_timer_count(timer) << '\n';

    al_draw_scaled_bitmap(dummy1, 20, 20, 20, 20, 100, 100, 20, 20, 0);
    al_flip_display();

    al_rest(2.0);
    al_destroy_bitmap(dummy1);
    {
        cout << al_get_timer_count(timer) << '\n';
        dummy1 = al_load_bitmap("file_name.bmp");
        cout << al_get_timer_count(timer) << '\n';
    }

    al_clear_to_color(al_map_rgb(0,0,0));
    al_draw_scaled_bitmap(dummy1, 20, 20, 20, 20, 100, 100, 20, 20, 0);
    al_flip_display();

    al_rest(4.0);
    al_destroy_display(display);

    return 0;
}

Here's the example I made while troubleshooting to try to eliminate variables. It waits a second, checks the timer, loads a file, checks the timer again, then repeats. A 350x350 file of any type gives me roughly 25 ms. A 4x4 BMP loads in 9 ms, so that's my lower limit, which still seems a bit high. I'm using an i7 750, a GTX 295, and an Intel SSD. I tried it on my standard HD with similar results, maybe a bit worse, but I wasn't really paying attention.

BTW, I only draw it to verify that the image loaded properly. My guess is I'm doing something stupid with how I'm loading the file, but that's how every example I've seen does it.


Well, your profiling code isn't perfect, and should be more along these lines:

int64_t start = al_get_timer_count(timer);
dummy1 = al_load_bitmap("file_name.bmp");
int64_t end = al_get_timer_count(timer);
std::cout << end - start << std::endl;

That way potential flush times and iostream overhead aren't factored in. Try that and get back to me with results, as your results still suggest an inefficiency in al_load_bitmap. Either that, or your Allegro setup is off, but I'm not intimately familiar with the setup details.


It's the same result; my version just gives the absolute times and you do the subtraction in your head.

As for the setup, I used the tutorial in the A5 wiki. Seemed simple enough.


No, your version also adds in the time iostream takes to run, which is largely indeterminate. However, even with that minor incongruity, your results suggest the problem is with Allegro.


Why not load 100 bitmaps and divide the time by 100 to get a more accurate result?
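A minimal sketch of that averaging approach, using std::chrono instead of an Allegro timer. averageMs is a hypothetical helper name, and the callable passed in is whatever you want to measure.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `work` `iterations` times and returns the mean duration of one run
// in milliseconds. Only the total is timed, so clock resolution and
// per-call jitter are spread across all iterations.
double averageMs(const std::function<void()>& work, int iterations) {
    assert(iterations > 0);
    using Clock = std::chrono::steady_clock;

    const Clock::time_point start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const Clock::time_point end = Clock::now();

    const std::chrono::duration<double, std::milli> total = end - start;
    return total.count() / iterations;
}
```

For the case in this thread, the callable would load "file_name.bmp" with al_load_bitmap and destroy it again, and iterations would be 100. Note that the destroy call is then part of the measured time, and printing happens only once, after the loop.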

Dario ff

Doesn't Windows do some optimization of its own to load recently opened files faster? You would probably get very different results the first time you test it.

I've noticed this with a loader I wrote that reads nearly 7k files: the loading time would drop to about a quarter the second time I tried (closing and reopening the application).
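One way to check for that caching effect is to time each read of the same file separately instead of averaging them. The sketch below is plain C++; ReadTiming, timeRead and timeRepeatedReads are hypothetical names. It cannot force a cold cache on its own, so the first reading is only meaningful if the file has not been touched recently (for example, straight after a reboot).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Size and duration of one complete read of a file.
struct ReadTiming {
    std::size_t bytes;
    double ms;
};

// Reads the whole file once and reports how long that took.
ReadTiming timeRead(const std::string& path) {
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();

    std::ifstream in(path, std::ios::binary);
    const std::vector<char> data((std::istreambuf_iterator<char>(in)),
                                 std::istreambuf_iterator<char>());

    const std::chrono::duration<double, std::milli> elapsed =
        Clock::now() - start;
    return ReadTiming{data.size(), elapsed.count()};
}

// Times `repeats` consecutive reads of the same file. Element 0 is the
// first (possibly uncached) read; later elements are likely served from
// the operating system's file cache.
std::vector<ReadTiming> timeRepeatedReads(const std::string& path,
                                          int repeats) {
    assert(repeats > 0);
    std::vector<ReadTiming> results;
    results.reserve(static_cast<std::size_t>(repeats));
    for (int i = 0; i < repeats; ++i) {
        results.push_back(timeRead(path));
    }
    return results;
}
```

If the first entry is much larger than the rest, disk access dominates the first load. If all entries are about the same, the time is going somewhere else, such as decoding or the upload to video memory.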


Ah, didn't even think about iostream taking up time. However, if it is, it's very small, since the results were the same.

@ Jmaster. It's pretty accurate. I.e., loading 10 copies of the same image takes 10 times as long as a single image, +/- a few %, because the time for each individual load can vary by a ms or two. While I'm sure this method is no substitute for an atomic clock, it's producing the results one would expect given the symptoms. I know I'm really breaking the 16 ms barrier, because a file that times in at greater than 16 ms can cause a stutter in my game, which runs at 60 fps.

@ dario, in my test example I load the same file twice. If there is a difference, it's not very large at this scale. We're talking 25 ms for a small file. Aside from that, the goal is to load a file once without it holding up the main logic loop.

That just seems excessive for such small files, but then again I'm not an expert on such matters. I'm currently looking into a different, non-Allegro method to get a comparison.

Update: Enabling OpenGL and using PNGs got me down to 9-15 ms on the 350x350 and ~60 ms on the 1100x1100. So a decent improvement there. Reading the 350x350, writing it to an HDC, copying it, and then writing the copy to a BMP file on the HD takes 2 ms. A lot faster than I thought. It seems A5 got rid of the ability to load BMPs from an HDC though =/

Thread #609624.