Allegro.cc - Online Community

[a5] pixel function benchmarks
Mark Oates
Member #1,146
March 2001

Alright, I did some benchmarks on the pixel operations:

[Image 603154: chart of the pixel operation benchmarks]

Much to my surprise, the al_put_pixel() and al_get_pixel() functions are quite a bit faster than I imagined they would be. I had expected a hand-written function that writes directly into the memory of a locked region to be faster.

Basically, under no circumstances would you want to use them without first locking your bitmap, as you can see from the chart. As remarkably fast as the functions are when the bitmap is locked, they're equally remarkably slow when it is not. al_draw_pixel() is reasonable as long as you don't want to draw a lot of pixels.

The locking function itself can be kind of slow, however. On my system, locking about thirty 320x240 bitmaps in their native pixel format would by itself be enough to affect a 60fps frame rate, without anything else happening.

Also, there were no noticeable differences between ALLEGRO_LOCK_READONLY, ALLEGRO_LOCK_WRITEONLY, and ALLEGRO_LOCK_READWRITE.

write_pixel_argb_8888() is as follows:

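// Writes one pixel into a region locked as ALLEGRO_PIXEL_FORMAT_ARGB_8888.
inline void write_pixel_argb_8888(ALLEGRO_LOCKED_REGION *region, int x, int y, ALLEGRO_COLOR &col)
{
  uint32_t *ptr32;
  unsigned char r, g, b, a;
  al_unmap_rgba(col, &r, &g, &b, &a);
  // pitch is in bytes; dividing by 4 turns it into a stride in 32-bit pixels
  ptr32 = (uint32_t *)region->data + x + y*(region->pitch/4);
  // cast before shifting so that a << 24 is not evaluated as a signed int
  *ptr32 = ((uint32_t)a << 24) | ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}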

Edgar Reynaldo
Member #8,592
May 2007

The benchmarks might be more useful if we could see how many seconds per call each function took. Call al_get_time() before and after a set number of calls to each function (fewer calls for the slower functions). Then you can get calls/second through simple division. It would give an idea of how long a custom blit function would take.
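
A minimal sketch of that measurement, under some assumptions: std::chrono::steady_clock stands in for al_get_time() so it builds without Allegro, and seconds_per_call() and calls_per_second() are hypothetical helper names, not anything from the thread or the library.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `fn` `calls` times and returns the average
// wall-clock seconds per call. std::chrono::steady_clock stands in for
// al_get_time() so the sketch builds without Allegro.
template <typename Fn>
double seconds_per_call(Fn fn, std::size_t calls)
{
    using clock = std::chrono::steady_clock;
    if (calls == 0) return 0.0;
    const auto start = clock::now();
    for (std::size_t i = 0; i < calls; ++i) fn();
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count() / static_cast<double>(calls);
}

// Calls per second is the reciprocal of seconds per call.
inline double calls_per_second(double secs_per_call)
{
    return secs_per_call > 0.0 ? 1.0 / secs_per_call : 0.0;
}
```

Passing a lambda that wraps the pixel function under test, with a smaller call count for the slower functions, would give one figure per function.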

Mark Oates
Member #1,146
March 2001

With the method I was using, I timed the operations with al_get_time() and increased the number of loops (num_color_placements) until the bar maxed out at the 60fps-limit line.

  start_profile_timer("lock");
  if (lock_bitmap) region = al_lock_bitmap(image, ALLEGRO_PIXEL_FORMAT_ARGB_8888, lock_flags);
  stop_profile_timer("lock");

  start_profile_timer("function");
  for (int i=0; i<num_color_placements; i++)
  {
    write_pixel_argb_8888(region, 30, 30, base_color);
    //al_put_pixel(30, 30, base_color);
    //al_draw_pixel(30, 30, base_color);
    //al_get_pixel(image, 30, 30);
  }
  stop_profile_timer("function");

[Image 603157: screenshot of the profile timer bars]

If the bar is right on that light blue line, then all the calls together take exactly 1 frame at 60fps, or 1/60 of a second. The actual numbers jump around a bit, so it's difficult to get an exact seconds-per-call figure. The numbers on the right are seconds*10000.
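
That reading can be turned into a rough per-call estimate with a little arithmetic. The sketch below is only that arithmetic; the function names are hypothetical, and the result is only as accurate as the bar reading.

```cpp
// Duration of one frame at 60 fps, in seconds.
constexpr double kFrameSeconds = 1.0 / 60.0;

// Estimated seconds per call when `calls_per_frame` calls exactly fill
// one 60 fps frame (the bar sits on the limit line).
constexpr double estimated_seconds_per_call(long calls_per_frame)
{
    return calls_per_frame > 0 ? kFrameSeconds / calls_per_frame : 0.0;
}

// Converts a value from the chart's right-hand scale (seconds * 10000)
// back to seconds.
constexpr double chart_units_to_seconds(double units)
{
    return units / 10000.0;
}
```

For example, if about 30 locks fill one frame, as estimated earlier in the thread, estimated_seconds_per_call(30) comes out at roughly 0.00056 s per lock.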

Dario ff
Member #10,065
August 2008

Did you try reading directly from the lock's data instead of using al_get_pixel()?

Here's an example ML showed to me recently. If you're in the benchmarking mood, I'm curious how much faster it is.
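
For reference, a direct read could look something like the sketch below. It is not the example mentioned above, just a hypothetical read counterpart to write_pixel_argb_8888(). LockedRegion is a stand-in for the data and pitch fields of ALLEGRO_LOCKED_REGION so the code compiles without Allegro, and it assumes the region was locked as ALLEGRO_PIXEL_FORMAT_ARGB_8888.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for the two ALLEGRO_LOCKED_REGION fields used here, so the
// sketch compiles without Allegro. pitch is the row stride in bytes.
struct LockedRegion {
    void *data;
    int pitch;
};

// Hypothetical 8-bit-per-channel color used as the return type.
struct Rgba8 { std::uint8_t r, g, b, a; };

// Reads one pixel, assuming the region holds packed 32-bit ARGB values.
inline Rgba8 read_pixel_argb_8888(const LockedRegion &region, int x, int y)
{
    const unsigned char *row = static_cast<const unsigned char *>(region.data)
                             + static_cast<std::ptrdiff_t>(y) * region.pitch;
    std::uint32_t packed;
    std::memcpy(&packed, row + static_cast<std::ptrdiff_t>(x) * 4, sizeof packed);
    return Rgba8{
        static_cast<std::uint8_t>(packed >> 16),  // red
        static_cast<std::uint8_t>(packed >> 8),   // green
        static_cast<std::uint8_t>(packed),        // blue
        static_cast<std::uint8_t>(packed >> 24)   // alpha
    };
}
```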


Trent Gamblin
Member #261
April 2000

Reading and writing directly from the lock buffer is only useful in certain circumstances. The problem is that you have to lock in the same format the bitmap data is stored in (ALLEGRO_PIXEL_FORMAT_ANY). If you don't, Allegro will convert to the requested format when you lock and back again when you unlock, which nullifies the reason for manually reading/writing in the first place (speed). You could theoretically support multiple pixel formats, but then you're going to be writing a lot of code.

However, if you KNOW the format of a bitmap is something specific and always will be, writing a single code path to process the locked region directly is going to be faster than put/get pixel. Probably not by enough to make it worth it in most circumstances, but it definitely has its uses in time-critical code.
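
To give a feel for why supporting several formats means more code, here is a rough sketch that packs a color for just two layouts. The PixelFormat enum is a stand-in for Allegro's pixel format constants so the code compiles without Allegro, and the function names are hypothetical. Every extra format needs its own branch here, plus matching read and stride handling elsewhere.

```cpp
#include <cstdint>

// Stand-in for the Allegro pixel format constants, limited to two
// formats so the sketch compiles without Allegro.
enum class PixelFormat { ARGB_8888, RGB_565 };

// Bytes occupied by one pixel in each supported format.
inline int bytes_per_pixel(PixelFormat f)
{
    return f == PixelFormat::ARGB_8888 ? 4 : 2;
}

// Packs 8-bit channels into the integer layout of the given format.
inline std::uint32_t pack_pixel(PixelFormat f, std::uint8_t r, std::uint8_t g,
                                std::uint8_t b, std::uint8_t a)
{
    switch (f) {
    case PixelFormat::ARGB_8888:
        return (std::uint32_t{a} << 24) | (std::uint32_t{r} << 16) |
               (std::uint32_t{g} << 8) | std::uint32_t{b};
    case PixelFormat::RGB_565:
        // 5 bits red, 6 bits green, 5 bits blue; alpha is dropped.
        return ((std::uint32_t{r} >> 3) << 11) |
               ((std::uint32_t{g} >> 2) << 5) |
               (std::uint32_t{b} >> 3);
    }
    return 0;  // unreachable for the two formats above
}
```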

Matthew Leverton
Supreme Loser
January 1999

Quote:

y*(region->pitch/4);

I'm not sure if that's guaranteed to work. If pitch is not a multiple of four, then you'll lose the remainder.
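
One way to sidestep the question is to apply the pitch in bytes and only then step to the pixel. The sketch below compares the two offset calculations; the function names are hypothetical, and plain pointers are used instead of ALLEGRO_LOCKED_REGION so it compiles without Allegro.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Byte offset of pixel (x, y) with the row stride applied in bytes.
// pitch is signed, so a negative pitch also works.
inline std::ptrdiff_t byte_offset(int x, int y, int pitch)
{
    return static_cast<std::ptrdiff_t>(y) * pitch
         + static_cast<std::ptrdiff_t>(x) * 4;
}

// Byte offset produced by the quoted expression, x + y*(pitch/4),
// which counts in 32-bit units and truncates pitch/4.
inline std::ptrdiff_t byte_offset_word_pitch(int x, int y, int pitch)
{
    return (static_cast<std::ptrdiff_t>(x)
          + static_cast<std::ptrdiff_t>(y) * (pitch / 4)) * 4;
}

// Writes a packed 32-bit ARGB value at (x, y) using the byte-based offset.
inline void write_packed_argb(void *data, int pitch, int x, int y,
                              std::uint32_t argb)
{
    unsigned char *dst = static_cast<unsigned char *>(data)
                       + byte_offset(x, y, pitch);
    std::memcpy(dst, &argb, sizeof argb);
}
```

With a pitch of 1280 both calculations agree. With a pitch of 1282, row 3 starts at byte 3846, while the quoted expression lands on byte 3840.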
