Hello, I'm attempting to follow this tutorial to make a raycaster game like Wolfenstein 3D, but the tutorial uses the SDL library instead of Allegro, and I've run into some major performance hits in the second part of the tutorial, which involves textured walls. I converted his code to Allegro quite easily and everything works as intended; it's just that performance drops significantly when I'm looking at a wall and close to it (presumably because it has to run through the drawing loop more times). I've already searched around for potential solutions and I think I've narrowed it down to an issue with al_put_pixel(). I've tried following suggestions like locking bitmaps, and drawing to a separate buffer bitmap first and then drawing that to the back buffer, but none of them have worked. I don't know if that's because I'm implementing them wrong or what, but I'd appreciate some help and maybe specific suggestions for how to make the code more efficient. Thanks.
Here is the entirety of the code. It's a bit messy because I copied and pasted bits from the tutorial I linked above. If you would like clarification of something, please ask. Thanks for the help.
(I had to omit the definition of worldMap because this post is too long.)
Allegro 5 differs from old SDL in that it is hardware accelerated, and hardware-accelerated bitmaps perform badly on per-pixel operations.
Try to lock your texture for write only, write out the pixels manually, then render the texture instead of draw operations. Make sure Allegro does not preserve the texture (the ALLEGRO_NO_PRESERVE_TEXTURE bitmap flag).
Also try making 1x1 rectangles using the primitive addon. I know it sounds strange, but I've had better performance using that.
Thanks for the replies, but could you please explain what you mean by this? (I'm quite new to graphics and Allegro.)
"write out the pixels manually, then render the texture instead of draw operations"
Thanks for the help.
Edit: I tried using the primitives addon to draw rectangles instead, and that made the issue worse.
Right now, you are asking the graphics card to draw n one-pixel rectangles, and the vertices and texture coordinates have to be sent for each call to al_draw_pixel.
Instead, Allegro lets you create a texture using al_create_bitmap.
You then lock it (https://www.allegro.cc/manual/5/al_lock_bitmap), which gives you access to it in RAM. Once it is in RAM, you access and write to the individual pixels... in RAM.
You then release the lock, which sends the texture to the graphics card, and then you draw it to the screen. This should be much faster.
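// Assumes the bitmap was locked with ALLEGRO_PIXEL_FORMAT_ARGB_8888.
inline void write_pixel_argb_8888(ALLEGRO_LOCKED_REGION *region, int x, int y, ALLEGRO_COLOR &col)
{
    unsigned char r, g, b, a;
    al_unmap_rgba(col, &r, &g, &b, &a);
    // pitch is in bytes; divide by 4 to step in 32-bit pixels.
    uint32_t *ptr32 = (uint32_t *)region->data + x + y * (region->pitch / 4);
    // Cast before shifting so an alpha of 128 or more does not overflow a signed int.
    *ptr32 = ((uint32_t)a << 24) | ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}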
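To make the addressing a bit more concrete, here is a minimal sketch of the same idea that runs without Allegro. RegionView is a hypothetical stand-in for the data and pitch fields of ALLEGRO_LOCKED_REGION, and pack_argb, pixel_at and fill_stripe are illustrative names, not Allegro functions. It walks rows through the byte pitch, which matters because the manual notes that pitch can include padding and can be negative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the two ALLEGRO_LOCKED_REGION fields used here.
struct RegionView {
    void *data;  // start of row 0, as reported by the lock
    int pitch;   // bytes per row; may be negative or include padding
};

// Pack 8-bit channels as ARGB_8888 (alpha in the top byte).
inline uint32_t pack_argb(uint8_t a, uint8_t r, uint8_t g, uint8_t b) {
    return (uint32_t(a) << 24) | (uint32_t(r) << 16) | (uint32_t(g) << 8) | uint32_t(b);
}

// Find row y through the byte pitch, then index the 32-bit pixel x in that row.
inline uint32_t *pixel_at(const RegionView &rv, int x, int y) {
    uint8_t *row = static_cast<uint8_t *>(rv.data) + static_cast<std::ptrdiff_t>(y) * rv.pitch;
    return reinterpret_cast<uint32_t *>(row) + x;
}

// Fill one vertical wall stripe [y0, y1) in column x with a single colour.
inline void fill_stripe(const RegionView &rv, int x, int y0, int y1, uint32_t colour) {
    assert(y0 <= y1);
    for (int y = y0; y < y1; ++y) {
        *pixel_at(rv, x, y) = colour;
    }
}
```

With Allegro, data and pitch would come from the region returned by al_lock_bitmap(bmp, ALLEGRO_PIXEL_FORMAT_ARGB_8888, ALLEGRO_LOCK_WRITEONLY), and al_unlock_bitmap(bmp) afterwards uploads the result. Note that a write-only lock expects every pixel of the locked region to be written before unlocking.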
If you could attach all the code as a zip that might help.
Just looking at the code, are you clear on the difference between al_put_pixel and al_draw_pixel? Locking the screen with ALLEGRO_LOCK_WRITEONLY should increase speed. Also, I think your bitmaps ought to be loaded as memory bitmaps, not video bitmaps, for fastest access. Another thing (I'm not sure if this is still true or ever has been true for Allegro 5!): if you create the display before loading the bitmaps, it will ensure they have the same pixel format.
Finally, I know that MSVC compiles in a lot of bounds checking if you use STL containers in debug mode - could that be an issue for you?
If none of that helps you will have to go to direct access as jmasterx suggests.
Thanks for all of the suggestions, guys. Unfortunately I think I'm just too much of a novice to figure this out, because I've tried implementing all of your suggestions multiple times and it's either not working or I'm not doing it correctly (probably the latter). I appreciate the help though; thanks for your time.
Don't give up now!
I had a quick try and I got it to run OK (admittedly not as quickly as I remember Wolfenstein running back in 1993, but that was only running at 320x200 IIRC).
I changed line 58 to use ALLEGRO_MEMORY_BITMAP and dropped the locking code on lines 69-70.
I changed al_draw_pixel to al_put_pixel on line 312.
Finally, I added this on line 211:
al_lock_bitmap(al_get_backbuffer(display), ALLEGRO_PIXEL_FORMAT_ANY, ALLEGRO_LOCK_READWRITE);
and this on line 315:
al_unlock_bitmap(al_get_backbuffer(display));
Does that help any with your system?
If it's a per-pixel operation, why not use a pixel shader?
Thanks for the extra help, Peter Hull. I implemented everything that you said to do and it did help, but still not to the extent I'm looking for. I'm still trying to figure out how to implement jmasterx's solution, but I always end up with worse results (and weird artifacting).
Here is an implementation of what I was talking about.
I also did some basic optimizations to avoid multiplications in the tight loop.
https://www.allegro.cc/files/attachment/610374
I'm getting a great framerate compared to your original code.
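For anyone reading along without the attachment, here is a rough sketch of what "avoiding multiplications in the tight loop" can look like. It is not the attached code. It assumes a plain 32-bit destination buffer whose stride is counted in pixels, a texture column stored contiguously, and an unclipped stripe; all names are hypothetical. The naive loop multiplies for the row offset and the texture coordinate on every pixel, while the stepped loop computes both once and then only adds.

```cpp
#include <cassert>
#include <cstdint>

// Naive version: one multiply for the row offset and one for the texture
// coordinate on every pixel of the stripe.
inline void draw_column_naive(uint32_t *dst, int dstStride, int x, int y0, int y1,
                              const uint32_t *texCol, int texH) {
    assert(y1 > y0);
    for (int y = y0; y < y1; ++y) {
        int texY = (y - y0) * texH / (y1 - y0);
        dst[y * dstStride + x] = texCol[texY];
    }
}

// Stepped version: the texture step and the starting pointer are computed
// once; the loop body only adds.
inline void draw_column_stepped(uint32_t *dst, int dstStride, int x, int y0, int y1,
                                const uint32_t *texCol, int texH) {
    assert(y1 > y0);
    const int step = (texH << 16) / (y1 - y0);  // texels per screen pixel, 16.16 fixed point
    int texPos = 0;                             // current texture row, 16.16 fixed point
    uint32_t *p = dst + y0 * dstStride + x;     // row offset computed once
    for (int y = y0; y < y1; ++y) {
        *p = texCol[texPos >> 16];
        texPos += step;
        p += dstStride;
    }
}
```

Because the step is truncated to 16 fractional bits, the two versions can differ by one texel when the stripe height does not divide texH << 16 evenly. For a wall texture that is usually not visible.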
Some of the logic you do in the render loop could be optimized or cached in memory.
Those 2 square roots are not cheap.
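One possible way to drop them, sketched under an assumption: that the two square roots are the per-ray deltaDistX and deltaDistY calculations from the tutorial's DDA setup (the posted code is not reproduced here, so this is a guess). In that case each square-root value equals the absolute reciprocal of the ray component multiplied by the ray length, so the reciprocals alone give the same comparisons. The struct and function names below are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct DeltaDist {
    double x;
    double y;
};

// Square-root form (assumed to be the two square roots mentioned above).
inline DeltaDist delta_sqrt(double rayDirX, double rayDirY) {
    return { std::sqrt(1.0 + (rayDirY * rayDirY) / (rayDirX * rayDirX)),
             std::sqrt(1.0 + (rayDirX * rayDirX) / (rayDirY * rayDirY)) };
}

// Square-root-free form: each value is the sqrt form divided by the ray
// length, so x and y keep the same ratio and the same ordering.
inline DeltaDist delta_abs(double rayDirX, double rayDirY) {
    return { std::fabs(1.0 / rayDirX), std::fabs(1.0 / rayDirY) };
}
```

This swap is only safe if the values are used just to decide which grid line the ray crosses next, and if the side distances are derived from the same delta values. If the wall distance is computed from them directly, that formula has to be checked as well.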
One trick I used for my ray casting demo was to create vertical, 1px-wide sub-bitmap "slices" of each wall texture.
So rather than copying pixels, you just draw the appropriate sub-bitmap when rendering a wall.
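A small sketch of the arithmetic behind the slice idea, as I understand it. Only the plain C++ part is runnable here; the Allegro calls are shown in comments so the block does not depend on the library. SliceBlit and plan_slice are hypothetical names, and wallX is assumed to be the fractional hit position along the wall in the range [0, 1).

```cpp
#include <cassert>

// Which slice to draw and where it lands on screen (hypothetical helper type).
struct SliceBlit {
    int slice;            // index of the 1px-wide texture column
    float dx, dy, dw, dh; // destination rectangle on screen
};

inline SliceBlit plan_slice(int screenX, int screenH, double perpWallDist,
                            double wallX, int texW) {
    assert(perpWallDist > 0.0 && wallX >= 0.0 && wallX < 1.0);
    int slice = static_cast<int>(wallX * texW);
    float lineH = static_cast<float>(screenH / perpWallDist);
    // Centre the stripe vertically; dy goes negative when the wall is very
    // close, and the draw call is left to clip it.
    return { slice, static_cast<float>(screenX), (screenH - lineH) / 2.0f, 1.0f, lineH };
}

// With Allegro (not compiled here), the slices would be created once at load time:
//   slices[i] = al_create_sub_bitmap(texture, i, 0, 1, texH);
// and each wall column drawn with a single scaled blit:
//   al_draw_scaled_bitmap(slices[b.slice], 0, 0, 1, texH, b.dx, b.dy, b.dw, b.dh, 0);
```

The point of the design is that a close wall costs one draw call per screen column rather than one write per pixel. Since every slice shares the same parent texture, wrapping the column loop in al_hold_bitmap_drawing(true) and al_hold_bitmap_drawing(false) may also let Allegro batch the draws.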
Thanks for the help, all! With your implementation, jmasterx, there was an issue where, whenever the camera got really close to a wall, it sometimes had a massive performance drop for a couple of seconds and then would be just fine afterwards. But I fixed that by not locking the texture bitmaps, which didn't noticeably affect the general performance.