Allegro.cc - Online Community

optimisation problem
giewueron
Member #7,433
July 2006

void mode_7 (BITMAP *bmp, BITMAP *tile, fixed angle, fixed cx, fixed cy, int y_shift, MODE_7_PARAMS params)
{
    // current screen position
    int screen_x, screen_y;

    // the distance and horizontal scale of the line we are drawing
    fixed distance, horizontal_scale;

    // masks to make sure we don't read pixels outside the tile
    int mask_x = (tile->w - 1);
    int mask_y = (tile->h - 1);

    // step for points in space between two pixels on a horizontal line
    fixed line_dx, line_dy;

    // current space position
    fixed space_x, space_y;

    for (screen_y = 0; screen_y < bmp->h; screen_y++)
    {
        // first calculate the distance of the line we are drawing
        distance = fmul (params.space_z, params.scale_y) /
                   (screen_y + params.horizon);
        // then calculate the horizontal scale, or the distance between
        // space points on this horizontal line
        horizontal_scale = fdiv (distance, params.scale_x);

        // calculate the dx and dy of points in space when we step
        // through all points on this line
        line_dx = fmul (-fsin(angle), horizontal_scale);
        line_dy = fmul (fcos(angle), horizontal_scale);

        // calculate the starting position
        space_x = cx + fmul (distance, fcos(angle)) - bmp->w/2 * line_dx;
        space_y = cy + fmul (distance, fsin(angle)) - bmp->w/2 * line_dy;

        // go through all points in this screen line
        for (screen_x = 0; screen_x < bmp->w; screen_x++)
        {
            // get a pixel from the tile and put it on the screen
            if ( fixtoi(space_x) > 0 && fixtoi(space_y) > 0 &&
                 fixtoi(space_x) < tile->w && fixtoi(space_y) < tile->h )
            {
                if ( screen_y + y_shift < SCREEN_H - 1 )
                    _putpixel32 (bmp, screen_x, screen_y + y_shift,
                                 _getpixel32 (tile, fixtoi (space_x), fixtoi (space_y)));
            }

            // advance to the next position in space
            space_x += line_dx;
            space_y += line_dy;
        }
    }
}

Is there a way to speed this up? The problem is because of a whole lot of _putpixel32 calls.

Onewing
Member #6,152
August 2005

Quote:

The problem is because of a whole lot of _putpixel32 calls.

Or the _getpixel32 calls. What exactly is this function supposed to do?

------------
Solo-Games.org | My Tech Blog: The Digital Helm

giewueron
Member #7,433
July 2006

Steve Terry
Member #1,989
March 2002

First is to use floats and not fixed, second is to use some bitshifts (dividing by 2 is the same as >> 1), and lastly some direct memory access may speed it up by using the line pointers. I also see a lot of use of sin/cos with the same angle value; I would call those once and store the results in variables at the outermost level you can, so you aren't calculating them all the time. Lowering your bpp to 16 bits will also make an improvement, though you will have fewer colors; judging by your screenshot, you could do it in 8 bpp.
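
To make the "call those once" advice concrete, here is a minimal sketch in plain C, written without Allegro so that it compiles on its own. The names `fx`, `fx_mul`, `RowSetup` and `row_setup` are hypothetical stand-ins for Allegro's `fixed`, `fmul` and the per-row setup in `mode_7`. It assumes the caller has evaluated the sine, the cosine and half the bitmap width once per frame and passes them in.

```c
#include <stdint.h>

typedef int32_t fx;              /* 16.16 fixed point, stand-in for Allegro's fixed */

/* Multiply two 16.16 values using a 64-bit intermediate product. */
static fx fx_mul(fx a, fx b)
{
    return (fx)(((int64_t)a * (int64_t)b) / 65536);
}

typedef struct {
    fx dx, dy;                   /* step in tile space per screen pixel */
    fx start_x, start_y;         /* tile-space position of the leftmost pixel */
} RowSetup;

/* sin_a, cos_a and half_w are computed once by the caller,
   not once per row as in the original loop. */
static RowSetup row_setup(fx cx, fx cy, fx distance, fx hscale,
                          fx sin_a, fx cos_a, int half_w)
{
    RowSetup r;
    r.dx = fx_mul(-sin_a, hscale);
    r.dy = fx_mul(cos_a, hscale);
    r.start_x = cx + fx_mul(distance, cos_a) - half_w * r.dx;
    r.start_y = cy + fx_mul(distance, sin_a) - half_w * r.dy;
    return r;
}
```

In the posted function this would cut the trig calls from four per row to two per frame. Whether that is measurable depends on how cheap the table lookups already are.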

___________________________________
Microsoft is not the Borg collective. The Borg collective has got proper networking. - planetspace.de
Bill Gates is in fact Shawn Hargreaves' ßî+çh. - Gideon Weems

giewueron
Member #7,433
July 2006

"First is to use floats and not fixed, second is to use some bitshifts (dividing by 2 is the same as >> 1),"

I think this won't give any significant speed-up. Those fix* routines maybe even do bitshifts themselves. :)

"and lastly some direct memory access may speed it up by using the line pointers. I also see a lot of use of sin/cos with the same angle value; I would call those once and store the results in variables at the outermost level you can, so you aren't calculating them all the time."

Maybe direct memory access, hmm. How do I use it? How would I apply it at 16 bits and 640x480?
Second, according to the Allegro docs, the fixed-point trigonometry already uses lookup tables.

"Lowering your bpp to 16 bits will also make an improvement, though you will have fewer colors; judging by your screenshot, you could do it in 8 bpp."

Seems to work cool but no speed change. :-/

I thought about going back to my first idea of using Allegro's textured polygon3d function, but I don't know whether it would be faster. Maybe I should really switch to OpenGL with AllegroGL. Would it be possible to use sprites without billboarding tricks, just with a simple stretch_draw_sprite? Models would take away the old-school feeling. I wanted crappy graphics so that it would run on slower PCs, but with this routine that seems impossible. Or maybe somewhere on the net there is an add-on library for Allegro with a draw_sprite routine with skewing (deformation of the sprites border rectangle), stretching and rotating. There's one for AA sprites, so maybe ;]

Kitty Cat
Member #2,815
October 2002

The fixeds are fine, especially since he's using tight loops. It might be better to replace the fixtoi calls with >>16, though (not sure if Allegro inlines the function to do the same thing). Also, check out the exflame example for direct bitmap access. Specifically, you want the functions:

bmp_select
bmp_read_line
bmp_write_line
bmp_read32
bmp_write32
bmp_unwrite_line
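
As a rough illustration of what direct access buys you, here is a sketch in plain C that does not use Allegro at all. `Bitmap32` and `draw_row` are hypothetical names. The struct only imitates a 32-bpp memory bitmap with an array of row pointers, and the loop assumes every sample is non-negative and stays inside the tile. On video bitmaps you would go through the bmp_* functions listed above instead of raw pointers.

```c
#include <stdint.h>

typedef int32_t fx;                     /* 16.16 fixed point */

/* Hypothetical stand-in for a 32-bpp memory bitmap with row pointers. */
typedef struct {
    int w, h;
    uint32_t **line;                    /* line[y] points at the first pixel of row y */
} Bitmap32;

/* Fill one destination row by stepping through tile space.
   Assumes all samples are non-negative and inside the tile. */
static void draw_row(Bitmap32 *dst, int y, const Bitmap32 *tile,
                     fx space_x, fx space_y, fx dx, fx dy)
{
    uint32_t *out = dst->line[y];       /* fetch the row pointer once per row */
    for (int x = 0; x < dst->w; x++) {
        int tx = space_x >> 16;         /* truncating conversion, no function call */
        int ty = space_y >> 16;
        out[x] = tile->line[ty][tx];
        space_x += dx;
        space_y += dy;
    }
}
```

One caveat: if I read the Allegro manual correctly, fixtoi rounds to the nearest integer, whereas >>16 truncates, so the shifted version samples about half a texel differently.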

--
"Do not meddle in the affairs of cats, for they are subtle and will pee on your computer." -- Bruce Graham

GullRaDriel
Member #3,861
September 2003

Compute bmp->w/2 and bmp->h/2 outside the for(...) loop instead of recomputing them on every iteration.

Use a table of cosines and sines that you fill in at the start of the program. (I dunno if fcos & fsin already do this. EDIT: they already do.)

Try to avoid this check if possible; it sits inside the inner loop and is not needed if you check the boundaries correctly beforehand:

if ( fixtoi(space_x) > 0 && fixtoi(space_y) > 0 &&
     fixtoi(space_x) < tile->w && fixtoi(space_y) < tile->h )
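
One common way to drop the per-pixel check is to wrap the coordinate with a mask, which is presumably what the unused mask_x and mask_y in the posted code were meant for. The sketch below is plain C with a hypothetical helper name, `wrap_coord`. It assumes the tile width and height are powers of two. Note that it changes the picture: the tile repeats forever instead of leaving the outside area blank.

```c
#include <stdint.h>

typedef int32_t fx;                     /* 16.16 fixed point */

/* Wrap a 16.16 coordinate into [0, size) where size is a power of two
   and mask == size - 1. The unsigned cast keeps negative inputs well defined. */
static int wrap_coord(fx v, int mask)
{
    return (int)(((uint32_t)v >> 16) & (uint32_t)mask);
}
```

With this helper the inner loop would read the texel at (wrap_coord(space_x, mask_x), wrap_coord(space_y, mask_y)) unconditionally.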

"Code is like shit - it only smells if it is not yours"
Allegro Wiki, full of examples and articles !!

giewueron
Member #7,433
July 2006

So about that direct memory access... I have a 16-bit 640x480 bitmap, so bmp_read32 would get two pixels of the image at once and I would have to do a 320x480 loop -- correct? :)

Kitty Cat
Member #2,815
October 2002

Quote:

I have a 16-bit 640x480 bitmap, so bmp_read32 would get two pixels of the image at once and I would have to do a 320x480 loop

Yes, though you could use bmp_read16/bmp_write16 instead of having to worry about endian issues.
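
To show what the endian issue looks like in practice, here is a small plain C sketch that does not use Allegro. `read_two_pixels` is a hypothetical helper. It reads two adjacent 16-bit pixels with one 32-bit load and then has to ask which half of the word holds the first pixel.

```c
#include <stdint.h>
#include <string.h>

/* Read two adjacent 16-bit pixels with a single 32-bit load, then split
   them according to the byte order of the machine we are running on. */
static void read_two_pixels(const uint16_t *row, uint16_t *first, uint16_t *second)
{
    uint32_t word;
    memcpy(&word, row, sizeof word);    /* one 32-bit read, no aliasing problems */

    const uint16_t probe = 1;
    unsigned char low_byte;
    memcpy(&low_byte, &probe, 1);       /* lowest-addressed byte of the probe */

    if (low_byte == 1) {                /* little-endian: first pixel in the low half */
        *first  = (uint16_t)(word & 0xFFFF);
        *second = (uint16_t)(word >> 16);
    } else {                            /* big-endian: first pixel in the high half */
        *first  = (uint16_t)(word >> 16);
        *second = (uint16_t)(word & 0xFFFF);
    }
}
```

Reading one 16-bit pixel at a time avoids the branch entirely, which is the appeal of the 16-bit variants.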


Andrei Ellman
Member #3,434
April 2003

giewueron said:

"First is to use floats and not fixed, second is to use some bitshifts (dividing by 2 is the same as >> 1),"

I think this won't give any significant speed-up. Those fix* routines maybe even do bitshifts themselves. :)

If they did, they would first have to check whether the number is a power of two. When multiplying and dividing by powers of two, always use shifts. With normal arithmetic, the compiler is smart enough to convert "/2" to ">>1", but if the arithmetic is embedded inside a function such as fixdiv(), the compiler can no longer make that optimization.
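
A small plain C sketch of the difference, without Allegro. The names `fx`, `int_to_fx`, `fx_mul`, `fx_double` and `fx_half` are hypothetical, and `fx_mul` is only a guess at the shape of a general 16.16 multiply, not Allegro's actual implementation. The shift helpers assume non-negative values that do not overflow.

```c
#include <stdint.h>

typedef int32_t fx;                     /* 16.16 fixed point */

static fx int_to_fx(int v) { return (fx)(v * 65536); }

/* General multiply: needs a 64-bit product and a rescale. */
static fx fx_mul(fx a, fx b)
{
    return (fx)(((int64_t)a * (int64_t)b) / 65536);
}

/* Doubling and halving a non-negative value need only one shift each. */
static fx fx_double(fx a) { return a << 1; }
static fx fx_half(fx a)   { return a >> 1; }
```

For non-negative inputs `fx_double(x)` gives the same result as `fx_mul(x, int_to_fx(2))`, just without the wide multiply and the rescale.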

Also, once you've followed the advice above and pre-calculated as much as you can outside the loop, you should acquire and release the bitmaps to save the bitmaps from having to be locked each time. If you are writing to a memory bitmap or if the screen-memory on your platform does not require a lock, you won't get a speed increase, but otherwise, you will. Have a look at the following functions:

Oh, and in case you haven't done so, use your compiler's optimization flags to optimize your code.

--
Don't let the illegitimates turn you into carbon.

Matt Smith
Member #783
November 2000

Quote:

draw_sprite routine with skewing (deformation of the sprites border rectangle)

polygon3d[_f]() can do this. Give it your sprite as a texture and an array of 4 V3D[_f] for the corners. I'm not sure you want this, though. I suspect you just need stretch_sprite() to get the effect you want.
