Performance issues (Windows)
.greay

I'm just starting out w/Allegro, and have a very basic prototype for a shooter. It runs acceptably on my computer (for an I'm-still-learning-as-I-go prototype), but runs terribly on Windows (I use OS X). The difference is striking -- on my Mac, it starts to slow down with ~50 objects on screen. Not great, but not too bad for a first shot. But it starts to slow down ~13 objects in Windows, even on a computer with much better specs than my own.

I don't have any experience programming for Windows, so I'm not even sure where to begin looking for the problem. I don't do anything Mac-specific anywhere in the code.

gprof says it's spending a huge amount of time in gfx_directx_unlock_win.

Anyways, the source and Windows executable are here.

HoHo

Are you drawing onto a video bitmap? If yes then describe the operations you do with it (what kind of bitmaps, blending, transparency, per-pixel effects, ...)

Paul whoknows

It runs slowly on my P4 2400MHz! But this problem is not Allegro-related.
Post the code you use to blit your buffer bitmap to screen.

HoHo

I read your code a bit and found this thing:

  while (!game_end_flag) {

    LOCK_VARIABLE(t);
    LOCK_FUNCTION(inc_timer);
    install_int_ex(inc_timer, BPS_TO_TIMER(120));    

    LOCK_VARIABLE(e);
    LOCK_FUNCTION(inc_enemy);
    install_int_ex(inc_enemy, BPS_TO_TIMER(120));    

    input_update();
// lots more stuff here that isn't interesting ATM

How often are those things executed? I think it is not once per frame but less often. If it is more than once per application startup, that isn't a good thing: you should lock and install stuff once at initialization time, unless you uninstall the timers somewhere else.

Also, as I thought, you draw things to a video bitmap. What is worse is that it seems like you are doing per-pixel stuff there (rectangles, triangles). Try using a regular memory bitmap for the backbuffer; it should give a considerable speed boost.

.greay
Paul whoknows said:

It runs slowly on my P4 2400MHz! But this problem is not Allegro-related.
Post the code you use to blit your buffer bitmap to screen.

Yeah, I figured it was my fault, not Allegro's ;) Here's pretty much the whole display routine, cut & paste from a couple different files. There's really nothing complicated going on.

static BITMAP *sandbox, *bkg;

  sandbox = create_sub_bitmap(screen, screen_x, screen_y, screen_width, screen_height);
  bkg = create_bitmap(screen_width, screen_height);

void display_update() {
  Particle *ship = get_first_object();
  while (ship) {
    [ship erase: sandbox to: bkg];
    [ship draw: sandbox];
    ship = [ship next];
  }
}

-(void)erase: (BITMAP *)front to: (BITMAP *)back
{
  blit(back, front, drawn_x, drawn_y, drawn_x, drawn_y, drawn_l, drawn_h);
}

-(void)draw: (BITMAP *)bmp {
  triangle(bmp, x, y + sizey, x + 10, y + sizey, x + 5, y, color);
  [self save_state];
}

HoHo said:

How often are those things executed? I think it is not once per frame but less often. If it is more than once per application startup, that isn't a good thing: you should lock and install stuff once at initialization time, unless you uninstall the timers somewhere else.

Also, as I thought, you draw things to a video bitmap. What is worse is that it seems like you are doing per-pixel stuff there (rectangles, triangles). Try using a regular memory bitmap for the backbuffer; it should give a considerable speed boost.

Yeah, the timer initialization there was a mistake. It should be in the game initialization fn. Those are the only two timers I have, at least thus far.
So I'm basically drawing directly to the screen, yes? Since I draw to the sandbox & that's a sub-bitmap of the screen. Are you saying I should draw to a system-memory bitmap first, and then blit that to the screen?

Ultimately, most of these are going to be pre-rendered bitmaps anyways, but are the drawing functions really that expensive? Or is it just because how I'm going about it?

Alright, I have some stuff to try. Thanks!

HoHo
Quote:

So I'm basically drawing directly to the screen, yes? Since I draw to the sandbox & that's a sub-bitmap of the screen

yes

Quote:

Are you saying I should draw to a system-memory bitmap first, and then blit that to the screen?

yes

Quote:

Ultimately, most of these are going to be pre-rendered bitmaps anyways, but are the drawing functions really that expensive?

They are if you perform them on video bitmaps. The reason is that to get per-pixel access to a video bitmap you first have to lock it. That probably means the bitmap is downloaded from video RAM to system RAM, updated, and uploaded back. That download-modify-upload is done once for every drawing command. The drawing itself should be quite cheap. That is also the reason why gprof showed that most of the time is spent in gfx_directx_unlock_win.

[edit]

One simple thing you can do is to make the sandbox a regular system/memory bitmap and see how that works. It should be considerably faster than your current method.
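To make the locking cost concrete, here is a toy model in plain C. It does not call Allegro or DirectX at all: `Surface`, `surface_lock`, `surface_unlock` and `draw_primitive` are made-up names for illustration, and the assumption that every unlock costs one full round trip is the guess from above, not measured behaviour. It only counts how many lock/unlock round trips happen when each primitive locks the surface for itself, compared with locking once around the whole batch.

```c
/* Toy model only: these types and functions are hypothetical and are
 * not part of Allegro or DirectX. */
typedef struct {
    int locked;          /* nonzero while the surface is locked */
    unsigned transfers;  /* completed lock/unlock round trips */
} Surface;

static void surface_lock(Surface *s)
{
    s->locked = 1;
}

static void surface_unlock(Surface *s)
{
    s->locked = 0;
    s->transfers++;      /* assumed: each unlock costs one round trip */
}

/* A primitive locks the surface itself only if nobody locked it before. */
static void draw_primitive(Surface *s)
{
    int locked_here = !s->locked;

    if (locked_here)
        surface_lock(s);
    /* per-pixel writes would happen here */
    if (locked_here)
        surface_unlock(s);
}

/* Every primitive pays for its own lock/unlock pair. */
unsigned transfers_per_primitive(int primitives)
{
    Surface s = {0, 0};

    for (int i = 0; i < primitives; i++)
        draw_primitive(&s);
    return s.transfers;
}

/* One lock around the whole batch, so only one round trip in total. */
unsigned transfers_batched(int primitives)
{
    Surface s = {0, 0};

    surface_lock(&s);
    for (int i = 0; i < primitives; i++)
        draw_primitive(&s);
    surface_unlock(&s);
    return s.transfers;
}
```

In this model 13 primitives cost 13 round trips when drawn one by one and a single round trip when batched. Drawing into a memory bitmap avoids the lock on the video surface altogether until the final blit.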

.greay

Hmm. I tried it, and the performance change is negligible in OS X (i.e., it's acceptable). Sadly, it's also negligible in Windows (at least on the laptop I'm using). It still slows down at ~13 objects or so, and gprof says pretty much the same thing:

27.72   1.12   1.12   gfx_directx_unlock_win
21.53   1.99   0.87   _linear_blit32
 7.18   2.28   0.29   blit

...etc.

here's what I've changed:

  sandbox = create_bitmap(screen_width, screen_height);
  bkg = create_bitmap(screen_width, screen_height); // this line's still the same

-(void)erase: (BITMAP *)front to: (BITMAP *)back
{
  blit(back, front, drawn_x, drawn_y, drawn_x, drawn_y, drawn_l, drawn_h);
  blit(back, screen, drawn_x, drawn_y, drawn_x, drawn_y, drawn_l, drawn_h);
}

-(void)draw: (BITMAP *)bmp {
  triangle(bmp, x, y, x + sizex, y, x + sizex / 2, y + sizey, color);
  [self save_state];
  blit(bmp, screen, drawn_x, drawn_y, drawn_x, drawn_y, drawn_l, drawn_h);
}

Oh yeah, and I moved the timer initializations to their proper place, i.e. game_init().

I'll try making the bitmaps explicitly system memory bmps by creating them with create_system_bitmap() instead of simply create_bitmap(), I guess. But I've got to be doing something seriously wrong somewhere. What I don't get is why the disparity in performance between OS X and Windows is so large.

I know I could get a performance boost by using pre-rendered bitmaps instead of drawing the objects, but that doesn't look like it would actually solve my problem, so I'm not going to worry about that yet.

**EDIT**

Ok. Reading some of the docs, it sounded like using acquire_screen() and release_screen() might help. I bookended the display loop with those. The performance is pretty much the same, but now gprof reads thus:

18.18   0.70   0.70   blit
10.13   1.09   0.39   _linear_hline32
 9.35   1.45   0.36   _soft_polygon
 8.57   1.78   0.33   objc_msg_lookup
 5.71   2.00   0.22   ddraw_blit_to_self

... and so on. Pretty much the same, except w/the first two lines gone.

HoHo

I don't quite get what you've changed. Do you blit stuff to both the front (backbuffer?) and the screen? You should blit your backbuffer to the screen only once per frame, not a single time more.

Usually things go like this:
Clear the backbuffer
Blit your images to the backbuffer
Blit the backbuffer to the screen
Repeat

The backbuffer and images are all regular memory or system bitmaps; there are no video bitmaps besides the screen that Allegro creates for you.
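The steps above can be sketched without Allegro as a small plain-C model. Everything here is a stand-in: `Buffer`, `screen_model`, `backbuffer` and `render_frame` are hypothetical names, the 8x8 size is arbitrary, and `memcpy` plays the role of the single blit to the screen. The only point is the order of operations: clear, draw everything off-screen, then copy to the screen exactly once per frame.

```c
#include <assert.h>
#include <string.h>

enum { W = 8, H = 8 };   /* arbitrary size for the model */

typedef struct {
    unsigned char pixels[H][W];
} Buffer;

static Buffer screen_model;    /* stands in for the real screen */
static Buffer backbuffer;      /* stands in for a memory bitmap */
static unsigned screen_blits;  /* how many times the screen was written */

static void put_pixel(Buffer *b, int x, int y, unsigned char color)
{
    assert(x >= 0 && x < W && y >= 0 && y < H);
    b->pixels[y][x] = color;
}

/* One frame: clear, draw all objects off-screen, copy to the screen once. */
void render_frame(const int *xs, const int *ys, int count)
{
    memset(&backbuffer, 0, sizeof backbuffer);              /* clear */
    for (int i = 0; i < count; i++)
        put_pixel(&backbuffer, xs[i], ys[i], 1);            /* draw */
    memcpy(&screen_model, &backbuffer, sizeof backbuffer);  /* single blit */
    screen_blits++;
}

unsigned screen_blit_count(void)
{
    return screen_blits;
}

unsigned char screen_pixel(int x, int y)
{
    return screen_model.pixels[y][x];
}
```

However many objects are passed to `render_frame`, the screen is written once per call, which is the property the real loop should have too.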

Quote:

Pretty much the same, except w/the first two lines gone.

That's odd. Together those function timings are nowhere near 100% usage. Are you sure you didn't miss something?

.greay

ohmygod.

I think I found the problem.

BANGS HEAD ON DESK

I'll be with you again in a moment.

...

so yeah. I had the display_update() function called INSIDE the game logic loop. Which means it would loop through all the objects & draw them & draw to the screen ONCE FOR EACH OBJECT. No wonder it was slowing down the more objects there were onscreen. I was drawing the screen objects^2 times each time-step.
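For anyone who hits the same thing, here is a minimal plain-C sketch of the mistake. It is not my actual code: `draw_object` is a made-up stand-in that only counts calls, and `display_update` here just loops over a count instead of walking the object list. It shows why redrawing from inside the logic loop costs objects^2 draws per time-step, while redrawing after the loop costs one draw per object.

```c
#include <stddef.h>

/* Hypothetical stand-in for one drawing call; it only counts invocations. */
static unsigned long draw_calls = 0;

static void draw_object(size_t index)
{
    (void)index;
    draw_calls++;
}

/* Redraws every object once, like display_update() in this thread. */
static void display_update(size_t object_count)
{
    for (size_t i = 0; i < object_count; i++)
        draw_object(i);
}

/* The bug: the full redraw runs inside the per-object logic loop. */
unsigned long frame_with_nested_redraw(size_t object_count)
{
    draw_calls = 0;
    for (size_t i = 0; i < object_count; i++) {
        /* ...update object i here... */
        display_update(object_count);
    }
    return draw_calls;
}

/* The fix: update every object first, then redraw once. */
unsigned long frame_with_single_redraw(size_t object_count)
{
    draw_calls = 0;
    for (size_t i = 0; i < object_count; i++) {
        /* ...update object i here... */
    }
    display_update(object_count);
    return draw_calls;
}
```

With 13 objects the nested version makes 169 draw calls per time-step and the fixed version makes 13; with 50 objects it is 2500 against 50.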

Andrei Ellman
.greay said:

so yeah. I had the display_update() function called INSIDE the game logic loop. Which means it would loop through all the objects & draw them & draw to the screen ONCE FOR EACH OBJECT. No wonder it was slowing down the more objects there were onscreen. I was drawing the screen objects^2 times each time-step.

But even so, the OS X version was considerably faster than the Windows version.

One thing I suggest you read up on is the Allegro functions for locking and unlocking bitmaps.

AE.

HoHo

I always suspected there was some major problem with something. Things shouldn't slow down with so few objects on screen. Still, glad you've found the problem :)

Thread #590785. Printed from Allegro.cc