Hi all,
I'm back on the forum after a long absence due to (ahem!) illness, this time with an age-old question which I've seen posted on this forum dozens of times but never really explained in a way I could understand.

I've managed to fix the speed of my game, but performance is choppy because, for whatever reason, the CPU is taking up 100% of the time. I've made the game logic updates event-based, so they are called by an interrupt at regular intervals, and I've tried cutting out game_display() (the function that contains all the code dealing with drawing the screen) completely to see if the drawing was taking too much time (it throws around 50-60 sprites at a time, so that would be understandable, especially under Linux with no GFX hardware acceleration). Without any drawing functions, the game still swallows up 100% CPU time.
This is the main.cc as it stands:
/* Open Invaders
 * (c) 2006 Darryl LeCount
 *
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 */

#include "allegro.h"
#include "functions.h"
//#include "./alogg/alogg.h"
#include <iostream>

int fullscreen_mode, frames_missed;

using namespace std;

void interrupt_time_control()
{
    frames_missed++;
}
END_OF_FUNCTION(interrupt_time_control);

int main(int argc, char *argv[])
{
    LOCK_FUNCTION(interrupt_keys);
    LOCK_FUNCTION(interrupt_time_control);

    LOCK_VARIABLE(frames_missed);

    frames_missed=0;
    fullscreen_mode=0;

    if(argc>1)
    {
        if(argv[1][0]=='-')
        {
            fullscreen_mode=0;

            switch(argv[1][1])
            {
                case 'f': fullscreen_mode=1; break;
                case 'w': fullscreen_mode=2; break;
            }
        }
    }

    initialise_game();
    cout << "Allegro initialised...\n";

    display_setup(fullscreen_mode);
    cout << "Allegro display established...\n";

    predefine_variables();
    create_bitmasks();
    cout << "Collision bitmasks initialised...\n";

    cout << "Have fun!\n";

    intro_sequence();

    while(program_still_active())
    {
        title_screen();
        predefine_variables();
        reset_enemies_position();
        reset_enemies_state();
        initialise_ingame_music();
        install_int(interrupt_time_control,4);

        while(game_still_active())
        {
            for(int repeats=0;repeats<frames_missed;repeats++)
            {
                update_logic();
            }

            game_display();

            frames_missed=0;

            rest(1);
            vsync();
        }
    }

    cout << "Thank you for playing!\n";

    allegro_exit();
}
END_OF_MAIN();
Interrupt_keys() looks thus:
Interrupt_time_control() is just this:
void interrupt_time_control() { frames_missed++; };
I've been playing with this for months and am just about ready to chuck it in and delete the lot. Can anyone give me a hand as to why I can't stabilise the frame rate and drop the CPU usage? If there's anything else I should paste in let me know.
Thanks!
Darryl
That's because, by default, your program will use 100% of the CPU even if it's just repeating a loop. Or rather: it's not just repeating the loop, it's repeating it as fast as possible.
To give up CPU cycles to other programs you need to call rest(), or you can also use sleep(), although I'm not sure which header the latter is in; too much Java programming :S
Anyway more on rest: http://alleg.sourceforge.net/stabledocs/en/alleg005.html#rest
[edit]
Whoops, just noticed you already had a rest(). Drop the vsync(). I believe some forum posters mentioned that vsync() would allow timing, but it also consumes 100% CPU while it waits. Since vsync() doesn't return until it's done, rest() literally only gets called 60 times per second, if that.
[edit2]
There was also just recently a post on sample game loops. It's not 100% relevant, and looking through it I didn't really find any stellar examples (most of it is pseudocode), but it might help. http://www.allegro.cc/forums/thread/590871
[edit3]
If neither of these helps, I'll take my GWARA Tins entry, and modify it so it doesn't use 100% cpu and post the updated code. Or try to at least. Just let me know.
Your frame-dropping algorithm fails to take into account whether the logic processing is taking too long. Frame dropping cures the case where rendering takes too long, but if the logic is taking too long, more logic updates will be requested while logic is still being processed, resulting in potentially large numbers of logic loops per frame. That is a likely source of your framerate troubles and your 100% CPU usage.
What you need to do is put a limit to how many logic updates you'll allow on one game loop. 4 is a good limit. This prevents your logic loop from over-processing, which will result in slow-down if your system can't handle the amount of logic that needs to be processed, but that will be true of any game using fixed-time logic.
You may also want to check that you're not running 32-bit graphics. 16-bit graphics will run almost twice as fast as 32-bit on any system. Also, avoid strange colour depths such as 15-bit or 24-bit, which may invoke compatibility layers if your video card can't handle them natively.
Also, vsync() will not eat up so much CPU time that you'll get 100% usage so long as you have rest() or Sleep() statements to give time back to the OS. However, you should place your rest() statement immediately after rendering, not right before vsyncing.
--- Kris Asick (Gemini)
--- http://www.pixelships.com
While your program is running, it counts toward CPU usage. Yielding/resting tells the operating system to stop running your program for the time being; while it's doing nothing, it's using 0%.
Here is a fix that shouldn't affect anything. It works exactly the same, except that when the count is 0 it returns control to the operating system, and when the count goes back above 0 it wakes the process up. Just replace the SDL semaphore with one from your operating system, be it Windows or Linux (I can help you if you need). I used the evil SDL because Allegro has nothing similar/portable IIRC.
//the timer function: SDL_SemPost will "increment" the tick count
Uint32 SDLTimer_Callback(Uint32 interval, void* param)
{
    SDL_SemPost(sleepSemaphore); //wakes up the other thread; "increments" the count
    return interval;
}

//the main loop
while ( !quitMe )
{
    SDL_SemWait(sleepSemaphore); //sleeps while count == 0; wakes up and decrements it otherwise
    gameLogic();
}
Edit: I should explain that sleepSemaphore is basically just a counter. It's meant for synchronizing between threads, but it works perfectly fine in a situation like this. This uses < 1% of my CPU in most cases (OpenGL for hardware drawing helps too!)
Edit 2: rest(1) doesn't sleep 1 ms, especially on Windows. Windows is not a real-time operating system; you'll only get accuracy to ~10 ms or so. So 10 ms, 20 ms, 30 ms, etc...
And where might semaphore.h be in MinGW? In MSVS?
Actually, that's a pretty good question. Windows is the only platform (other than embedded) that doesn't support pthreads. The only reason I didn't use sem_wait was that almost all n00bs who post use Windows. It's a fact! But I hope that works for VC (or whatever they used).
If it does... then it's about time to make a wiki entry and end this line of threads once and for all!
The only reason why i didn't use sem_wait was because almost everyone who posts uses windows.
Fixed
A game with real-time performance requirements using 100% CPU time isn't a problem in itself. In fact, when your game yields its current timeslice to the OS, it must wait until the OS's next round of scheduling. That may be ok for tic-tac-toe. How demanding do you think your game will be on the CPU?
Putting the thread to sleep and then waking it up is not a problem. In fact, since the timer runs in a separate thread, it gets woken up, and then you have to reschedule the main-loop thread anyway. When the sem_post occurs, it changes the thread's state to ready, the same as it would be if it had been pre-empted because its timeslice expired.
Windows is not real-time, not anything close, so you have to live with what you get, I suppose. I guess this is why the thread about delta timing came up, where you use gettimeofday() to find out how much time really elapsed (in milliseconds). But there are problems with pausing and scheduling: if you're relying on small differences rather than averages, you'd need to lock everything up during that computation sequence.
A game with real-time performance requirements using 100% CPU time isn't a problem in itself.
If your game is going to make my fans come on, it had better look like Half-Life 2 or I won't be loading it again.
I sucked it in and wrote an allegro wiki entry under timers. http://wiki.allegro.cc/Timers#Yielding_the_CPU
edit: and some simple test code to measure performance. Here's what time outputs:
real 0m59.941s
user 0m0.040s
sys 0m0.012s
So over 1 minute of real time, it spends 0.012 seconds doing system calls and 0.040 seconds actually executing my code. I think that's a pretty good start!
That gives a CPU usage of basically 0%: 0.052 s of total CPU time spread over 60 s of real time is less than 0.1%.
#include <allegro.h>
#include <semaphore.h>
#include <time.h>   //for time(), seconds since 1970 or whatever

//number of cycles per second
#define BPS 60

//create the semaphore
sem_t timer_sem;

void ticker(void)
{
    sem_post(&timer_sem);
}
END_OF_FUNCTION(ticker);

int main(int argc, char** argv)
{
    sem_init(&timer_sem, 0, 1); //initialize the semaphore, set the tick count to 1
    allegro_init();

    LOCK_FUNCTION(ticker);

    install_timer();
    install_keyboard();
    set_color_depth(24);
    set_gfx_mode(GFX_AUTODETECT_WINDOWED, 640, 480, 0, 0);
    install_int_ex(ticker, BPS_TO_TIMER(BPS));
    unsigned char doLoop = 0xFF;

    while(doLoop)
    {
        sem_wait(&timer_sem); //block here until the timer posts the next tick
        if (key[KEY_ESC])
            doLoop = 0;
        //do stuff here
        rectfill(screen, 1, 1, 200, 20, makecol(0,0,0) );
        textprintf_ex(screen, font, 10, 10, makecol(255, 100, 200), -1, "Time: %d", time(NULL) );
        //end of loop!
    }

    return 0;
}
END_OF_MAIN();
The thing about sem_wait is that if it gets behind, you get really funny speedups as it plays catch-up.
Like when you alt-tab back to Diablo 2 after leaving it minimized for a few seconds?
Like when you alt-tab back to Diablo 2 after leaving it minimized for a few seconds?
Like that.
OK, this is the third time I've typed this answer...Allegro.cc keeps logging me out for whatever reason and the answer gets lost...
Anyway, the only solution I've found that effectively yields the CPU is to omit the vsync() and include rest(1). However, update_logic() was then executed so seldom that frames_missed got really high, and as a result I had frame rates of around 1 frame every five seconds. I tried limiting frames_missed to four, but that just made the game extremely slow and extremely choppy, although CPU usage was at about 80% there.
Edit: I've just noticed that the title screen, which displays a large sprite in the middle of the screen, two textout_ex calls and 500 pixel "stars" in the background, is chomping up 90% CPU time on an Athlon 2400. Is Allegro's Linux graphics driver really that bad?
Edit 2: OK, I have the game doing everything except blitting the backbuffer to the screen, and it runs at a fairly respectable 45% CPU usage. As soon as I do blit(display,screen,0,0,0,0,800,600), though, it shoots up to 90-95%. Isn't there a faster way to blit display to screen?
Clippy says: It looks like you are trying to profile a program. Would you like help?
a) use a profiler to actually determine how much time is spent in each function
b) understand how cpu time is measured and improve your yielding (rest doesn't cut it!)
Thanks - I'll give it a go, if only to bring the 45% down a bit. Still, I'd love to know why a single Allegro function call is chomping up 50% of my CPU time. Just chopping out that single blit call brought it down from 95% to 45%...
The thing about sem_wait is that if it gets behind, you get really funny speedups as it plays catch-up.
How so? Can't you do frame dropping?
sem_wait(&my_sem);
do {
    logic();
} while (sem_trywait(&my_sem) == 0);
draw();
I'll give it a go, if only to bring the 45% down a bit. Still, I'd love to know why a single Allegro function call is chomping up 50% of my CPU time. Just chopping out that single blit call brought it down from 95% to 45%..
What I actually mean is: how can you really be sure you're spending that much CPU time in a single blit? I really doubt it.
rest() is not an accurate function. Windows is not a real-time operating system (though Linux has patches). This means that if you rest(1), it may take 20 ms to return to the program, or it may return right away, execute a loop, go to sleep, return right away again, and so on. This is all scheduling-dependent, so maybe adding a blit changes how Windows schedules your process.
A profiler will actually tell you how much time you spend in each function. In my example code above, I have a simple rectfill. When I change the rectfill to cover the entire screen (with a gray colour), my CPU usage becomes:
real 1m0.350s
user 0m0.884s
sys 0m0.052s
So there's some expense in rectfill. Blitting should be comparable in runtime.
Now gprof (my profiler!) outputs:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
 0.00      0.00      0.00     4149     0.00     0.00  rectFill
 0.00      0.00      0.00     4149     0.00     0.00  textFill
 0.00      0.00      0.00        1     0.00     0.00  mainLoop
Now that's funny. The time spent is too small to accurately measure.
edit: there must be a way to do fixed width fonts in this forum!
edit2: thanks baf! and weird.. the post-preview doesn't use the same font.
You can use the [pre][/pre] tags.
Pretty fixed widthness!
How so? Can't you do frame dropping?
No, there's no way to set a semaphore back to 0 (aside from repeatedly polling it, so this would work, but it looks ugly):
sem_wait(&sem);
while (sem_trywait(&sem) != -1)
    ;  /* drain any pending ticks */
logic();
Frame dropping doesn't mean dropping logic frames; it means dropping rendered frames. If you want to implement a maximum-allowed skip:
sem_wait(&my_sem);
do {
    logic();
} while (sem_trywait(&my_sem) == 0 && ++skip < MAX_SKIP);
if (skip >= MAX_SKIP)
    while (sem_trywait(&my_sem) == 0)
        ; /* do nothing: throw away the backlog */
skip = 0;
draw();
Alternatively, instead of the empty sem_trywait loop, you could destroy and re-init the semaphore, but that's likely not efficient.
OK, I've profiled it, and I'm still trying to make head or tail of the output. The interesting bit seems to be:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  us/call  us/call  name
81.27      0.13      0.13                             pmask_load_func
 6.25      0.14      0.01     1762     5.68     5.68  check_for_next_level()
 6.25      0.15      0.01      875    11.43    11.43  display_background()
 6.25      0.16      0.01                             main
 0.00      0.16      0.00     1762     0.00     0.00  read_input()
 0.00      0.16      0.00     1762     0.00     0.00  process_ufo()
 0.00      0.16      0.00     1762     0.00     5.68  update_logic()
 0.00      0.16      0.00     1762     0.00     0.00  check_if_game_over()
 0.00      0.16      0.00     1762     0.00     0.00  collision_detection()
 0.00      0.16      0.00     1762     0.00     0.00  move_automatic_items()
 0.00      0.16      0.00     1762     0.00     0.00  check_if_extra_life_due()
 0.00      0.16      0.00     1762     0.00     0.00  process_enemy_projectiles()
 0.00      0.16      0.00      876     0.00     0.00  game_still_active()
 0.00      0.16      0.00      875     0.00    11.43  game_display()
pmask_load_func obviously belongs to pmask, although I can't imagine why it's reported as taking up 81.27% of the processing time. Commenting out the call to collision_detection() (which contains the only references to pmask.h in the entire loop) does nothing for performance; commenting out the blit call reduces usage by around 50%.
It looks like you just started the game and quit it and called it "profiling". You have to actually run the code for a while.
I started the game as usual, played it through three levels, lost, went back to the title screen and then exited the game cleanly. I know the profiler needs to have exercised all of the available functions, and it did get the chance...
Edit: I've attached a newly created profile which I created from playing the game for 15 minutes straight.
Well, I'm not exactly sure what all those functions do or how the code is structured, but it looks to me like there really is no bottleneck per se in your code. Just look at the cumulative time: if your program ran for 15 minutes, none of those functions really ate all that much of it.
total ms/call: the average number of milliseconds spent in this function and its descendents per call, if this function is profiled, else blank.
So yours is actually listed in microseconds (us) per call. If you add up the drawing calls, they come nowhere near the roughly 7 minutes of total CPU time that 50% usage over a 15-minute run would represent.
Now, I'm assuming your game loop is in the main function. If you took it out of main and put it in its own function, you could easily use the first table and fill in some of the gaps.
But it doesn't look like your code is slow at all. The "reported" CPU usage is probably coming from system scheduling.
OK, using "top" I checked the CPU usage of each process, and I found something rather interesting: the game itself never takes up more than 60% of the CPU, even at the most intense moments. Xorg, on the other hand, shoots up when the game is running, for whatever reason. I'm not sure how I can log the same thing running in fullscreen mode. Anyway, I don't think I can hope for much better performance at this time, short of switching the whole thing to OpenGL (which I'd rather avoid, given that it shouldn't really be necessary).
Anyway, thanks to all those that contributed with tips and help!
I would just do something like:
int main(int argc, char *argv[])
{
    allegro_init();
    set_color_depth(32);
    set_gfx_mode(GFX_AUTODETECT_WINDOWED,640,480,0,0);
    set_color_conversion(COLORCONV_TOTAL);
    text_mode(-1);
    install_timer();
    install_keyboard();
    install_mouse();
    textprintf(screen,font,0,0,makecol(255,255,255),"Press ESC to exit...");
    show_mouse(screen);

    bool quit = false;
    bool redraw = true;

    BITMAP *imageBuffer = create_bitmap(SCREEN_W,SCREEN_H);
    clear_bitmap(imageBuffer);

    while ( !quit )
    {
        if ( keypressed() )
        {
            if ( key[KEY_ESC] )
            {
                quit = true;
            }
            redraw = true;
            clear_keybuf();
        }
        if ( redraw )
        {
            redraw = false;
            clear_bitmap(imageBuffer);
            // Do drawing...

            // End drawing...
            blit(imageBuffer,screen,0,0,0,0,imageBuffer->w,imageBuffer->h);
        }
        else
        {
            // Update ai and yield unused time to system...
            rest(1);
        }
    }
    destroy_bitmap(imageBuffer);
    return 0;
}
END_OF_MAIN()
This tells the system to yield until some event (redraw, which in this case is a key press) has occurred. It's up to you and your game logic to set this variable when you need to take control of the processing again. rest(0) and yield_timeslice() have never seemed to work for me on Windows or Linux (several different flavours); just pass rest(1) and I get the CPU to go to 0%, even during gameplay!
I haven't done any game programming in a while, so take my question with a grain of salt.
Wouldn't it be better to use threads instead of loop+timer/frame rate counter?
I don't think there IS a right or wrong answer to most of game programming: it's whatever is right for YOU. I finally decided this after reading posts and books, all of which contradict each other. It's ultimately up to you and what you like; YOU are the one who has to code it and probably maintain it.
Each choice you've asked about has its weaknesses and strengths. It's up to you to decide which problems and hurdles you wish to face. I DO know that threads can get you into a LOT of trouble if you're not careful, but you can do a lot of stuff with them you might not be able to do otherwise (at least not easily). I'd also imagine threads are harder to debug, but I'm certainly not a master of the art... :)
rest(0) and yield_timeslice never has seemed to work for me in windows or linux (several different flavors)...
That's because they're not supposed to reduce CPU usage. They yield to other processes. That's it.
Interesting, though: goalie's method of using semaphores seems to work quite well. If you need finer-grained control, implement your own semaphore class by wrapping one.
Well, as I understand it, there have been three ways here suggested to reduce CPU usage:
(a) Use rest(1) somewhere in the game loop. The least effective, but the easiest to implement.
(b) Use the Semaphore library. More effective, but a bit more difficult to use. I may tackle this when I'm a little more confident with my programming.
(c) Get straight down to the grit of it and use threads. The most efficient method possible, but can be extremely unstable if you don't know what you're doing.
(b) Use the Semaphore library. More effective, but a bit more difficult to use. I may tackle this when I'm a little more confident with my programming.
What's this "Semaphore library" you're talking about? You mean pthreads.
(c) Get straight down to the grit of it and use threads. The most efficient method possible, but can be extremely unstable if you don't know what you're doing.
I fail to see how this reduces CPU usage. In fact, it increases CPU usage, as now you have two busy loops running around.
Well, as I understand it, there have been three ways here suggested to reduce CPU usage:
My method is, as it has always been:
do a frame draw
read a counter and compute the number of milliseconds/whatever since the last time through the loop
if now one or more logic ticks behind, do that many logic ticks
otherwise sleep for the number of milliseconds/whatever left before next update
In SDL I use SDL_GetTicks which returns the number of milliseconds since the app started. In Allegro I have a second thread that updates a timer that is much coarser than milliseconds but gets the job done. My ChristmasHack '05 entry, Nuclear Attack! (Windows, OS X) is one example. On a virtualised copy of Windows 2000 on my MacBook Pro, it uses less than 5% of one CPU core.