TIMING TIMING PLEASE HELP -- FRUSTRATING ME

Richard Phipps

Member #1,632

November 2001

That's not fair. We pointed out an 'issue' in his assumptions.

Chaos Groove Development Blog
Free Logging System Code & Blog

Shawn Hargreaves

The Progenitor

April 2000

Quote:

If, on the other hand, gamestate isn't buffered then we potentially need critical sections all over the place, and this certainly will add to complexity, but will save us the copying overhead.

But then you're paying the cost of acquiring and releasing all those critical sections, which is far from free. It's actually surprisingly hard to write multithreaded game code that is both correct (i.e. not just ignoring a few rare race conditions) and also ends up faster than the simple single-threaded version! Not impossible, but just a warning to make sure you're aware of this stuff. I've often managed to get a nice speed boost from threading, only to find I'm now spending more time than I saved just executing the synchronization code!

Quote:

Plus thread switches aren't fast. So you're wasting a bunch of time every time the OS has to jump from one to the other.

They're fast enough! It's not like I'll be creating and destroying threads all over the place.

I didn't mean creating and destroying (which is REALLY slow), just the switch operation when the OS decides to suspend one thread and start running another. This is a pretty heavyweight task: it has to save all the CPU state (including FPU registers), reset a load of MMU information (and segment registers if you are on x86), and do a ton of internal OS bookkeeping. You're easily talking 0.01ms per switch, which is enough to quickly add up if you're using threads in places where you don't strictly need them.
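To put that per-switch figure in perspective, here is a rough back-of-the-envelope sketch. It takes the ~0.01 ms estimate from the post above at face value; the switch counts and the 60 Hz frame budget are made-up inputs for illustration, not measurements.

```python
# Rough estimate of how much of a frame is lost to thread switches.
# SWITCH_COST_MS is the ~0.01 ms figure quoted in the post; the switch
# counts below are hypothetical inputs, not measured values.

SWITCH_COST_MS = 0.01
REFRESH_HZ = 60

def switch_overhead_fraction(switches_per_frame, refresh_hz=REFRESH_HZ,
                             switch_cost_ms=SWITCH_COST_MS):
    """Fraction of one frame's time budget spent on thread switches."""
    frame_budget_ms = 1000.0 / refresh_hz
    return switches_per_frame * switch_cost_ms / frame_budget_ms

for n in (2, 20, 200):
    share = switch_overhead_fraction(n)
    print(f"{n:4d} switches/frame -> {share:.1%} of a 60 Hz frame")
```

A couple of switches per frame barely registers, but a design that bounces between threads hundreds of times per frame gives up a noticeable slice of its budget before any locking cost is counted.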

Quote:

This would involve a combination of dynamic benchmarking and possibly a user-controlled variable (some sort of slider that controls a bias between gameplay and rendering processing).

But what user is going to have a clue how to set that slider? Strikes me as a bad interface to ask questions that nobody is going to know how to answer...

Steve++

Member #1,816

January 2002

It should be faster than single-threading when the frame rate would otherwise exceed the refresh rate. When this isn't the case, I could have the game drop back to a more traditional game loop. Best of both worlds.

The good thing about the slider is that it's just one slider, and if the user has even half a clue what it does, he can fiddle around with it until it seems right (usually the masses will go for frame rate). Anyway, it's just something optional that may or may not make it into a game. There may be the odd user that prefers gameplay accuracy.

EDIT: I have a question regarding multiple CPU cores and multitasking:
Suppose the rendering thread is limited to rendering at the refresh rate. Then suppose that thread has nothing to do so it waits. Will that thread wake up at the moment it has something to do, or will it be woken up when it has something to do and the OS scheduler runs? Because if it's the latter, then this isn't a good thing at all. The rendering thread would need to busy wait for an available buffer to avoid being displaced in that core by another thread (which could belong to another process).

Shawn Hargreaves

The Progenitor

April 2000

Quote:

But why?

I guess I don't understand, if you're falling back on a more traditional loop when failing to hit the refresh rate, what's the point in trying to do anything clever when you're above it?

Seems like this is adding a ton of code complexity, potential bugs, making it harder to read and maintain, to speed up something that is by definition already fast enough?

Quote:

Suppose the rendering thread is limited to rendering at the refresh rate. Then suppose that thread has nothing to do so it waits. Will that thread wake up at the moment it has something to do, or will it be woken up when it has something to do and the OS scheduler runs? Because if it's the latter, then this isn't a good thing at all. The rendering thread would need to busy wait for an available buffer to avoid being displaced in that core by another thread (which could belong to another process).

"it depends" (tm)

If you are careful about how you use your synchronization primitives, that can kick the thread into waking up at the right time. This still requires an OS schedule, which imposes a fairly large amount of fixed overhead, but raising an event will force that to happen when you want. And you can use thread affinity APIs to control which core each thread runs on. But it's incredibly easy to destroy performance by making mistakes setting the affinity (hard to know what the right thing to do is on different machines).
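As a minimal sketch of the "raise an event to wake the waiter" idea, the snippet below blocks a stand-in render thread on an event instead of busy-waiting. It uses Python's standard threading.Event purely to illustrate the pattern; the names wake_on_event and buffer_ready are hypothetical, and a real game would use its platform's native primitives, with wake-up latency that depends on the OS scheduler.

```python
import threading
import time

def wake_on_event(logic_delay_s=0.05):
    """Return how long the waiting thread stayed blocked, in seconds."""
    buffer_ready = threading.Event()   # hypothetical "a buffer is free" signal
    blocked_for = []

    def render_thread():
        start = time.perf_counter()
        buffer_ready.wait()            # sleeps until signalled; no busy loop
        blocked_for.append(time.perf_counter() - start)

    worker = threading.Thread(target=render_thread)
    worker.start()
    time.sleep(logic_delay_s)          # stand-in for the logic thread doing work
    buffer_ready.set()                 # raising the event makes the waiter runnable
    worker.join()
    return blocked_for[0]

print(f"render thread was blocked for {wake_on_event() * 1000:.1f} ms")
```

The waiter resumes shortly after the event is raised, but how shortly is still up to the scheduler, which is the fixed overhead mentioned above.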

Steve++

Member #1,816

January 2002

Quote:

I guess I don't understand, if you're falling back on a more traditional loop when failing to hit the refresh rate, what's the point in trying to do anything clever when you're above it?

On a single core machine, when the game limits rendering to the refresh rate and the rendering loop is waiting for the next buffer, it will automatically switch to the gameplay thread, giving it more CPU time and therefore more precision.

Thomas Fjellstrom

Member #476

June 2000

Quote:

it will automatically switch

And that switch, along with the switch back to the other thread, will eat up more time than you may have to spare.

It's hardly more CPU time. A proper timing loop will use less CPU than two threads trying to compete for the same resources. All the locks, waiting and context switches will quickly erode any possible benefit.

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

_Dante

Member #7,398

June 2006

Quote:

It most certainly will not switch to the gameplay thread. Once your thread goes to sleep, you have zero control over it. The OS will swap it back in according to normal scheduling rules. The only way around that is of course to use synchronization methods to make sure that the gameplay thread runs at that point and the rendering thread doesn't, but then you're essentially back to a single-threaded model.

Using multiple threads only makes sense if you have operations you can parallelize (and you can't really do that with logic/rendering), or if you're doing I/O and don't need the result of that to continue doing whatever it is that the other thread's doing. In pretty much any other case, you're just thrashing.

-----------------------------
Anatidaephobia: The fear that somehow, somewhere, a duck is watching you

Steve++

Member #1,816

January 2002

Quote:

All the locks, waiting and context switches will quickly erode any possible benefit.

Silly me. I was thinking there was a reason for threads existing.

Quote:

It's not about using less CPU. Games will always use 100%-overhead CPU. It's what we do with the CPU that counts. And redundant rendering certainly doesn't count.

Quote:

It most certainly will not switch to the gameplay thread. Once your thread goes to sleep, you have zero control over it. The OS will swap it back in according to normal scheduling rules. The only way around that is of course to use synchronization methods to make sure that the gameplay thread runs at that point and the rendering thread doesn't, but then you're essentially back to a single-threaded model.

It will automatically switch to the next thread on the queue that isn't waiting for a resource, which will usually be the game logic. Anyway, as far as I know, if a thread waits for a resource, preference is given to threads in the same process, because a timeslice hasn't been reached yet and it wouldn't make sense to switch processes when the current process still has something to do.

Quote:

You most certainly can parallelise logic and rendering. Let's remember that rendering and refreshing the monitor are parallelised and they are much more related than logic and rendering. This can be achieved by buffering the gamestate. Earlier I mentioned copying the gamestate, but in fact, we can just have two or three buffers and write to the next available gamestate buffer while processing game logic. At the end of the logic loop, it is flipped so that it can be made available to the rendering loop.
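One possible shape for the "two or three buffers" idea is sketched below. This is only an illustration under assumptions: the class name StateBuffers, its methods, and the dict-based gamestate are all hypothetical, and a single short lock guards the index swap rather than the state itself.

```python
import threading

class StateBuffers:
    """Three gamestate slots: one being written, one ready, one being read."""

    def __init__(self, initial_state):
        self._slots = [dict(initial_state) for _ in range(3)]
        self._write = 0       # slot the logic thread is filling
        self._ready = 1       # most recently published slot
        self._read = 2        # slot the render thread is drawing from
        self._fresh = False   # True when _ready holds a state not yet seen
        self._lock = threading.Lock()

    def write_slot(self):
        """Logic thread: slot to fill. It holds stale data, so rewrite it fully."""
        return self._slots[self._write]

    def publish(self):
        """Logic thread: flip the finished slot so rendering can pick it up."""
        with self._lock:
            self._write, self._ready = self._ready, self._write
            self._fresh = True

    def acquire_latest(self):
        """Render thread: move to the newest published slot, if there is one."""
        with self._lock:
            if self._fresh:
                self._read, self._ready = self._ready, self._read
                self._fresh = False
        return self._slots[self._read]

# Single-threaded walk-through of three logic ticks followed by one render.
buffers = StateBuffers({"tick": 0, "x": 0.0})
for tick in range(1, 4):
    slot = buffers.write_slot()
    slot["tick"] = tick
    slot["x"] = tick * 1.5
    buffers.publish()
latest = buffers.acquire_latest()
print(latest)
```

The lock is held only while swapping indices, so neither side waits on the other's real work. The cost is three copies of the gamestate in memory, plus the discipline of never touching a slot you do not currently own.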

Anyway, this is all very conceptual at the moment, and there are probably a lot of things that need changing. I just know I'm on the right track with multithreading though.

I even saw something on gamedev.net about three threads - gameplay, physics and rendering.

_Dante

Member #7,398

June 2006

Fair enough. 60Hz seems like enough to me, but I suppose for some realtime physics stuff, the extra logic frames would smooth things out.

Shawn Hargreaves

The Progenitor

April 2000

Quote:

Silly me. I was thinking there was a reason for threads existing.

Of course there are. Two in fact:

- For separating processing that should take place on an independent timeline. Playing background music, for instance, or keeping the UI responsive while encoding an MP3.

- For taking advantage of multiprocessor hardware.

The first reason is not a good one for separating game logic and rendering, because these do not logically belong on independent timelines. The second reason is very relevant, but only if you have multiprocessor hardware!

Quote:

It's not about using less CPU. Games will always use 100%-overhead CPU. It's what we do with the CPU that counts. And redundant rendering certainly doesn't count.

Of course not. But neither does redundant logic computation!

Beware of trying to make your game logic hugely flexible and adaptable to differing CPU resources. Down that route lies madness, and a pile of crazy gameplay bugs that will be almost impossible to resolve. Do you really want to end up with the game being easier to complete on machines with different clock speeds? Or unable to do accurate network prediction because the client and server have a different CPU? Keep it simple and make your logic deterministic.

Also, it seems backward to spend all this effort further optimizing the case where you are already running fast enough, especially when this adds overhead that will hurt performance on slower machines. Surely you should be trying to improve the worst case rather than the best?

Quote:

You most certainly can parallelise logic and rendering.

Absolutely you can. But this is very hard, requires a lot of discipline across the entire codebase, increases complexity, slows development, adds overhead, makes the code harder to read, etc etc.

It only makes sense to pay those costs if you have a good reason for needing to be threaded.

Quote:

I just know I'm on the right track with multithreading though.

Actually I think you could be, but for entirely the wrong reasons.

On a single core processor, multithreading is just crazy. It will make your game slower, every single time.

But not all machines have single processors. If you are targeting dual core hardware, then threading is indeed a good idea.

Steve++

Member #1,816

January 2002

Quote:

Yeah, that completely makes sense.

Quote:

Isn't this the case with the traditional game loop anyway? Doesn't gameplay logic get executed at more frames per second on machines that can render the graphics at more frames per second?

OK, I'm convinced now that multithreading gameplay and rendering isn't such a good idea on a single-core CPU. It has merit for two cores though. Perhaps gameplay+physics in one thread and rendering in another. When four+ core CPUs become common, then we can have gameplay + physics + rendering + one spare for the OS to play with.

Kitty Cat

Member #2,815

October 2002

Quote:

Isn't this the case with the traditional game loop anyway? Doesn't gameplay logic get executed at more frames per second on machines that can render the graphics at more frames per second?

Not if you use a fixed logic rate.
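A minimal sketch of a fixed logic rate, assuming a 50 Hz step and integer millisecond frame times so the arithmetic stays exact. The run function and its counters are hypothetical stand-ins for real update and render calls.

```python
STEP_MS = 20  # fixed logic step: 50 updates per second (assumed rate)

def run(frame_times_ms):
    """Return (logic_updates, frames_rendered) for a list of frame durations."""
    accumulator = 0
    updates = 0
    frames = 0
    for frame_ms in frame_times_ms:
        accumulator += frame_ms
        # Run as many fixed-size logic steps as the elapsed time covers.
        while accumulator >= STEP_MS:
            updates += 1          # update_logic() would go here
            accumulator -= STEP_MS
        frames += 1               # render() would go here
    return updates, frames

fast = run([5] * 200)   # machine rendering 200 fps for one second
slow = run([40] * 25)   # machine rendering 25 fps for one second
print(fast, slow)
```

Both machines perform the same 50 logic updates in that second; only the number of rendered frames differs.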

Quote:

But not all machines have single processors. If you are targeting dual core hardware, then threading is indeed a good idea.

But isn't it also true that just because you have a dual-core system doesn't mean that your two threads will run on those two cores? E.g. the OS could decide to put both your threads on the same core.

--
"Do not meddle in the affairs of cats, for they are subtle and will pee on your computer." -- Bruce Graham

Lucid Nightmare

Member #5,982

July 2005

How come encore hasn't posted even one message after starting off the thread, while everyone else is debating his topic...

Two Golden Rules of life- Firstly, I'm always right and secondly, if you think otherwise, slap your face and read rule number one again!

HoHo

Member #4,534

April 2004

Quote:

e.g. the OS could decide to put both your threads on the same core.

I don't think any OS would be that stupid, not even the one that is used by the majority of people.

The only place where the OS could mess up threading is on a multi-CPU machine with SMT, à la a dual-core P4 with HT. When the OS decides to run two resource-hungry threads on one core's two virtual CPUs, the other core will just sit idle.

__________
In theory, there is no difference between theory and practice. But, in practice, there is - Jan L.A. van de Snepscheut
MMORPG's...Many Men Online Role Playing Girls - Radagar
"Is Java REALLY slower? Does STL really bloat your exes? Find out with your friendly host, HoHo, and his benchmarking machine!" - Jakub Wasilewski

Simon Parzer

Member #3,330

March 2003

This thread somehow confuses me... so should one run the logic at the refresh rate of the monitor to ensure smooth graphics? I usually lock my logic to 50 Hz, measure the time needed for drawing and wait the rest of it.
Example: MeasureTime() Logic+Drawing() MeasureTime() Wait(20-TimeDiff) --> Logic is locked to 50Hz, drawing is locked to 50Hz and the game leaves plenty of CPU for other applications. Is this a good way? And how would one go about determining the monitor refresh rate on different platforms?
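The measure, work, wait loop described above could be sketched roughly like this. It is only an illustration: run_locked_loop and logic_and_draw are hypothetical names, the clock and sleep functions are injectable so the loop can be exercised without real waiting, and real sleep calls are imprecise, so the actual rate will drift somewhat.

```python
import time

FRAME_MS = 20  # 50 Hz, as in the post above

def run_locked_loop(frames, logic_and_draw,
                    clock=time.perf_counter, sleep=time.sleep):
    """Run logic+drawing once per 20 ms slot, sleeping away the remainder."""
    for _ in range(frames):
        start = clock()
        logic_and_draw()
        elapsed_ms = (clock() - start) * 1000.0
        remaining_ms = FRAME_MS - elapsed_ms
        if remaining_ms > 0:          # skip the wait if the frame overran
            sleep(remaining_ms / 1000.0)

# Three frames with no real work: the loop should take roughly 60 ms.
began = time.perf_counter()
run_locked_loop(3, lambda: None)
print(f"3 frames took {(time.perf_counter() - began) * 1000:.0f} ms")
```

This leaves the CPU idle between frames, but it locks to the chosen 50 Hz rather than to the monitor, so it will only look perfectly smooth when the display happens to refresh at that rate.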

As for the multithreading, I would only use it for sound, networking and such tasks, not for the internal logic/drawing things.

Kitty Cat

Member #2,815

October 2002

Quote:

I don't think any OS would be that stupid, not even the one that is used by majority of people.

How would it know what threads to put where, though? Even if you turn off as many programs as possible, the system will still have several of its own processes/threads running (even if mostly idle). How would it know your two threads would need to go onto two separate cores, and not put them onto the same core?

HoHo

Member #4,534

April 2004

Quote:

How would it know what threads to put where, though?

You should know that not all threads/processes run at the same time. Usually they run for a couple of milliseconds and then get replaced with a thread that has something to do.

It is exactly the same with a dual-core CPU, except the scheduler has two cores to divide threads between, and usually it picks the first one that is doing nothing.

Audric

Member #907

January 2001

Shawn Hargreaves said:

Beware of trying to make your game logic hugely flexible and adaptable to differing CPU resources. Down that route lies madness (...)

sayeth Shawn, A.cc 586057/594632
sorry, I couldn't resist

Quote:

(...) and a pile of crazy gameplay bugs that will be almost impossible to resolve. Do you really want to end up with the game being easier to complete on machines with different clock speeds?

Ooooh, is that why a Quake 3 character runs faster and jumps higher when the renderer is set to ultra-low detail? Indeed, it's evil.

HoHo

Member #4,534

April 2004

There is one more interesting thing about Q3A.
It has SMP support under Linux and probably under Windows too. When I run it without SMP support I get ~350FPS at max quality. When I run the SMP version it drops down to "only" ~300FPS. I wonder why that is. Might it be that with slower SMP CPUs it makes sense to multithread it, and my super-fast dual-core P4 just spends most of its time switching between threads?

IIRC in some version of Q3A you could run faster and jump higher at some specific FPS. I think it was around 80-something.

Shawn Hargreaves

The Progenitor

April 2000

Quote:

Isn't this the case with the traditional game loop anyway? Doesn't gameplay logic get executed at more frames per second on machines that can render the graphics at more frames per second?

"it depends" (tm)

This is holy war territory among game developers.

One school of thought (mostly PC programmers) believes your logic update should be parameterized on the time since the last update, so you can call it as fast as possible for any given machine.

Pros:

- Can easily lock to any arbitrary refresh rate.
- Can efficiently drop updates if running below refresh rate.

Cons:

- Makes game logic code more complicated.
- Unless you are a math genius, makes game logic give different results depending on effectively random performance details. This may or may not actually be noticeable to the player, but even the smallest deviations are a pain when implementing things like replays and networking.

The other school of thought (mostly console programmers) believes you should just pick an update rate and stick to it.

Pros:

- Keeps code nice and simple.
- Everything is 100% deterministic.

Cons:

- If the rate you picked is different to the monitor refresh, won't be perfectly smooth. Of course this isn't a problem on consoles where you know the TV goes at 60. You could always just set the PC monitor to that known rate, though.

Personally I prefer the fixed rate approach, because I'm a big fan of keeping things as simple as possible, but game devs will argue for ever about this.

Even if you're going for a variable rate update, though, I still wouldn't ever run multiple updates in between render cycles. Just wait until the next refresh, then run a single update passing it the appropriate time delta.

Quote:

This is one of the biggest issues commercial game devs are struggling with at the moment, as they figure out how to work with the 3-core Xbox. Splitting gameplay and rendering pretty consistently seems to give good results, although depending on the engine architecture the amount of pain involved ranges from a fair amount to an incredible amount :-) Splitting gameplay and physics generally seems to be less successful: a handful of people are managing to get ok results there, but it really depends on the game if this is feasible. Pathfinding is an obvious candidate for doing in the background perhaps even over multiple frames, and people are also getting good results moving CPU graphics effect computations (fluid simulation, cloth, hair, particles) onto the third core.

Quote:

But isn't it also true that just because you have a dual-core system doesn't mean that your two threads will run on those two cores? eg. the OS could decide to put both your threads on the same core.

That's where it gets fun. You could just trust the OS, and you might be lucky, but who knows. Setting an explicit thread affinity makes things a lot more robust, but then you have to know what kind of processor you are running on, and that's currently quite a pain (not easy to tell the difference between dual core and some hyperthreading configurations, for instance). I have a feeling Vista is adding some new APIs to make this easier though.

Quote:

That's what I tend to do too, although I usually go for 60. Not perfect if the monitor happens to be set to something other than your chosen rate, but hey. Opinions differ as to whether the extra smoothness of being able to adapt to different refresh rates is worth the extra complexity and unpredictability that this introduces into your code.

Quote:

Interesting fact, but hard to say why without a detailed investigation.

It's very possible that the threading overhead is just bigger than the gains they are getting from the parallelism, but there are other possible explanations. For instance maybe on your machine the CPU is so fast that the game ends up totally bottlenecked by GPU performance, and the use of multithreading might be forcing the driver to disable some potentially unsafe GPU level optimisations?

No idea really, but it does confirm that even on a dual core machine, it can be surprisingly hard to actually gain that much of a benefit from threading. It's certainly nowhere near the 2x speed that you might naively expect.

Quote:

IIRC in some version of Q3A you could run faster and jump higher at some specific FPS. I think it was around 80-something.

That sucks. But it is hard to avoid such quirks in variable time update games.
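To illustrate how a variable time step can change gameplay results, here is a small sketch that integrates a jump with simple explicit Euler steps at different frame rates. The function jump_apex and its jump speed and gravity values are arbitrary illustrative choices; this is not a reproduction of Quake 3's movement code, and the cause of the quirk there may well differ.

```python
def jump_apex(fps, jump_speed=270.0, gravity=800.0):
    """Peak height of a jump integrated with explicit Euler steps of 1/fps s."""
    dt = 1.0 / fps
    height, velocity, apex = 0.0, jump_speed, 0.0
    while velocity > 0.0 or height > 0.0:
        height += velocity * dt      # move using the velocity from last step
        velocity -= gravity * dt     # then apply gravity
        apex = max(apex, height)
    return apex

exact = 270.0 ** 2 / (2 * 800.0)     # closed-form apex, v^2 / (2g)
for fps in (30, 60, 125):
    print(f"{fps:3d} fps: apex {jump_apex(fps):.2f} (exact {exact:.2f})")
```

With this particular integrator the player jumps higher at lower frame rates. Other integration schemes give different errors, but the underlying point stands: once the step size depends on the frame rate, the results do too.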

Steve++

Member #1,816

January 2002

Quote:

One school of thought (mostly PC programmers) believes your logic update should be parameterized on the time since the last update, so you can call it as fast as possible for any given machine.

I thought this had become the industry standard.

Quote:

The other school of thought (mostly console programmers) believes you should just pick an update rate and stick to it.

It's good to know this is still acceptable, given the fashionable trend towards variable rate updating. Seriously though, I think this trend is caused by variations between PCs. Not only are CPUs, GPUs and RAM different, we've also got bus speeds, HDD seek times, different soundcard speeds due to different cards having different drivers, etc. On consoles, everything is the same, but on PCs you have to be dynamic.

Now I just have to figure out how to mesh deterministic game logic with nondeterministic render times. I'm sure there must be some way on the PC to synchronise on the refresh rate. Here I go with threads again... perhaps there could be a thread that just vsyncs (somehow) and updates a counter, all in a loop. It would be nice if Windows just used the monitor refresh rate as its timer. Then PCs would be serious gaming machines.

Shawn Hargreaves

The Progenitor

April 2000

Quote:

I thought this had become the industry standard.

If anything I think the trend is the other way: more and more PC games are sharing a codebase with Xbox and PS3, and starting to be influenced by console approaches.

Quote:

Seriously though, I think this trend is caused by variations between PCs. Not only are CPUs, GPUs and RAM different, we've also got bus speeds, HDD seek times, different soundcard speeds due to different cards having different drivers, etc.

I think the monitor is the biggest difference.

When it comes to raw computing horsepower, 60 or 70 fps is 'good enough' for anything but the most rabid Quake geeks, so there's no problem just locking a game to that.

But the monitor could easily be set to anything from 50 to 120 Hz, so your game won't look perfectly smooth unless you happen to match.

Consoles used to have this problem too, with 60 in the US and 50 in Europe. Some ignored it and just always ran at 60, causing jerky updates in their European versions, while others changed their update frequency and tweaked the logic to match, causing inconsistent gameplay across continents.

These days, though, modern European TVs can do 60, so everyone just locks to 60 without any problems.

And things are changing in the PC world, too. LCD monitors don't have the same refresh characteristics as a CRT, and increasing numbers of gamers have their PC connected to an HDTV movie display. I can see those factors making it a lot more appealing to just lock PC titles to 60, too...
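The 50 Hz versus 60 Hz mismatch described above can be made concrete with a little arithmetic. The sketch below counts how many fixed-rate logic updates land in each display refresh; the function name updates_per_refresh and the chosen rates are illustrative assumptions.

```python
def updates_per_refresh(logic_hz, refresh_hz, refreshes):
    """How many fixed-rate logic updates fall within each display refresh."""
    counts = []
    done = 0
    for n in range(1, refreshes + 1):
        due = (n * logic_hz) // refresh_hz   # updates owed by the end of refresh n
        counts.append(due - done)
        done = due
    return counts

print(updates_per_refresh(60, 50, 10))   # 60 Hz logic on a 50 Hz display
print(updates_per_refresh(60, 60, 10))   # matched rates
print(f"one update per 50 Hz refresh runs at {50 / 60:.0%} of intended speed")
```

With 60 Hz logic on a 50 Hz display, every fifth refresh has to absorb two updates, which shows up as a periodic hitch. Running one update per refresh avoids the hitch but slows the whole game to about five sixths of its intended speed.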

Steve++

Member #1,816

January 2002

Quote:

I know what you mean about inconsistency across continents... A friend of mine was a huge Wonder Boy fan in the 80s/early 90s. We have PAL over here, so he played these games in 50Hz. Then one day recently he played the first Wonder Boy (the action oriented one where you have to keep moving or die) in 60Hz and realised that the 50Hz version feels rather pedestrian now. There's a big difference.