Allegro.cc - Online Community

QPC, GTOD, RDTSC
GullRaDriel
Member #3,861
September 2003

Everyone ends up needing timers explained at some point. I have seen enough timer-related topics that it is time to put together something for everyone.

After reading a lot on the forum and googling more than I needed to, I thought of making one topic that gathers all the resources I have found, to share them.
I would appreciate it if you only post on-topic things here: code snippets, links, explanations, and tips about timers.


[url http://www.geisswerks.com/ryan/FAQS/timing.html]This[/url] is a nice article in which most of the Windows timers have been tested. It includes a few code snippets.

Two things from that site:

[url http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q274323&]"Performance counter value may unexpectedly leap forward"[/url] from Microsoft,

and: "Ross Bencina was kind enough to point out to me that rdtsc 'is a per-CPU operation, so on multiprocessor systems you have to be careful that multiple calls to rdtsc are actually executing on the same CPU.'"

For Allegro timers, there is the [url http://pixwiki.bafsoft.com/wiki/index.php/Timers]AllegroWiki[/url].

For those who need an accurate timer under Linux, just use the [url http://www.penguin-soft.com/penguin/man/2/gettimeofday.html?manpath=/man/man2/gettimeofday.2.inc]gettimeofday function[/url].

I also advise you to take a look at the [url http://www.gillius.org/gne/index.htm]GNE library from Gillius[/url]. I know it is a networking library, but it has good code inside regarding timers and cross-platform compatibility.

Here is the code Gillius generally gives when he answers a timer question:
[code]
//code fragment from his GNE library. The library is LGPL,
//but it is OK to use, modify, and/or redistribute for all
//purposes commercial or non-commercial:
#ifndef WIN32
//For the gettimeofday function.
#include <sys/time.h>
#endif

Time Timer::getCurrentTime() {
Time ret;
#ifdef WIN32
LARGE_INTEGER t, freq;
QueryPerformanceFrequency(&freq);
QueryPerformanceCounter(&t);
ret.setSec(int(t.QuadPart / freq.QuadPart));
ret.setuSec(int((t.QuadPart % freq.QuadPart) * 1000000 / freq.QuadPart));
#else
timeval tv;
gettimeofday(&tv, NULL);
ret.setSec(tv.tv_sec);
ret.setuSec(tv.tv_usec);
#endif
return ret;
}

Time Timer::getAbsoluteTime() {
#ifdef WIN32
Time ret;
_timeb t;
_ftime(&t);
ret.setSec(t.time);
ret.setuSec(t.millitm * 1000);
return ret;
#else
return getCurrentTime();
#endif
}
[/code]
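
A note on the arithmetic in the Windows branch above: taking the remainder before multiplying keeps the intermediate value small. Here is a minimal sketch of that split as a standalone function. `SplitTime` and `split_ticks` are hypothetical names used only for illustration; they are not part of GNE.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical plain struct standing in for GNE's Time class.
struct SplitTime {
    std::int64_t sec;   // whole seconds
    std::int64_t usec;  // remaining microseconds, 0..999999
};

// Split a raw tick count into seconds and microseconds.
// (ticks % freq) is always below freq, so multiplying it by 1000000
// only overflows if freq itself is above roughly 9.2e12 ticks per second.
inline SplitTime split_ticks(std::int64_t ticks, std::int64_t freq) {
    assert(freq > 0);
    SplitTime t;
    t.sec  = ticks / freq;
    t.usec = (ticks % freq) * 1000000 / freq;
    return t;
}
```

Multiplying the full tick count by 1000000 first would overflow much sooner on a machine that has been up for a long time, which is probably why the fragment is written this way.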


Here are relpatseht's routines, cross-platform too:
Timer.h
[code]
#ifndef TIMER_H
#define TIMER_H

#include <allegro.h>

#ifndef WIN32
#include <cstdlib>
#include <sys/time.h>
#include <unistd.h>
#else
#include <winalleg.h>
#endif

class Timer{
private:
#ifndef WIN32
static LONG_LONG secondsToMicro;
#else
static LARGE_INTEGER currentTime;
static LARGE_INTEGER frequency;
#endif

LONG_LONG start;

public:
static void Intialize();
static LONG_LONG GetCurrentTime();
static LONG_LONG GetPrecision();
static LONG_LONG GetElapsedTimeMsec(LONG_LONG lastTime);
static float GetElapsedTimeSeconds(LONG_LONG lastTime);

void StartStopwatch();
LONG_LONG GetStopwatchTimeMsec();
float GetStopwatchTimeSec();
};

#endif
[/code]

Timer.cpp
[code]
#include "Timer.h"

#ifndef WIN32
LONG_LONG Timer::secondsToMicro = 1000000;
#else
LARGE_INTEGER Timer::currentTime;
LARGE_INTEGER Timer::frequency;
#endif

void Timer::Intialize(){
#ifdef WIN32
QueryPerformanceFrequency(&frequency);
#endif
}


LONG_LONG Timer::GetCurrentTime(){
#ifdef WIN32
QueryPerformanceCounter(&Timer::currentTime);

return (LONG_LONG)Timer::currentTime.QuadPart;
#else
timeval internalTime;
gettimeofday(&internalTime, NULL);

return (LONG_LONG)internalTime.tv_sec*Timer::secondsToMicro + internalTime.tv_usec;
#endif
}

LONG_LONG Timer::GetPrecision(){
#ifdef WIN32
return (LONG_LONG)Timer::frequency.QuadPart;
#else
return Timer::secondsToMicro;
#endif
}

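LONG_LONG Timer::GetElapsedTimeMsec(LONG_LONG lastTime){
#ifdef WIN32
// QPC ticks -> milliseconds (Intialize() must have been called first)
return (Timer::GetCurrentTime() - lastTime) * 1000 / (LONG_LONG)Timer::frequency.QuadPart;
#else
// microseconds -> milliseconds
return (Timer::GetCurrentTime() - lastTime) / 1000;
#endif
}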

float Timer::GetElapsedTimeSeconds(LONG_LONG lastTime){
#ifdef WIN32
return float(Timer::GetCurrentTime() - lastTime)/Timer::frequency.QuadPart;
#else
return float(Timer::GetCurrentTime() - lastTime)/Timer::secondsToMicro;
#endif
}

void Timer::StartStopwatch(){
start = Timer::GetCurrentTime();
}

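LONG_LONG Timer::GetStopwatchTimeMsec(){
#ifdef WIN32
// QPC ticks -> milliseconds (Intialize() must have been called first)
return (Timer::GetCurrentTime() - start) * 1000 / (LONG_LONG)Timer::frequency.QuadPart;
#else
// microseconds -> milliseconds
return (Timer::GetCurrentTime() - start) / 1000;
#endif
}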

float Timer::GetStopwatchTimeSec(){
#ifdef WIN32
return float(Timer::GetCurrentTime() - start)/Timer::frequency.QuadPart;
#else
return float(Timer::GetCurrentTime() - start)/Timer::secondsToMicro;
#endif
}
[/code]
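
For comparison, here is a rough sketch of the same stopwatch idea built on std::chrono::steady_clock from the C++11 standard library. This is a swapped-in technique, not what relpatseht's code does, and `ChronoStopwatch` is a made-up name. It assumes a C++11 compiler.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stopwatch with the same shape as the class above,
// but using std::chrono::steady_clock instead of QPC / gettimeofday.
class ChronoStopwatch {
public:
    // Restart the stopwatch from the current moment.
    void start() { start_ = std::chrono::steady_clock::now(); }

    // Whole milliseconds since start() (or since construction).
    std::int64_t elapsed_msec() const {
        using namespace std::chrono;
        return duration_cast<milliseconds>(steady_clock::now() - start_).count();
    }

    // Seconds since start() (or since construction), as a floating-point value.
    double elapsed_sec() const {
        using namespace std::chrono;
        return duration<double>(steady_clock::now() - start_).count();
    }

private:
    std::chrono::steady_clock::time_point start_ = std::chrono::steady_clock::now();
};
```

The unit conversion is done by the library here, so there is no per-platform frequency to keep track of.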

Here are Kitty Cat's routines, cross-platform too:

Timer.h
[code]
#include <allegro.h>

#ifndef ALLEGRO_WINDOWS
#include <sys/time.h>

class Timer {
protected:
unsigned long current_tic;
unsigned long usecs_per_tic;
struct timeval now, last;

public:
// Starts the timer, generating 'speed' tics per second
void init(int speed)
{
current_tic = 0;
usecs_per_tic = 1000000 / speed;
reset();
};

// Resets the timer, leaving the tic count alone
void reset()
{
gettimeofday(&last, NULL);
now = last;
};

// Store the current tic in _tic, and return the number of tics since the
// last call
unsigned long get_tics(unsigned long &_tic)
{
gettimeofday(&now, NULL);
unsigned long c = ((unsigned long)(now.tv_sec-last.tv_sec)*1000000 +
(unsigned long)(now.tv_usec-last.tv_usec)) /
usecs_per_tic;

last.tv_usec += usecs_per_tic * c;
last.tv_sec += last.tv_usec/1000000;
last.tv_usec %= 1000000;

current_tic += c;
_tic = current_tic;
return c;
};
unsigned long get_tics()
{
unsigned long dummy;
return get_tics(dummy);
}
};

#else

#include <winalleg.h>
#include <mmsystem.h>

class Timer {
protected:
DWORD current_tic;
DWORD clocks_per_tic;
DWORD now, last;

public:
void init(int speed)
{
current_tic = 0;
clocks_per_tic = 1000 / speed;
reset();
};

void reset()
{
last = timeGetTime();
now = last;
};

unsigned long get_tics(unsigned long &_tic)
{
now = timeGetTime();
DWORD c = (now-last) / clocks_per_tic;

last += clocks_per_tic * c;

current_tic += c;
_tic = current_tic;
return (unsigned long)c;
};
unsigned long get_tics()
{
unsigned long dummy;
return get_tics(dummy);
}
};

#endif // ALLEGRO_WINDOWS
[/code]

How to use KC's routines:
[code]
// Start the timers, in tics-per-second
main_timer.init(GAME_SPEED);
fps_timer.init(1);

while(true)
{
unsigned long current_tic, tics_passed;
while((tics_passed=main_timer.get_tics(current_tic)) == 0)
rest(1);

// Lag control
if(tics_passed > 10)
{
tics_passed = 10;
main_timer.reset();
}

while(tics_passed--)
do_logic();

draw();
++fps_ticker;

tics_passed = fps_timer.get_tics();
while(tics_passed > 0)
{
--tics_passed;

fps = fps_ticker;
fps_ticker = 0;
}
}[/code]
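
The tic bookkeeping in both Timer classes boils down to this: divide the elapsed time by the tic length, then advance the 'last' timestamp by only the whole tics consumed, so the remainder carries over to the next call. Below is a minimal sketch of that idea, assuming the caller passes in a microsecond timestamp from whatever clock is available. `TicCounter` and its members are hypothetical names, not KC's.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical clock-independent tic counter: the caller supplies
// timestamps in microseconds from any steadily increasing source.
class TicCounter {
public:
    TicCounter(int tics_per_second, std::uint64_t now_usec)
        : usecs_per_tic_(0), last_usec_(now_usec) {
        assert(tics_per_second > 0 && tics_per_second <= 1000000);
        usecs_per_tic_ = 1000000u / static_cast<unsigned>(tics_per_second);
    }

    // Returns how many whole tics elapsed since the last call.
    // The leftover fraction of a tic stays pending, so no time is lost.
    std::uint64_t advance(std::uint64_t now_usec) {
        std::uint64_t tics = (now_usec - last_usec_) / usecs_per_tic_;
        last_usec_ += tics * usecs_per_tic_;
        return tics;
    }

    // Lag control: forget any backlog and start counting from 'now_usec'.
    void reset(std::uint64_t now_usec) { last_usec_ = now_usec; }

private:
    std::uint64_t usecs_per_tic_;
    std::uint64_t last_usec_;
};
```

With 100 tics per second, a call 25 ms after the start reports 2 tics and keeps the remaining 5 ms for the next call.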


I also found Tobias Dammers's routines, with a comment from his post: "Uses QPC in windows if asked to, and falls back on allegro timers otherwise.
Doesn't support gettimeofday() yet, since I don't have a linux box running, but it can easily be extended. You can also hack it at will to use integers to represent time, or libc-time, or whatever you please. You may also want to put the whole thing into a class, or make it C-friendly.
Oh yes, and feel free to re-use."

timer.h:
[code]
#ifndef TIMER_H
#define TIMER_H

void init_timer(bool use_qpc = false);
void stop_timer();
void start_timer(int timer_ms);
float get_timer_delta();
const char* get_timing_method_string();
float get_timer_accuracy();

#endif
[/code]
timer.cpp:
[code]
#include <allegro.h>
#ifdef ALLEGRO_WINDOWS
#include <winalleg.h>
#endif
#include "timer.h"

bool initialized = false;

int timer = 0;
int timer_ms = 0;

#ifdef ALLEGRO_WINDOWS
LARGE_INTEGER last_perf_count_li;
LARGE_INTEGER perf_count_li;
LARGE_INTEGER perf_freq_li;
unsigned long long last_perf_count;
unsigned long long perf_count;
unsigned long long perf_freq;

unsigned long long li_to_ll(LARGE_INTEGER li) {
unsigned long long ll = li.HighPart;
ll = ll << 32;
ll |= li.LowPart;
return ll;
}
#endif

bool qpc_mode = false;

void timer_proc() {
++timer;
}
END_OF_FUNCTION(timer_proc);

void init_timer(bool use_qpc) {
#ifdef ALLEGRO_WINDOWS
if (use_qpc)
qpc_mode = QueryPerformanceFrequency(&perf_freq_li);
else
#endif
qpc_mode = false;

#ifdef ALLEGRO_WINDOWS
// perf_freq and li_to_ll() only exist on Windows builds
if (qpc_mode)
perf_freq = li_to_ll(perf_freq_li);
#endif

if (!qpc_mode) {
LOCK_VARIABLE(timer);
LOCK_FUNCTION(timer_proc);
install_timer();
}

initialized = true;
}

void stop_timer() {
if (!qpc_mode)
remove_int(timer_proc);
}

void start_timer(int _timer_ms) {
if (qpc_mode) {
#ifdef ALLEGRO_WINDOWS
QueryPerformanceCounter(&last_perf_count_li);
last_perf_count = li_to_ll(last_perf_count_li);
#endif
}
else {
stop_timer();
timer_ms = _timer_ms;
install_int(timer_proc, timer_ms);
}
}

float get_timer_delta() {
if (qpc_mode) {
#ifdef ALLEGRO_WINDOWS
QueryPerformanceCounter(&perf_count_li);
perf_count = li_to_ll(perf_count_li);
float result = (float)((double)(perf_count - last_perf_count) / (double)perf_freq);
last_perf_count = perf_count;
return result;
#else
return 0.0f;
#endif
}
else {
float result = (float)timer * (float)timer_ms * 0.001f;
timer = 0;
return result;
}
}

const char* get_timing_method_string() {
if (!initialized)
return "Not initialized!";
if (qpc_mode)
#ifdef ALLEGRO_WINDOWS
return "QueryPerformanceCounter";
#else
return "Something's terribly wrong";
#endif
return "Allegro timer routines";
}

float get_timer_accuracy() {
if (qpc_mode)
#ifdef ALLEGRO_WINDOWS
return (float)perf_freq;
#else
return 0.0f;
#endif
if (timer_ms)
return 1000.0f / (float)timer_ms;
return 0.0f;
}
[/code]

And Dustin Dettmer gave two links to his timer routines; here they are:
timer.h
[code]
#ifndef TIMER_H
#define TIMER_H

#ifndef WIN32
#include <sys/time.h>
#endif

extern class Timer {
public:

typedef unsigned int time_t;

private:

#ifdef WIN32
unsigned long long startTime;
unsigned long long currentTime;
#else
timeval startTime;
timeval currentTime;
#endif

public:

Timer();

time_t usecs();
time_t msecs();
time_t secs();

}timer;

#endif
[/code]
timer.cpp
[code]
#include "timer.h"
#include "log.h"

#ifdef WIN32
#include <windows.h>
#else
#include <sys/time.h>
#endif

Timer timer;

Timer::Timer()
{
#ifdef WIN32
startTime = GetTickCount();
#else
gettimeofday(&startTime, 0);
#endif
}

Timer::time_t Timer::usecs()
{
#ifdef WIN32
static bool onetime = 0;
static unsigned long long freq = 0;

if(!onetime) {

onetime = true;
QueryPerformanceFrequency((LARGE_INTEGER*)&freq);
}

QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);

currentTime *= 1000000;
currentTime /= freq;

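// Caution: startTime was taken with GetTickCount() (milliseconds), while
// currentTime comes from the performance counter. The subtraction is only
// meaningful if both count from the same starting point, which is not guaranteed.
return currentTime - startTime * 1000;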
#else
gettimeofday(&currentTime, 0);

return (unsigned long)((currentTime.tv_sec - startTime.tv_sec) * 1000000)
+ (unsigned long)((currentTime.tv_usec - startTime.tv_usec));
#endif
}

Timer::time_t Timer::msecs()
{
#ifdef WIN32
currentTime = GetTickCount();

return currentTime - startTime;
#else
gettimeofday(&currentTime, 0);

return (unsigned long)((currentTime.tv_sec - startTime.tv_sec) * 1000)
+ (unsigned long)((currentTime.tv_usec - startTime.tv_usec) / 1000);
#endif
}

Timer::time_t Timer::secs()
{
#ifdef WIN32
currentTime = GetTickCount();

return (currentTime - startTime) / 1000;
#else
gettimeofday(&currentTime, 0);

return (unsigned long)(currentTime.tv_sec - startTime.tv_sec);
#endif
}

#ifdef TIMER_TEST
#include <allegro.h>

int main()
{
allegro_init();
install_keyboard();
set_gfx_mode(GFX_AUTODETECT_WINDOWED, 640, 480, 0, 0);
BITMAP *buffer = create_bitmap(SCREEN_W, SCREEN_H);

while(!key[KEY_ESC]) {

unsigned long secs = timer.secs();
unsigned long msecs = timer.msecs();

clear(buffer);
textprintf_ex(buffer, font, 10, 10, makecol(0xff, 0xff, 0xff), 0,
"Milliseconds: %lu, Seconds: %lu, Difference: %d", msecs, secs, (int)secs * 1000 - (int)msecs);
blit(buffer, screen, 0, 0, 0, 0, SCREEN_W, SCREEN_H);

if((key[KEY_0] && !(secs % 10))
|| (key[KEY_1] && !(secs % 1))
|| (key[KEY_2] && !(secs % 2))
|| (key[KEY_3] && !(secs % 3))
|| (key[KEY_4] && !(secs % 4))
|| (key[KEY_5] && !(secs % 5))
|| (key[KEY_6] && !(secs % 6))
|| (key[KEY_7] && !(secs % 7))
|| (key[KEY_8] && !(secs % 8))
|| (key[KEY_9] && !(secs % 9)))
while(!key[KEY_SPACE]);
}

destroy_bitmap(buffer);

return 0;
}

#endif
[/code]
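
As a side note on the gettimeofday branches above, the difference between two timeval values can be computed in one signed 64-bit expression. Here is a small sketch, assuming a POSIX system that provides <sys/time.h>. `elapsed_usec` and `elapsed_msec` are illustrative names, not part of Dustin's code.

```cpp
#include <sys/time.h>  // struct timeval (POSIX)
#include <cstdint>

// Microseconds from 'start' to 'end'. Signed 64-bit arithmetic keeps the
// result correct even when end.tv_usec is smaller than start.tv_usec.
inline std::int64_t elapsed_usec(const timeval& start, const timeval& end) {
    return (static_cast<std::int64_t>(end.tv_sec) - start.tv_sec) * 1000000
         + (static_cast<std::int64_t>(end.tv_usec) - start.tv_usec);
}

// Whole milliseconds from 'start' to 'end'.
inline std::int64_t elapsed_msec(const timeval& start, const timeval& end) {
    return elapsed_usec(start, end) / 1000;
}
```

Seconds are multiplied by 1000000 to get microseconds and by 1000 to get milliseconds; mixing those two factors up is an easy mistake to make.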

Birdeeoh's simple call to QPC:
"I keep a utilities library of my own writing for all this common stuff and this timing thing is definitely in there. Here's how I do it... the library has its own function -"
[code]
uint64_t get_qpc()
{
LARGE_INTEGER li;
if( QueryPerformanceCounter( &li ))
return (((uint64_t)li.HighPart) << 32) | li.LowPart;
else
return 0;
}[/code]
"to get the raw counter. Then I have these functions -"
[code]
static uint64_t _starttime;
static uint64_t _freq;

void StartTimeCounter()
{
LARGE_INTEGER li;
if( QueryPerformanceFrequency( &li ))
_freq = (((uint64_t)li.HighPart) << 32) | li.LowPart;
else
_freq = 1; //if the frequency function doesn't work, neither will our system
_starttime = get_qpc();
}

uint64_t GetMicroSeconds()
{
return (uint64_t)( ( ( get_qpc() - _starttime )/(double)_freq )*1000000);
}
[/code]
"And that'll return the current uptime in microseconds for your program. Just call StartTimeCounter() right before you start your main loop and GetMicroSeconds() returns the number of microseconds since you started 'er up.

Notice I use the <stdint.h> definition for 64bit integers but with gcc, at least, uint64_t is equivalent to unsigned long long."
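
To try the two steps in Birdeeoh's snippets without a Windows box, here is a sketch of just the arithmetic: joining the high and low 32-bit halves, then scaling a tick difference by the frequency. `join_halves` and `usec_between` are hypothetical helpers, and their parameters stand in for the values that QueryPerformanceCounter and QueryPerformanceFrequency would supply.

```cpp
#include <cassert>
#include <cstdint>

// Join the two 32-bit halves of a counter into one 64-bit value.
// 'high' and 'low' stand in for LARGE_INTEGER's HighPart and LowPart.
inline std::uint64_t join_halves(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Microseconds between two raw counter readings, given ticks per second.
// Uses a double for the division, as in the snippet above.
inline std::uint64_t usec_between(std::uint64_t start, std::uint64_t now,
                                  std::uint64_t freq) {
    assert(freq > 0);
    double seconds = (now - start) / static_cast<double>(freq);
    return static_cast<std::uint64_t>(seconds * 1000000.0);
}
```

The cast to a 64-bit type has to happen before the shift; shifting a 32-bit value left by 32 would not give the intended result.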



[u]Conclusion:[/u]

There are several ways to time things on the various operating systems. I hope this topic holds enough sources and documentation for anyone looking for timing routines.

I want to thank everyone here, plus all those who helped or contributed in the various topics this one draws on.

While putting this topic together, I had a 'reference' topic in mind.
I have seen some people post their routines multiple times, and people are always in need of timer documentation.

I hope it helps you as much as it has helped me.

Gull.

EDIT: Added Birdeeoh's simple call to QPC. I will rewrite this post to make it easier to understand.



"Code is like shit - it only smells if it is not yours"
Allegro Wiki, full of examples and articles !!

axilmar
Member #1,204
April 2001

This stuff belongs to the tutorials section. Oh wait, there isn't one...

Arthur Kalliokoski
Second in Command
February 2005

http://www.codeguru.com/cpp/misc/misc/timers/article.php/c3895/

has some stuff for serializing the RDTSC instruction. In other words, if you don't do this, it might get executed at the same time as some of the stuff you're trying to time. I remember messing with something like this a couple of years ago, and I got exactly 36 clock ticks for an integer divide every time. This would probably be overkill for simple game timing at 100 Hz or so.

“Throughout history, poverty is the normal condition of man. Advances which permit this norm to be exceeded — here and there, now and then — are the work of an extremely small minority, frequently despised, often condemned, and almost always opposed by all right-thinking people. Whenever this tiny minority is kept from creating, or (as sometimes happens) is driven out of a society, the people then slip back into abject poverty. This is known as "bad luck.”

― Robert A. Heinlein

GullRaDriel
Member #3,861
September 2003

From Arthur's post:
From http://www.codeguru.com/cpp/misc/misc/timers/article.php/c3895/:

"I have been looking for some possibility to time certain pieces of code without too much overhead on my Intel PIII, Win98 machine. Here is my approach:

Beginning with Pentium, all IA-Family processors have a Real Time-Stamp Counter. This counter can be read at any privilege level, when TSD flag in CR4 is clear. The command for reading the counter is RDTSC. This command will load EDX:EAX with a 64 bit value representing the current time stamp (the time stamp actually is the number of cycles elapsed since reset).

RDTSC is not a serializing instruction, i.e. "it does not necessarily wait until all previous instructions have been executed before reading the counter. Similarly, subsequent instructions may begin execution before the read operation is performed." (Intel Architecture Software Developers Manual Volume 2: Instruction Set Reference, p. 3-604). To avoid inaccuracies, RDTSC has to be bracketed by a serializing instruction. CPUID is such an instruction.

Although it may seem complicated to use RDTSC to time code, it is not. I've written following four macros for this purpose:

#ifdef _DEBUG
#define PERF_DECLARE \
  __int64 MSRB, MSRE; \
  void *mrsb = &MSRB; \
  void *mrse = &MSRE; \
  char perfmtrbuf[100];

#define PERF_START \
  {_asm mov eax, 0 \
  _asm cpuid \
  _asm rdtsc \
  _asm mov ebx, mrsb \
  _asm mov dword ptr [ebx], eax \
  _asm mov dword ptr [ebx+4], edx \
  _asm mov eax, 0 \
  _asm cpuid}

#define PERF_STOP \
  {_asm mov eax, 0 \
  _asm cpuid \
  _asm rdtsc \
  _asm mov ebx, mrse \
  _asm mov dword ptr [ebx], eax \
  _asm mov dword ptr [ebx+4], edx \
  _asm mov eax, 0 \
  _asm cpuid}

#define PERF_REPORT \
  {_ui64toa(MSRE-MSRB, perfmtrbuf, 10); \
  TRACE("Cycles needed: %s\n", perfmtrbuf);}
#else
#define PERF_DECLARE
#define PERF_START
#define PERF_STOP
#define PERF_REPORT
#endif // _DEBUG

The function _ui64toa() is used, so stdlib.h has to be included prior to defining these macros. Since I was using MFC, I used TRACE to output the results. Any other method for displaying and/or storing perfmtrbuf is good.

Using the macros is very simple:

void SomeFunction(void){
  PERF_DECLARE;
  //...code
  PERF_START;
  //...code to time
  PERF_STOP;
  PERF_REPORT;
  //...code
}

PERF_START and PERF_STOP/PERF_REPORT can bracket as many pieces of code one wants. One cannot pair them overlapped or nested, though. If nesting is a must (although I cannot imagine why it could be), the macros can be changed to accept parameters (the variable names to use).

1. Accuracy

Actually, the timing's accuracy is less than expected at first glance. Two consecutive timings can differ by many cycles. Executing the following function 60 times shows the behaviour of this timing method:

void Test()
{
  PERF_DECLARE;
  int i, j, k;
  PERF_START;
  for(i= 1;i<0xff; i++){
    for(j= 1;j<0xff; j++){
      k = i+j;
    }
  }
  PERF_STOP;
  PERF_REPORT;
}

The mean execution cycle number is 463885.38, with a standard deviation of 4.84%. Following chart shows this:

[Image: exact_timer.jpg]

Obviously, there is a lower threshold at 457313 cycles. The code mostly executes in this number of cycles, except some peaks when it needs more than that. How come? First, the interrupts are not off during code execution. Second, the code ran under Windows, which could preempt the process' execution. It does not apply for this piece of code, but the initial state of the processor's cache also can eat some cycles.

In conclusion, the code is being timed in its "natural environment", not in a vacuum. If you need to know the bare execution time of a specific piece of code, repeated timing is a must. For short pieces of code, it is possible to get the exact minimum number of cycles needed to execute.
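
(Not part of the quoted article.) The figures quoted above, a minimum, a mean, and a relative standard deviation, can be computed from repeated measurements along these lines. It is only a sketch: `TimingStats` and `summarize` are made-up names, and it assumes the population standard deviation, since the article does not say which variant it used.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Summary of repeated cycle counts for one piece of code.
struct TimingStats {
    double min;         // lowest sample: closest to the undisturbed cost
    double mean;        // arithmetic mean of all samples
    double rel_stddev;  // standard deviation divided by the mean
};

inline TimingStats summarize(const std::vector<double>& samples) {
    assert(!samples.empty());
    TimingStats s;
    s.min = *std::min_element(samples.begin(), samples.end());

    double sum = 0.0;
    for (double x : samples) sum += x;
    s.mean = sum / samples.size();

    double sq = 0.0;
    for (double x : samples) sq += (x - s.mean) * (x - s.mean);
    s.rel_stddev = std::sqrt(sq / samples.size()) / s.mean;
    return s;
}
```

The minimum is usually the number to look at for the bare cost of the code, because interrupts and preemption can only add cycles, never remove them.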

2. Compatibility / Portability

The code clearly needs an Intel Pentium processor at minimum. At the time of writing, I have no information about the behavior of other processors. The compiler used must be able to handle the __int64 data type (not ANSI). The _asm keyword is Microsoft-specific; other compilers may define it differently. The TRACE macro I have used is really a question of context, and can easily be replaced with some equivalent. The macros do not actually need Windows at all. The only condition, apart from a Pentium+ processor, is that the OS has cleared the TSD flag in CR4.
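
(Not part of the quoted article.) As a rough illustration of what a port to another compiler could look like, here is a sketch of the CPUID + RDTSC read using GCC/Clang extended inline assembly. This is an assumption about how one might carry the idea over, not code from the article. `read_tsc_serialized` and `HAVE_TSC_SKETCH` are hypothetical names, and the fallback branch simply returns 0 on builds that are not x86 GCC/Clang.

```cpp
#include <cstdint>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
// Read the time-stamp counter, issuing CPUID first as a serializing
// instruction (this mirrors the first half of PERF_START).
inline std::uint64_t read_tsc_serialized() {
    std::uint32_t eax = 0, ebx, ecx = 0, edx;
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
    (void)ebx; (void)edx;  // CPUID results are not needed, only its side effect

    std::uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
#define HAVE_TSC_SKETCH 1
#else
// No RDTSC in this build: report 0 so callers can detect it.
inline std::uint64_t read_tsc_serialized() { return 0; }
#define HAVE_TSC_SKETCH 0
#endif
```

Timing a piece of code would then mean calling this before and after it and subtracting the two values, with the same caveats about interrupts, preemption and multiple CPUs that apply to the macros.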

3. Pitfalls

The RDTSC instruction will cause a #GP fault, if the TSD flag in CR4 is set. At the time writing this I have no information about other operating systems regarding this point.

Both RDTSC and CPUID will generate an #UD exception on processors "older" than Pentium (486, 386 and so on).

If not paired properly, PERF_START and PERF_STOP will return bogus results, but no compile/runtime error.

Timing code that accesses any external device except the main memory will probably produce totally inconsistent results. This is true for any similar timing method, by the way."