Allegro.cc - Online Community

Allegro.cc Forums » Programming Questions » Fastest sqrt

This thread is locked; no one can reply to it. rss feed Print
Fastest sqrt
kazzmir
Member #1,786
December 2001
avatar

I was using this code for a long time for square root becuase i thought it was faster than the sqrt function in math.h

1long int tsqr( const long int r ) {
2 return r*r;
3}
4 
5long int tsqrt( const long int q ) {
6 
7 if ( q <= 0 )
8 return 0;
9 
10 long int min = 0;
11 
12 while ( tsqr( min ) < q )
13 min += 30;
14 
15 while ( tsqr( min ) > q )
16 min -= 10;
17 
18 while ( tsqr( min ) < q )
19 min++;
20 
21 if ( tsqr( min ) > q )
22 min--;
23 return min;
24 
25}

but it turns out its alot slower than the normal sqrt function. What is the fastest integer square root function? Is the fixed sqrt function really fast?

Bruce Perry
Member #270
April 2000

These days, on a PC, sqrt() is likely to be the fastest solution. The FPU can do it nice and quickly. Compilers will probably inline it too, in some cases.

If you're targeting platforms like the GBA, which have no floating-point hardware, then sqrt() will be slow. Your function, since it works with integers, would probably be faster. Then again, the fixed square root function will also be fast... if you ever find yourself developing for such a platform, the best thing to do is to test it.

--
Bruce "entheh" Perry [ Web site | DUMB | Set Up Us The Bomb !!! | Balls ]
Programming should be fun. That's why I hate C and C++.
The brxybrytl has you.

kazzmir
Member #1,786
December 2001
avatar

Actually i just ran a test, which i thought i had done when i first invented it, on my computer and it was about even with sqrt. So its not faster, but its not as slow as i thought. When running it on the Solaris machines at work my function was about 10 times slower than sqrt. I guess solaris has a good FPU? Either way, i guess i should just use sqrt instead.

Goalie Ca
Member #2,579
July 2002
avatar

you could always try casting the integer to double and use sqrt. I believe that the libraries use inline assembly (fsqrt) for that.

-------------
Bah weep granah weep nini bong!

amarillion
Member #940
January 2001
avatar

Quote:

Is the fixed sqrt function really fast?

not faster than the FPU, since the allegro code is something like this:

fixed fixsqrt (fixed value)
{
    return ftofix (sqrt(fixtof (value)));
}

A J
Member #3,025
December 2002
avatar

if you sorta know what range of numbers you are going to compute, a lookup table might be quick.

___________________________
The more you talk, the more AJ is right. - ML

kazzmir
Member #1,786
December 2001
avatar

hmm, thats a pretty good idea actually. Im not quite sure of the range, but i think its over 10000. Yes, i think that is quite a good idea, thanks alot!

Evert
Member #794
November 2000
avatar

Quote:

I guess solaris has a good FPU?

You mean Sparc's, right? Solaris is just the OS (and not just for Sparc too, there's an i86 port, I think) ;)
Anyway, yeah, Sparc stations are supposed to have some very nifty hardware for doing computational stuff. They were designed for doing scientific computations, after all.

And yeah, a lookup table may be the fastest - though I've heard that if you acces it a lot but suffer from cache misses, it will actually be slower than using the FPU on modern hardware...

Bruce Perry
Member #270
April 2000

amarillion said:

not faster than the FPU, since the allegro code is something like this:

Are you sure of that? I thought Allegro used a look-up table, scaling as necessary (since sqrt(400)=20 can be derived from sqrt(4)=2, except it uses binary).

Allegro may of course have multiple versions, including a generic one that should work everywhere...

--
Bruce "entheh" Perry [ Web site | DUMB | Set Up Us The Bomb !!! | Balls ]
Programming should be fun. That's why I hate C and C++.
The brxybrytl has you.

Thomas Harte
Member #33
April 2000
avatar

The best information I have seen on producing the fastest possible square root for your particular hardware is at http://www.azillionmonkeys.com/qed/sqroot.html. Even PC programmers with modern C compilers might be interested in tips such as "Joachim Stadel wrote in to point out that the 3DNow! instruction set will allow for multiple parallel square roots to be performed. He is correct; here is his [asm] code"

Go to: