![]() |
|
Fastest sqrt |
kazzmir
Member #1,786
December 2001
![]() |
I was using this code for a long time for square root becuase i thought it was faster than the sqrt function in math.h
but it turns out its alot slower than the normal sqrt function. What is the fastest integer square root function? Is the fixed sqrt function really fast? |
Bruce Perry
Member #270
April 2000
|
These days, on a PC, sqrt() is likely to be the fastest solution. The FPU can do it nice and quickly. Compilers will probably inline it too, in some cases. If you're targeting platforms like the GBA, which have no floating-point hardware, then sqrt() will be slow. Your function, since it works with integers, would probably be faster. Then again, the fixed square root function will also be fast... if you ever find yourself developing for such a platform, the best thing to do is to test it. -- |
kazzmir
Member #1,786
December 2001
![]() |
Actually i just ran a test, which i thought i had done when i first invented it, on my computer and it was about even with sqrt. So its not faster, but its not as slow as i thought. When running it on the Solaris machines at work my function was about 10 times slower than sqrt. I guess solaris has a good FPU? Either way, i guess i should just use sqrt instead. |
Goalie Ca
Member #2,579
July 2002
![]() |
you could always try casting the integer to double and use sqrt. I believe that the libraries use inline assembly (fsqrt) for that. ------------- |
amarillion
Member #940
January 2001
![]() |
Quote: Is the fixed sqrt function really fast? not faster than the FPU, since the allegro code is something like this:
-- |
A J
Member #3,025
December 2002
![]() |
if you sorta know what range of numbers you are going to compute, a lookup table might be quick. ___________________________ |
kazzmir
Member #1,786
December 2001
![]() |
hmm, thats a pretty good idea actually. Im not quite sure of the range, but i think its over 10000. Yes, i think that is quite a good idea, thanks alot! |
Evert
Member #794
November 2000
![]() |
Quote: I guess solaris has a good FPU?
You mean Sparc's, right? Solaris is just the OS (and not just for Sparc too, there's an i86 port, I think) And yeah, a lookup table may be the fastest - though I've heard that if you acces it a lot but suffer from cache misses, it will actually be slower than using the FPU on modern hardware... |
Bruce Perry
Member #270
April 2000
|
amarillion said: not faster than the FPU, since the allegro code is something like this: Are you sure of that? I thought Allegro used a look-up table, scaling as necessary (since sqrt(400)=20 can be derived from sqrt(4)=2, except it uses binary). Allegro may of course have multiple versions, including a generic one that should work everywhere... -- |
Thomas Harte
Member #33
April 2000
![]() |
The best information I have seen on producing the fastest possible square root for your particular hardware is at http://www.azillionmonkeys.com/qed/sqroot.html. Even PC programmers with modern C compilers might be interested in tips such as "Joachim Stadel wrote in to point out that the 3DNow! instruction set will allow for multiple parallel square roots to be performed. He is correct; here is his [asm] code" [My site] [Tetrominoes] |
|