Float is being odd

Relevant pieces of code:

float floattemp=1.00; (declared as a global)

Then, in a function:

if(key[KEY_F]) floattemp+=0.1;
textprintf_ex(buffer,font,10,34,makecol(0,0,0),-1,"floattemp = %f",floattemp);

What this ends up doing, for me, is to add .1 as expected but after a while it ends up adding .000009 instead, or something along those lines. You can see the last digit "not keeping up" with the whole thing so I get 2.899999, then 4.09998 later, then 5.299997, etc.

Yet, if I start at 10.00 it soon goes to 12.200001, later 21.900045, etc.

Any idea what causes this?

Typical floating point inaccuracy? I thought it always did that ....

Floating point numbers don't have unlimited precision. The fractional part is stored as a sum of terms of the form 1/2^{n}. So if the fractional portion can't be built exactly from a finite sum of those, you'll lose precision.

For instance:

0.500 = 1/2

0.750 = 1/2 + 1/4

0.875 = 1/2 + 1/4 + 1/8

0.9375 = 1/2 + 1/4 + 1/8 + 1/16

Single precision float has 24 effective bits for the fractional portion. See this for more information.

That's normal.

To understand why that's happening you have to understand the concept of "Floating Point". The idea is that of the 4 or 8 bytes storing the number, some of the bits are used to describe where the decimal point sits, and the others are used to describe the digits of the value. Thus there is a limited span between the first and last significant digit you can hold, and anything that falls outside that span is lost because there aren't enough value bits to cover it.

I believe standard floats use the format IEEE, which means one byte (8-bits) describes the decimal point, the other three (24-bits) describe the actual value.

A 24-Bit number has a range of 16,777,216, so you can have at most about 6 or 7 significant digits in your float with any accuracy.

To see this in action, try storing the number 12345.12345 into a standard float. You're going to notice that it gets truncated.

However, you can still compare the number you just stored using == against the same constant and get TRUE, as long as the constant is also a float (12345.12345f), because the constant then goes through the same conversion to floating point and turns out to be the same value. Note that in C a plain 12345.12345 is a double, so comparing a float against it promotes the float and the comparison comes out FALSE.

That's probably a lot more info than you needed, but the point is, floating point isn't 100% accurate, so non-integer calculations will often be off by just a tiny amount.

--- Kris Asick (Gemini)

--- http://www.pixelships.com

More generally speaking, you can't convert fraction numbers from one base to another without losing precision. In a decimal system the value 1/10 is exactly 0.1 but 1/9 is 0.111111111... and no matter how many ones you add to it, you won't get it right. But the value 1/9 could be described as 0.1 in a base 9 system and that would be the precise value.

Since the computer calculates in a base 2 system (binary numbers), the decimal value 0.1 ends up as a binary fraction with an unlimited number of digits (0.000110011001100... repeating), which has to be cut off to fit into a float.

Interesting! Thanks all. I didn't know floats worked that way.

As background, I currently have a gravity of 1 (integer) and it's too fast so I was going to convert the velocity of my character and the gravity to float, which should be fine, now that I know what's going on.

This thread is somewhat related (I have had the same problem)

http://www.allegro.cc/forums/thread/590483

Quote:

I believe standard floats use the format IEEE, which means one byte (8-bits) describes the decimal point, the other three (24-bits) describe the actual value.

A 24-Bit number has a range of 16,777,216, so you can have at most about 6 or 7 significant digits in your float with any accuracy.

To see this in action, try storing the number 12345.12345 into a standard float. You're going to notice that it gets truncated.

That's not completely accurate. It has a 1 bit sign, an 8 bit exponent, and a 23 bit significand. However, after normalization a leading 1 is assumed, so you effectively get 24 bits in the significand. This just means that the number 0 has to be specially defined.

For instance, the binary number:

100001.1001 (33.5625) would end up being stored as `0 10000100 00001100100000000000000`. Note that the leading 1 gets dropped in the significand.

You are correct in that you get about 6 accurate digits for the general case, but technically if you are requesting a fraction that is easily represented by sums of 1/2^{n}, then the number of digits is much higher.

The largest number smaller than 1:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        int i = (0x7E << 23) | 0x7FFFFF;
        float f;
        memcpy(&f, &i, sizeof f); /* copy the bits, no pointer cast */
        printf("%.20f\n", f);
        return 0;
    }

Output: 0.99999994039535522

So if that happened to be exactly the number you wanted, you are getting 17 digits of precision.

I'd like to second Timorg's link, Carrus85 wrote a very detailed explanation in that thread to more or less the exact same question.

I usually use something like this to compare floats.

bool roughlyEqual(double a, double b, double tol) { return (fabs(a - b) <= tol); }

Quote:

That's not completely accurate.

I know, but I didn't want to go searching for bit counts. I had just woken up.

...isn't it 1/7/24 though, not 1/8/23 for single precision floats?

--- Kris Asick (Gemini)

--- http://www.pixelships.com

It's 1/8/23, but when you normalize the binary number and drop the leading 1, you get 1/8/24.

HoHo: Ah yes, a pretty good read (that same document was referenced for my numeric programming class).

As for the post about floating point numbers, here it is:

http://www.allegro.cc/forums/thread/590483/657055#target

Quote:

What Every Computer Scientist Should Know About Floating-Point Arithmetic

Thank you. This will help some friends of mine.

Years ago there were jokes about some Pentium processor that failed badly at a simple 1 + 1 calculation.

"How many intel engineers are needed for changing the damn light bulb?"

"0.99999999999999997869958476."

[edit]

*googlegoogle*

Quote:

That's not completely accurate. It has a 1 bit sign, an 8 bit exponent, and a 23 bit significand. However, after normalization a leading 1 is assumed, so you effectively get 24 bits in the significand. This just means that the number 0 has to be specially defined.

For instance, the binary number:

100001.1001 (33.5625) would end up being stored as 0 10000100 00001100100000000000000. Note that the leading 1 gets dropped in the significand.

That's interesting. I didn't know about the dropping of the leading 1. It's a clever way to squeeze out an extra bit of precision - obviously the first significant bit has to be a 1, because the only other choice is 0! It's interesting that that trick is only possible in base 2.

The exponent is just an ordinary *signed* integer with 8 bits, right?

Quote:

The exponent is just an ordinary signed integer with 8 bits, right?

No, it's a biased exponent.

Wikipedia said:

The exponent is biased by 2^{8 − 1} − 1 = 127 in this case (Exponents in the range −126 to +127 are representable. See the above explanation to understand why biasing is done). An exponent of −127 would be biased to the value 0 but this is reserved to encode that the value is a denormalized number or zero. An exponent of 128 would be biased to the value 255 but this is reserved to encode an infinity or not a number (NaN). See the chart above.

If the leading 1 weren't dropped, then an 8-bit exponent would give you 0 => -127 and 255 => +128. However, the extrema are reserved to define 0 (and denormalized numbers) and infinity (and NaN). (Zero cannot otherwise be represented since the leading 1 is assumed.) So the effective range is -126 to 127. Losing two values is a very small price to pay for gaining an extra bit. With a 7-bit exponent, you'd only get -63 to 64.