MSVC 6 floating point maths bug.

Andrei Ellman

I think I have found a bug in MSVC 6's handling of floating-point code.

To demonstrate the problem, create a new workspace in MSVC 6 which is a Win32 console app and use it to compile the following code (as either C or C++) without tweaking any of the project-settings. When it asks you to enter three integers, enter 255 three times.

/* Program to demonstrate some anomalous behaviour of floating point calculations in MSVC */

#include <stdio.h>

// #include <math.h>  /* It makes no difference if <math.h> is included or not */


#define MIN(x,y)     (((x) < (y)) ? (x) : (y))
#define MAX(x,y)     (((x) > (y)) ? (x) : (y))



int main(void)
{
  int ia, ib, ic;
  float fa, fb, fc;
  float fx, fy;
  float max, min, delta;


  printf("Enter three integers (to demonstrate the problem, enter 255 three times)\n");
  scanf("%i %i %i", &ia, &ib, &ic);


  fa = (float)ia / 255.0f;
  fb = (float)ib / 255.0f;
  fc = (float)ic / 255.0f;

  max = MAX(fa, MAX(fb, fc));
  min = MIN(fa, MIN(fb, fc));


  fx = max+min;

  delta = max - min;

  if(delta == 0.0f)
    fy = 0.0f;
  else
    fy = delta/(2.0f-fx);


  printf("\nfx == %f, fy == %f , fc == %f\n\n",fx, fy, fc); /* For some strange reason, the problem doesn't show up if fc is omitted */
  printf("delta == %f\n\n", delta);
//  printf("max, min == %f, %f\n\n", max, min); /* For some strange reason, if this line is un-commented, the problem doesn't show up */


  return 0;
}

I am using MSVC 6 with service-pack 5 applied under Windows 2000. My machine is a 600 MHz AMD Athlon.

I also tried the code under Cygwin's GCC (gcc version 3.2 20020927 (prerelease)) and I've not been able to reproduce the problem. I am using the following command to compile: gcc -g testmsvcfloatbug.c -O3 -ffast-math -fomit-frame-pointer -o testmsvcfloatbug_gcc.exe. I have tried various combinations of this command: replacing -O3 with -O2, using the -fno-strength-reduce flag, using -march=pentium2, and removing the -ffast-math flag. Despite this, the GCC version always outputs the correct results.

Anyway, here is the correct output, which is what I would expect:

Enter three integers (to demonstrate the problem, enter 255 three times)
255
255
255

fx == 2.000000, fy == 0.000000 , fc == 1.000000

delta == 0.000000

Here is what happens under MSVC:

Enter three integers (to demonstrate the problem, enter 255 three times)
255
255
255

fx == 2.000000, fy == 1.#INF00 , fc == 1.000000

delta == 0.000000

The problem in MSVC appears regardless of whether the code is compiled as C or C++. However, in the debug-build, it behaves correctly. Also, if I replace the floats with doubles, it behaves correctly as well.

The code contains a line that compares a floating-point variable 'delta' to 0.0f using '==' without any form of epsilon. Since 255 was entered three times, 'delta' would be the result of subtracting two equal numbers, 255.0f/255.0f - 255.0f/255.0f, which should be 0.0f, right?

If I change the compare from "if(delta == 0.0f)" to "if(delta < 1.0f/1024.0f)", the problem disappears. One theory is that the FPU uses a system of odd-even rounding. That is, one "1.0f/255.0f" is rounded down to fit a float and the other is rounded up. The result of this is a very small difference that appears in 'delta' when the maximum and minimum are subtracted, which makes it past the "==0.0f" test and causes the divide to produce such a large number that it's as good as infinity. However, at the end, it always prints delta out to be "0.000000".

Also, by messing around with the number of floats used (eg. try un-commenting the last printf), the problem disappears as well. This would imply that there's a bug somewhere in MSVC's code for handling floats.

Nevertheless, the fact remains that GCC and MSVC (release-build) produce floating-point code with different behaviour, as demonstrated by the example program in this post.

Is this a known issue in MSVC 6? What other versions of MSVC exhibit this behaviour?

AE.

damage

Hmm, that's a weird bug. You could try looking at the actual fp assembly output. Microsoft would probably want to know about it, if it isn't fixed in VC7 yet.

On the topic of weird bugs, here's a cool GCC bug I've tried up to GCC 3.2.2 (I've yet to compile 3.3). Any fool can crash their own program but it takes a lot of work to crash the compiler.

template <class T> 
class A {
    class B;
};

template <class T>
class A<T> :: B {
    B(T t);
};

template <class T>
void              // error
A<T> :: B :: 
B (T t) {
}

template A;

This may sound obscure but it actually had me scratching my head and trying to figure out what I had done wrong, as the compiler kept crashing instead of giving an error message.

CGamesPlay

Did you try it with %g or %E instead of %f?

Obviously it is some problem with the stack, but I'd like to think it's a problem with the format specifier and the size it's pulling from the stack.

Andrei Ellman

I've done a bit more research into this. I've been tweaking MSVC's optimisation options to narrow down the bug somewhat. In the C/C++ tab of the Project Settings dialog, select "optimisations" from the dropdown list. In the optimisations dropdown list, select customise. Leave all checkboxes unchecked except "global optimisations". This is where the bug lies. If "global optimisations" is unchecked, then the code works as it should. Selecting the CPU type has no effect.

Below are the MSVC assembly source and the machine-code bytes, generated using an "Assembly with Machine Code" listing. The only difference in settings between the optimised and un-optimised versions is that the "global optimisations" checkbox is checked.

In gcc, I used the following command line to generate the source: [code]gcc -S testmsvcfloatbug.c -O3 -ffast-math -fomit-frame-pointer[/code]
I have not examined the code in great detail. At least the MSVC version and the GCC version use the same float-constant for 1/255: 998277249 decimal or 3b808081 hex

Anyway, here is the assembly output.

Un-optimised MSVC version
[code]
TITLE E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
.386P
include listing.inc
if @Version gt 510
.model FLAT
else
_TEXT SEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT ENDS
_DATA SEGMENT DWORD USE32 PUBLIC 'DATA'
_DATA ENDS
CONST SEGMENT DWORD USE32 PUBLIC 'CONST'
CONST ENDS
_BSS SEGMENT DWORD USE32 PUBLIC 'BSS'
_BSS ENDS
_TLS SEGMENT DWORD USE32 PUBLIC 'TLS'
_TLS ENDS
FLAT GROUP _DATA, CONST, _BSS
ASSUME CS: FLAT, DS: FLAT, SS: FLAT
endif
PUBLIC _main
PUBLIC __real@4@4006ff00000000000000
PUBLIC __real@4@00000000000000000000
PUBLIC __real@4@40008000000000000000
EXTRN _printf:NEAR
EXTRN _scanf:NEAR
EXTRN __fltused:NEAR
_DATA SEGMENT
$SG782 DB 'Enter three integers (to demonstrate the problem, enter '
DB '255 three times)', 0aH, 00H
ORG $+2
$SG783 DB '%i %i %i', 00H
ORG $+3
$SG789 DB 0aH, 'fx == %f, fy == %f , fc == %f', 0aH, 0aH, 00H
ORG $+3
$SG790 DB 'delta == %f', 0aH, 0aH, 00H
_DATA ENDS
; COMDAT __real@4@4006ff00000000000000
; File E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
CONST SEGMENT
__real@4@4006ff00000000000000 DD 0437f0000r ; 255
CONST ENDS
; COMDAT __real@4@00000000000000000000
CONST SEGMENT
__real@4@00000000000000000000 DD 000000000r ; 0
CONST ENDS
; COMDAT __real@4@40008000000000000000
CONST SEGMENT
__real@4@40008000000000000000 DD 040000000r ; 2
CONST ENDS
_TEXT SEGMENT
_ia$ = -28
_ib$ = -36
_ic$ = -40
_fa$ = -44
_fb$ = -4
_fc$ = -8
_fx$ = -16
_fy$ = -20
_max$ = -12
_min$ = -32
_delta$ = -24
_main PROC NEAR
; File E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
; Line 14
00000 55 push ebp
00001 8b ec mov ebp, esp
00003 83 ec 44 sub esp, 68 ; 00000044H
; Line 21
00006 68 00 00 00 00 push OFFSET FLAT:$SG782
0000b e8 00 00 00 00 call _printf
00010 83 c4 04 add esp, 4
; Line 22
00013 8d 45 d8 lea eax, DWORD PTR _ic$[ebp]
00016 50 push eax
00017 8d 4d dc lea ecx, DWORD PTR _ib$[ebp]
0001a 51 push ecx
0001b 8d 55 e4 lea edx, DWORD PTR _ia$[ebp]
0001e 52 push edx
0001f 68 00 00 00 00 push OFFSET FLAT:$SG783
00024 e8 00 00 00 00 call _scanf
00029 83 c4 10 add esp, 16 ; 00000010H
; Line 25
0002c db 45 e4 fild DWORD PTR _ia$[ebp]
0002f d8 35 00 00 00
00 fdiv DWORD PTR __real@4@4006ff00000000000000
00035 d9 5d d4 fstp DWORD PTR _fa$[ebp]
; Line 26
00038 db 45 dc fild DWORD PTR _ib$[ebp]
0003b d8 35 00 00 00
00 fdiv DWORD PTR __real@4@4006ff00000000000000
00041 d9 5d fc fstp DWORD PTR _fb$[ebp]
; Line 27
00044 db 45 d8 fild DWORD PTR _ic$[ebp]
00047 d8 35 00 00 00
00 fdiv DWORD PTR __real@4@4006ff00000000000000
0004d d9 5d f8 fstp DWORD PTR _fc$[ebp]
; Line 29
00050 d9 45 fc fld DWORD PTR _fb$[ebp]
00053 d8 5d f8 fcomp DWORD PTR _fc$[ebp]
00056 df e0 fnstsw ax
00058 f6 c4 41 test ah, 65 ; 00000041H
0005b 75 08 jne SHORT $L794
0005d 8b 45 fc mov eax, DWORD PTR _fb$[ebp]
00060 89 45 d0 mov DWORD PTR -48+[ebp], eax
00063 eb 06 jmp SHORT $L795
$L794:
00065 8b 4d f8 mov ecx, DWORD PTR _fc$[ebp]
00068 89 4d d0 mov DWORD PTR -48+[ebp], ecx
$L795:
0006b d9 45 d4 fld DWORD PTR _fa$[ebp]
0006e d8 5d d0 fcomp DWORD PTR -48+[ebp]
00071 df e0 fnstsw ax
00073 f6 c4 41 test ah, 65 ; 00000041H
00076 75 08 jne SHORT $L798
00078 8b 55 d4 mov edx, DWORD PTR _fa$[ebp]
0007b 89 55 cc mov DWORD PTR -52+[ebp], edx
0007e eb 21 jmp SHORT $L799
$L798:
00080 d9 45 fc fld DWORD PTR _fb$[ebp]
00083 d8 5d f8 fcomp DWORD PTR _fc$[ebp]
00086 df e0 fnstsw ax
00088 f6 c4 41 test ah, 65 ; 00000041H
0008b 75 08 jne SHORT $L796
0008d 8b 45 fc mov eax, DWORD PTR _fb$[ebp]
00090 89 45 c8 mov DWORD PTR -56+[ebp], eax
00093 eb 06 jmp SHORT $L797
$L796:
00095 8b 4d f8 mov ecx, DWORD PTR _fc$[ebp]
00098 89 4d c8 mov DWORD PTR -56+[ebp], ecx
$L797:
0009b 8b 55 c8 mov edx, DWORD PTR -56+[ebp]
0009e 89 55 cc mov DWORD PTR -52+[ebp], edx
$L799:
000a1 8b 45 cc mov eax, DWORD PTR -52+[ebp]
000a4 89 45 f4 mov DWORD PTR _max$[ebp], eax
; Line 30
000a7 d9 45 fc fld DWORD PTR _fb$[ebp]
000aa d8 5d f8 fcomp DWORD PTR _fc$[ebp]
000ad df e0 fnstsw ax
000af f6 c4 01 test ah, 1
000b2 74 08 je SHORT $L800
000b4 8b 4d fc mov ecx, DWORD PTR _fb$[ebp]
000b7 89 4d c4 mov DWORD PTR -60+[ebp], ecx
000ba eb 06 jmp SHORT $L801
$L800:
000bc 8b 55 f8 mov edx, DWORD PTR _fc$[ebp]
000bf 89 55 c4 mov DWORD PTR -60+[ebp], edx
$L801:
000c2 d9 45 d4 fld DWORD PTR _fa$[ebp]
000c5 d8 5d c4 fcomp DWORD PTR -60+[ebp]
000c8 df e0 fnstsw ax
000ca f6 c4 01 test ah, 1
000cd 74 08 je SHORT $L804
000cf 8b 45 d4 mov eax, DWORD PTR _fa$[ebp]
000d2 89 45 c0 mov DWORD PTR -64+[ebp], eax
000d5 eb 21 jmp SHORT $L805
$L804:
000d7 d9 45 fc fld DWORD PTR _fb$[ebp]
000da d8 5d f8 fcomp DWORD PTR _fc$[ebp]
000dd df e0 fnstsw ax
000df f6 c4 01 test ah, 1
000e2 74 08 je SHORT $L802
000e4 8b 4d fc mov ecx, DWORD PTR _fb$[ebp]
000e7 89 4d bc mov DWORD PTR -68+[ebp], ecx
000ea eb 06 jmp SHORT $L803
$L802:
000ec 8b 55 f8 mov edx, DWORD PTR _fc$[ebp]
000ef 89 55 bc mov DWORD PTR -68+[ebp], edx
$L803:
000f2 8b 45 bc mov eax, DWORD PTR -68+[ebp]
000f5 89 45 c0 mov DWORD PTR -64+[ebp], eax
$L805:
000f8 8b 4d c0 mov ecx, DWORD PTR -64+[ebp]
000fb 89 4d e0 mov DWORD PTR _min$[ebp], ecx
; Line 33
000fe d9 45 f4 fld DWORD PTR _max$[ebp]
00101 d8 45 e0 fadd DWORD PTR _min$[ebp]
00104 d9 5d f0 fstp DWORD PTR _fx$[ebp]
; Line 35
00107 d9 45 f4 fld DWORD PTR _max$[ebp]
0010a d8 65 e0 fsub DWORD PTR _min$[ebp]
0010d d9 55 e8 fst DWORD PTR _delta$[ebp]
; Line 37
00110 d8 1d 00 00 00
00 fcomp DWORD PTR __real@4@00000000000000000000
00116 df e0 fnstsw ax
00118 f6 c4 40 test ah, 64 ; 00000040H
0011b 74 09 je SHORT $L787
; Line 38
0011d c7 45 ec 00 00
00 00 mov DWORD PTR _fy$[ebp], 0
; Line 39
00124 eb 0f jmp SHORT $L788
$L787:
; Line 40
00126 d9 05 00 00 00
00 fld DWORD PTR __real@4@40008000000000000000
0012c d8 65 f0 fsub DWORD PTR _fx$[ebp]
0012f d8 7d e8 fdivr DWORD PTR _delta$[ebp]
00132 d9 5d ec fstp DWORD PTR _fy$[ebp]
$L788:
; Line 43
00135 d9 45 f8 fld DWORD PTR _fc$[ebp]
00138 83 ec 08 sub esp, 8
0013b dd 1c 24 fstp QWORD PTR [esp]
0013e d9 45 ec fld DWORD PTR _fy$[ebp]
00141 83 ec 08 sub esp, 8
00144 dd 1c 24 fstp QWORD PTR [esp]
00147 d9 45 f0 fld DWORD PTR _fx$[ebp]
0014a 83 ec 08 sub esp, 8
0014d dd 1c 24 fstp QWORD PTR [esp]
00150 68 00 00 00 00 push OFFSET FLAT:$SG789
00155 e8 00 00 00 00 call _printf
0015a 83 c4 1c add esp, 28 ; 0000001cH
; Line 44
0015d d9 45 e8 fld DWORD PTR _delta$[ebp]
00160 83 ec 08 sub esp, 8
00163 dd 1c 24 fstp QWORD PTR [esp]
00166 68 00 00 00 00 push OFFSET FLAT:$SG790
0016b e8 00 00 00 00 call _printf
00170 83 c4 0c add esp, 12 ; 0000000cH
; Line 48
00173 33 c0 xor eax, eax
; Line 49
00175 8b e5 mov esp, ebp
00177 5d pop ebp
00178 c3 ret 0
_main ENDP
_TEXT ENDS
END
[/code]

MSVC version with only the "global optimisations" optimisation (this is where the bug rears its ugly head).
[code]
TITLE E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
.386P
include listing.inc
if @Version gt 510
.model FLAT
else
_TEXT SEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT ENDS
_DATA SEGMENT DWORD USE32 PUBLIC 'DATA'
_DATA ENDS
CONST SEGMENT DWORD USE32 PUBLIC 'CONST'
CONST ENDS
_BSS SEGMENT DWORD USE32 PUBLIC 'BSS'
_BSS ENDS
_TLS SEGMENT DWORD USE32 PUBLIC 'TLS'
_TLS ENDS
FLAT GROUP _DATA, CONST, _BSS
ASSUME CS: FLAT, DS: FLAT, SS: FLAT
endif
PUBLIC _main
PUBLIC __real@4@3ff78080808080808000
PUBLIC __real@4@00000000000000000000
PUBLIC __real@4@40008000000000000000
EXTRN _printf:NEAR
EXTRN _scanf:NEAR
EXTRN __fltused:NEAR
_DATA SEGMENT
$SG782 DB 'Enter three integers (to demonstrate the problem, enter '
DB '255 three times)', 0aH, 00H
ORG $+2
$SG783 DB '%i %i %i', 00H
ORG $+3
$SG789 DB 0aH, 'fx == %f, fy == %f , fc == %f', 0aH, 0aH, 00H
ORG $+3
$SG790 DB 'delta == %f', 0aH, 0aH, 00H
_DATA ENDS
; COMDAT __real@4@3ff78080808080808000
; File E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
CONST SEGMENT
__real@4@3ff78080808080808000 DD 03b808081r ; 0.00392157
CONST ENDS
; COMDAT __real@4@00000000000000000000
CONST SEGMENT
__real@4@00000000000000000000 DD 000000000r ; 0
CONST ENDS
; COMDAT __real@4@40008000000000000000
CONST SEGMENT
__real@4@40008000000000000000 DD 040000000r ; 2
CONST ENDS
_TEXT SEGMENT
_ia$ = -12
_ib$ = -16
_ic$ = -20
_fa$ = -8
_fb$ = -4
_fx$ = -4
_delta$ = -8
_main PROC NEAR
; File E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
; Line 14
00000 55 push ebp
00001 8b ec mov ebp, esp
00003 83 ec 14 sub esp, 20 ; 00000014H
; Line 21
00006 68 00 00 00 00 push OFFSET FLAT:$SG782
0000b e8 00 00 00 00 call _printf
; Line 22
00010 8d 45 ec lea eax, DWORD PTR _ic$[ebp]
00013 8d 4d f0 lea ecx, DWORD PTR _ib$[ebp]
00016 50 push eax
00017 8d 55 f4 lea edx, DWORD PTR _ia$[ebp]
0001a 51 push ecx
0001b 52 push edx
0001c 68 00 00 00 00 push OFFSET FLAT:$SG783
00021 e8 00 00 00 00 call _scanf
; Line 25
00026 db 45 f4 fild DWORD PTR _ia$[ebp]
00029 83 c4 14 add esp, 20 ; 00000014H
0002c d8 0d 00 00 00
00 fmul DWORD PTR __real@4@3ff78080808080808000
00032 d9 5d f8 fstp DWORD PTR _fa$[ebp]
; Line 26
00035 db 45 f0 fild DWORD PTR _ib$[ebp]
00038 d8 0d 00 00 00
00 fmul DWORD PTR __real@4@3ff78080808080808000
0003e d9 5d fc fstp DWORD PTR _fb$[ebp]
; Line 27
00041 db 45 ec fild DWORD PTR _ic$[ebp]
00044 d8 0d 00 00 00
00 fmul DWORD PTR __real@4@3ff78080808080808000
; Line 29
0004a d9 45 fc fld DWORD PTR _fb$[ebp]
0004d d8 d9 fcomp ST(1)
0004f df e0 fnstsw ax
00051 f6 c4 41 test ah, 65 ; 00000041H
00054 75 05 jne SHORT $L794
00056 d9 45 fc fld DWORD PTR _fb$[ebp]
00059 eb 02 jmp SHORT $L795
$L794:
0005b d9 c0 fld ST(0)
$L795:
0005d d9 45 f8 fld DWORD PTR _fa$[ebp]
00060 d8 d9 fcomp ST(1)
00062 df e0 fnstsw ax
00064 f6 c4 41 test ah, 65 ; 00000041H
00067 dd d8 fstp ST(0)
00069 75 05 jne SHORT $L798
0006b d9 45 f8 fld DWORD PTR _fa$[ebp]
0006e eb 13 jmp SHORT $L797
$L798:
00070 d9 45 fc fld DWORD PTR _fb$[ebp]
00073 d8 d9 fcomp ST(1)
00075 df e0 fnstsw ax
00077 f6 c4 41 test ah, 65 ; 00000041H
0007a 75 05 jne SHORT $L796
0007c d9 45 fc fld DWORD PTR _fb$[ebp]
0007f eb 02 jmp SHORT $L797
$L796:
00081 d9 c0 fld ST(0)
$L797:
; Line 30
00083 d9 45 fc fld DWORD PTR _fb$[ebp]
00086 d8 da fcomp ST(2)
00088 df e0 fnstsw ax
0008a f6 c4 01 test ah, 1
0008d 74 05 je SHORT $L800
0008f d9 45 fc fld DWORD PTR _fb$[ebp]
00092 eb 02 jmp SHORT $L801
$L800:
00094 d9 c1 fld ST(1)
$L801:
00096 d9 45 f8 fld DWORD PTR _fa$[ebp]
00099 d8 d9 fcomp ST(1)
0009b df e0 fnstsw ax
0009d f6 c4 01 test ah, 1
000a0 dd d8 fstp ST(0)
000a2 74 05 je SHORT $L804
000a4 d9 45 f8 fld DWORD PTR _fa$[ebp]
000a7 eb 13 jmp SHORT $L803
$L804:
000a9 d9 45 fc fld DWORD PTR _fb$[ebp]
000ac d8 da fcomp ST(2)
000ae df e0 fnstsw ax
000b0 f6 c4 01 test ah, 1
000b3 74 05 je SHORT $L802
000b5 d9 45 fc fld DWORD PTR _fb$[ebp]
000b8 eb 02 jmp SHORT $L803
$L802:
000ba d9 c1 fld ST(1)
$L803:
; Line 33
000bc d9 c0 fld ST(0)
000be d8 c2 fadd ST(0), ST(2)
000c0 d9 5d fc fstp DWORD PTR _fx$[ebp]
; Line 35
000c3 d9 c9 fxch ST(1)
000c5 d8 e1 fsub ST(0), ST(1)
000c7 d9 5d f8 fstp DWORD PTR _delta$[ebp]
000ca dd d8 fstp ST(0)
; Line 37
000cc d9 45 f8 fld DWORD PTR _delta$[ebp]
000cf d8 1d 00 00 00
00 fcomp DWORD PTR __real@4@00000000000000000000
000d5 df e0 fnstsw ax
000d7 f6 c4 40 test ah, 64 ; 00000040H
000da 74 08 je SHORT $L787
; Line 38
000dc d9 05 00 00 00
00 fld DWORD PTR __real@4@00000000000000000000
; Line 39
000e2 eb 0c jmp SHORT $L788
$L787:
; Line 40
000e4 d9 05 00 00 00
00 fld DWORD PTR __real@4@40008000000000000000
000ea d8 65 fc fsub DWORD PTR _fx$[ebp]
000ed d8 7d f8 fdivr DWORD PTR _delta$[ebp]
$L788:
; Line 43
000f0 83 ec 08 sub esp, 8
000f3 d9 c9 fxch ST(1)
000f5 dd 1c 24 fstp QWORD PTR [esp]
000f8 83 ec 08 sub esp, 8
000fb dd 1c 24 fstp QWORD PTR [esp]
000fe d9 45 fc fld DWORD PTR _fx$[ebp]
00101 83 ec 08 sub esp, 8
00104 dd 1c 24 fstp QWORD PTR [esp]
00107 68 00 00 00 00 push OFFSET FLAT:$SG789
0010c e8 00 00 00 00 call _printf
; Line 44
00111 d9 45 f8 fld DWORD PTR _delta$[ebp]
00114 83 c4 14 add esp, 20 ; 00000014H
00117 dd 1c 24 fstp QWORD PTR [esp]
0011a 68 00 00 00 00 push OFFSET FLAT:$SG790
0011f e8 00 00 00 00 call _printf
00124 83 c4 0c add esp, 12 ; 0000000cH
; Line 48
00127 33 c0 xor eax, eax
; Line 49
00129 8b e5 mov esp, ebp
0012b 5d pop ebp
0012c c3 ret 0
_main ENDP
_TEXT ENDS
END
[/code]

Optimised GCC version (-O3 -ffast-math -fomit-frame-pointer)
[code]
.file "testmsvcfloatbug.c"
.def ___main; .scl 2; .type 32; .endef
.text
.align 32
LC0:
.ascii "Enter three integers (to demonstrate the problem, enter 255 three times)\0"
LC1:
.ascii "%i %i %i\0"
.align 32
LC6:
.ascii "\12fx == %f, fy == %f , fc == %f\12\12\0"
LC7:
.ascii "delta == %f\12\12\0"
.align 4
LC2:
.long 998277249
.align 4
LC4:
.long 1073741824
.align 2
.align 16
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
pushl %ebp
movl %esp, %ebp
subl $56, %esp
xorl %eax, %eax
andl $-16, %esp
call __alloca
call ___main
movl $LC0, (%esp)
call _puts
movl $LC1, (%esp)
leal -8(%ebp), %edx
leal -4(%ebp), %ecx
movl %edx, 8(%esp)
leal -12(%ebp), %edx
movl %ecx, 4(%esp)
movl %edx, 12(%esp)
call _scanf
fildl -12(%ebp)
fildl -8(%ebp)
flds LC2
fildl -4(%ebp)
fxch %st(3)
fmul %st(1), %st
fxch %st(2)
fmul %st(1), %st
fxch %st(3)
fmulp %st, %st(1)
fld %st(1)
fcom %st(3)
fnstsw %ax
sahf
jae L2
fstp %st(2)
fld %st(2)
fxch %st(2)
L2:
fxch %st(2)
fcom %st(1)
fnstsw %ax
sahf
jae L3
fstp %st(0)
fld %st(0)
L3:
fxch %st(2)
fcom %st(3)
fnstsw %ax
sahf
fld %st(0)
jbe L9
fstp %st(0)
L4:
fxch %st(3)
fcom %st(1)
fnstsw %ax
sahf
jbe L10
fstp %st(0)
L5:
fld %st(1)
fsub %st(1), %st
fxch %st(2)
faddp %st, %st(1)
fldz
fxch %st(2)
fstps -16(%ebp)
fld %st(1)
flds -16(%ebp)
fcomp %st(3)
fnstsw %ax
fstp %st(2)
sahf
je L8
fstp %st(1)
flds LC4
flds -16(%ebp)
fxch %st(1)
fsub %st(2), %st
fdivrp %st, %st(1)
fxch %st(1)
L8:
fstpl 4(%esp)
fstpl 12(%esp)
fstpl 20(%esp)
movl $LC6, (%esp)
call _printf
flds -16(%ebp)
movl $LC7, (%esp)
fstpl 4(%esp)
call _printf
leave
xorl %eax, %eax
ret
.align 16
L10:
fstp %st(1)
jmp L5
.align 16
L9:
fstp %st(4)
jmp L4
.def _puts; .scl 2; .type 32; .endef
.def _scanf; .scl 2; .type 32; .endef
.def _printf; .scl 2; .type 32; .endef
[/code]

Perhaps someone could try this out on other versions of MSVC. Does the problem still occur in MSVC NET 2002 (MSVC 7) or MSVC NET 2003 (MSVC 7.1)?

[quote CGamesPlay]
Did you try it with %g or %E instead of %f?
[/quote]
I'm only used to using MSVC from the IDE so I'm not sure what these switches do and how to change them from the IDE.

[quote damage]
but it takes a lot of work to crash the compiler.
[/quote]
Not if the compiler is MSVC. Sometimes when the linker is run from the IDE, it crashes. Re-build the project and the linker does not crash anymore.

AE.

Matthew Leverton

Same results in MSVC 7.

gillius

I've noticed that ever since I've gone to MSVC7, I haven't been able to properly code any complicated project with floats in release mode.

In one project I had a vector class and I was spewing particles out in random directions, which worked perfectly in debug mode. For some reason, in release mode the particles would only spew out with their x, y, and z values equal to 1, 0, or -1, rather than lots of interesting values in between, so my particles went from spewing out in all directions in a uniformish distribution to only coming out in exact lines in a few directions. It was as if it was rounding all of the numbers, as if I had cast them to int.

In my second project I wrote a skeletal animation system. I do all of my timing and quaternion/vector interpolations using seconds and using floats. In release mode with some models it misses some keyframes and doesn't find them (I play part of the animations by looking for keyframes within a certain time). With some other things it does weird random stuff. When I turned on "improve floating point consistency" the problems went away. I tried to debug in release mode but it was just way way too hard to do that, but it appears that a lot of key numbers were wrong or just off, sometimes by like 50% or more from value at the same time in the debug app, which agrees with values that I know are correct.

It makes no sense that the option would help. The option forces the compiler to write the variable to memory (truncating it from 80-bit to 32-bit precision) in between every calculation rather than leaving the number in the register. To me that seems to mean that the option would REDUCE accuracy. It didn't seem to lower my framerate though, so I just leave the option turned on, but all of these events make me wonder if I'm coding something wrong.

Andrei Ellman

gillius said:

I've noticed that ever since I've gone to MSVC7, I haven't been able to properly code any complicated project with floats in release mode.

What were you using before MSVC 7?

Have you considered installing MinGW32 (or if you're feeling masochistic, Cygwin) so you can build working release-builds of your projects? That way, you can use all of MSVC's IDE's wonderful debugging-aids, and when it comes to building the release build, you can use GCC under MinGW32. I've noticed GCC 3.2's code is significantly faster than MSVC 6's code. I don't have code-speed comparisons for MSVC 7.

AE.

gillius

MSVC7 is much faster than 6. I used to use DJGPP, then I used MSVC6, and now I use MSVC.NET. I know that MSVC6 optimized a lot better than GCC 2.9x and that MSVC.NET is a much better optimizer than MSVC6. What I particularly like about MSVC.NET is that it can inline functions across object file boundaries, which is good because I can write clearer code when I'm not required to put code in headers for optimal performance.

I do know that GCC 3 has much better optimization. I don't know how GCC 3.2+ compares to MSVC.NET and I don't know if GCC 3.2+ supports cross-objectfile optimizations/inlining.

As for mingw and whatnot, yes I do have it, and I use GCC 3.1 to compile in Linux. But I didn't use Allegro on these last 2 projects, and I had to write them using DirectX8.1 and Win32 applications, so I had no real reason to try to set up mingw32 when MSVC.NET works perfectly well.

Korval

First, I don't think you have a real floating-point error. Or, at least, it isn't something you should ever rely on out of any compiler/hardware. Try this:

Change your code to check to see if ((float)ia == 255.0f). Even if you type in the integer 255 and do the typecast, I don't think it is required for these to be equivalent.

Also, try using 256 instead of 255. As it can be exactly represented as a float, it may work.

Andrei Ellman

Korval said:

Even if you type in the integer 255 and do the typecast, I don't think it is required for these to be equivalent.

The only time casting an int to a float causes loss of data is when the number of significant binary digits (bits) in the int is greater than the number of bits used to represent the mantissa in the float (23, but thanks to the way floats work, it's effectively 24). This means that any int with 24 or fewer significant bits can be safely converted to a float without loss of data. 255 only requires 8 bits to be represented (or 9 if it's a signed value), so it can be converted to a float without any loss of accuracy. However, 1/255 cannot be accurately converted into a float.

Korval said:

Also, try using 256 instead of 255. As it can be exactly represented as a float, it may work.

The whole idea of the program is that it demonstrates a bug in the assembler optimisations of the compiler. As the assembly source I posted earlier shows, both the GCC and the MSVC versions use exactly the same constant to represent 1/255.

As for dividing by 256 instead of 255, the reason I'm using 255 is because I want the value to be divided by 255 - not 256. I'm trying to convert a value from 0-255 to a value from 0-1, so dividing by 256 won't give me the result I want.

So far, I've not been able to reproduce the problem on GCC (although I've nowhere near exhausted its plethora of optimisation-switches), but on MSVC, I can reproduce it on demand, which means there's a bug in the compiler and not in the CPU.

Matthew Leverton said:

Same results in MSVC 7.

Just out of interest, did the machine you tried this on have an Intel or AMD CPU? Was it MSVC 7 (NET 2002) or MSVC 7.1 (NET 2003)?

AE.

Matthew Leverton

AMD XP, and I believe it is 2002 edition of MSVC 7.

Korval

Quote:

I still fail to see how this is a bug.

BTW, what is "delta" in the cases where it produces the "wrong" results? Is it +0.0f, or -0.0f (or something else)?

orz

Andrei Ellman: Here's my guess as to what's happening: You calculate your three identical values, fa, fb and fc. Then you compare them to find the biggest and smallest. However, while you're doing this in optimizing mode, the compiler tries to keep those values in registers, and it can end up writing one or two of them out to memory instead. While they remain in registers, they have long double precision, because that's the precision of the FP registers, but because they're declared as floats, the ones that get written to memory are truncated to single precision. So when the comparisons are done, it finds that the full-precision one is slightly different from the single-precision one, so max and min end up having a tiny difference.

A look at the asm seems to support this theory - notice how in the un-optimized version there are three fstp's for the three divisions, but in the optimized version there are only two - the third value is kept in a register.

Anyway, it's obnoxious behavior, but it's not exactly a bug. Try checking the "improve float consistency" checkbox under project settings -> C/C++ -> optimizations.

Andrei Ellman

I've done yet more research. I tried checking all of MSVC's optimisation checkboxes that I expect it to use when using "Maximize Speed" (why oh why oh why does MSVC have to be awkward and hide which optimisation-checkboxes are checked when choosing one of the pre-defined optimisation-profiles?). Now, I was able to get the problem to appear depending on whether or not the "Improve Float Consistency" checkbox was checked (if it was checked, the code behaved as it should). So it appears Orz's theory as to what's going on is likely to be the cause of the problem. I also changed the printf format specifier for floats from %f to %e and, as a result, was able to see that there was in fact a very small value in delta that was too insignificant to notice if I printf'd floats with %f. When fy was 1.#INF00e+000, delta was always 5.913898e-008, and when the code was behaving as it should, both values were 0.000000e+000. I also tried changing all 255s to 256s and the problem disappeared.

Meanwhile in GCC, I discovered the -ffloat-store flag and wondered if I could reproduce the problem by switching it on and off. I tried the four combinations of the flags -ffloat-store and -ffast-math (while using -O2) and still couldn't reproduce the problem in GCC. Perhaps GCC and MSVC initialise the state-word of the FPU to a slightly different value.

So the moral of this story is: Don't ever use '==' or '!=' with floats, even when it looks obvious that they will be equal. Instead, test to see if a range of values within the limits of an error-value (epsilon) overlaps with the other number's error-range. See this thread for suggestions on how to implement such comparisons. When using GCC, compile with the -Wfloat-equal warning-flag to get warnings when using == and != with floats. This is one of the flags not turned on by default when using -W and -Wall so you must always explicitly pass -Wfloat-equal .

AE.

Thread #269409. Printed from Allegro.cc
