MSVC 6 floating point maths bug. |
Andrei Ellman
Member #3,434
April 2003
|
I think I have found a bug in MSVC 6's handling of floating-point code. To demonstrate the problem, create a new workspace in MSVC 6 which is a Win32 console app and use it to compile the following code (as either C or C++) without tweaking any of the project-settings. When it asks you to enter three integers, enter 255 three times.
I am using MSVC 6 with service pack 5 applied under Windows 2000. My machine is a 600 MHz AMD Athlon. I also tried the code under Cygwin's GCC (gcc version 3.2 20020927 (prerelease)) and I've not been able to reproduce the problem. I am using the following command to compile: gcc -g testmsvcfloatbug.c -O3 -ffast-math -fomit-frame-pointer -o testmsvcfloatbug_gcc.exe. I have tried various combinations of this command: replacing -O3 with -O2, using the -fno-strength-reduce flag, using -march=pentium2 and removing the -ffast-math flag. Despite this, the GCC version always outputs the correct results.
Anyway, here is the correct output, which is what I would expect it to be.
Enter three integers (to demonstrate the problem, enter 255 three times)
255 255 255
fx == 2.000000, fy == 0.000000 , fc == 1.000000
delta == 0.000000
Here is what happens under MSVC.
Enter three integers (to demonstrate the problem, enter 255 three times)
255 255 255
fx == 2.000000, fy == 1.#INF00 , fc == 1.000000
delta == 0.000000
The problem in MSVC appears regardless of whether the code is compiled as C or C++. However, in the debug build, it behaves correctly. Also, if I replace the floats with doubles, it behaves correctly as well.
The code contains a line that compares a floating-point variable 'delta' to 0.0f using '==' without any form of epsilon. Seeing that 255 was entered three times, 'delta' would be the result of subtracting two equal numbers, which is 1.0f/255.0f - 1.0f/255.0f, which should be 0.0f, right? If I change the compare from "if(delta == 0.0f)" to "if(delta < 1.0f/1024.0f)", the problem disappears.
One theory is that the FPU uses a system of odd-even rounding. That is, one "1.0f/255.0f" is rounded down to fit a float and the other is rounded up. The result is a very small difference that appears in 'delta' when the maximum and minimum are subtracted, which makes it past the "==0.0f" test and causes the divide to produce such a large number that it's as good as infinity. However, at the end, it always prints delta out to be "0.000000". Also, by messing around with the number of floats used (e.g. try un-commenting the last printf), the problem disappears as well. This would imply that there's a bug somewhere in MSVC's code for handling floats.
Nevertheless, the fact remains that GCC and MSVC (release build) produce floating-point code with different behaviour, which is demonstrated by the example program in this post. Is this a known issue in MSVC 6? What other versions of MSVC exhibit this behaviour? AE. -- |
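The program listing this post refers to did not survive in this copy of the thread. What follows is only a rough reconstruction of its arithmetic, inferred from the assembly listings and the printed output quoted in the thread; it is not the author's original source. The names compute_values, struct values, MAX_OF and MIN_OF are assumptions introduced here, and the arithmetic is wrapped in a helper function so it can be called directly instead of sitting inside main.

```c
#include <stdio.h>

/* Hypothetical macro names; the assembly suggests macros because the inner
   comparison is evaluated twice. */
#define MAX_OF(a, b) ((a) > (b) ? (a) : (b))
#define MIN_OF(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical container for the values the original program printed. */
struct values {
    float fx;
    float fy;
    float fc;
    float delta;
};

/* Reconstruction of the arithmetic seen in the listings: scale three
   integers from 0..255 to 0..1, then derive fx, delta and fy. */
static struct values compute_values(int ia, int ib, int ic)
{
    struct values v;
    float fa = ia / 255.0f;
    float fb = ib / 255.0f;
    float fc = ic / 255.0f;

    float max = MAX_OF(fa, MAX_OF(fb, fc));
    float min = MIN_OF(fa, MIN_OF(fb, fc));

    v.fc = fc;
    v.fx = max + min;
    v.delta = max - min;

    /* The exact comparison the thread is about. */
    if (v.delta == 0.0f)
        v.fy = 0.0f;
    else
        v.fy = v.delta / (2.0f - v.fx);

    return v;
}

/* Prints the results with the format strings visible in the listings. */
static void print_values(struct values v)
{
    printf("\nfx == %f, fy == %f , fc == %f\n\n", v.fx, v.fy, v.fc);
    printf("delta == %f\n\n", v.delta);
}
```

Judging by the string constants in the listings, the original read the three integers with scanf("%i %i %i", ...) before doing this arithmetic. On a compiler that rounds every intermediate to float, compute_values(255, 255, 255) takes the delta == 0.0f branch, which is the behaviour the post describes as correct.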
damage
Member #3,438
April 2003
|
Hmm, that's a weird bug. You could try looking at the actual FP assembly output. Microsoft would probably want to know about it if it isn't fixed in VC7 yet. On the topic of weird bugs, here's a cool GCC bug that I've tried on versions up to GCC 3.2.2 (I've yet to compile 3.3). Any fool can crash their own program, but it takes a lot of work to crash the compiler.
This may sound obscure but it actually had me scratching my head and trying to figure out what I had done wrong, as the compiler kept crashing instead of giving an error message. ____ |
CGamesPlay
Member #2,559
July 2002
|
Did you try it with %g or %E instead of %f? Obviously it is some problem with the stack, but I'd like to think it's a problem with the format specifier and the size it's pulling from the stack. -- Ryan Patterson - <http://cgamesplay.com/> |
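A small sketch of why the specifier can matter: %f prints six digits after the decimal point, so a tiny non-zero value is displayed as 0.000000, while %e shows it in exponent form. The value 5.9e-8 used in the usage note and the helper names format_fixed and format_exp are illustrative assumptions, not something taken from the program under discussion.

```c
#include <stdio.h>

/* Formats value with %f into buf and returns buf. */
static const char *format_fixed(char *buf, size_t size, double value)
{
    snprintf(buf, size, "%f", value);
    return buf;
}

/* Formats value with %e into buf and returns buf. */
static const char *format_exp(char *buf, size_t size, double value)
{
    snprintf(buf, size, "%e", value);
    return buf;
}
```

Calling format_fixed with 5.9e-8 yields "0.000000", whereas format_exp keeps the significant digits visible. The number of exponent digits printed by %e differs between C runtimes.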
Andrei Ellman
Member #3,434
April 2003
|
I've done a bit more research into this. I've been tweaking MSVC's optimisation options to narrow down the bug somewhat. In the C/C++ tab of the Project Settings dialog, select "optimisations" from the dropdown list. In the optimisations dropdown list, select customise. Leave all checkboxes unchecked except "global optimisations". This is where the bug lies. If "global optimisations" is unchecked, the code works as it should. Selecting the CPU type has no effect.
Below are the MSVC assembly source and the machine-code bytes, generated using an "Assembly with Machine Code" listing. The only change between the optimised and un-optimised versions is that the "global optimisations" checkbox is checked. In gcc, I used the following command line to generate the source:
[code]gcc -S testmsvcfloatbug.c -O3 -ffast-math -fomit-frame-pointer[/code]
I have not examined the code in great detail. At least the MSVC version and the GCC version use the same float constant for 1/255: 998277249 decimal or 3b808081 hex.
Anyway, here is the assembly output.
Un-optimised MSVC version
[code]
	TITLE	E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
	.386P
include listing.inc
if @Version gt 510
.model FLAT
else
_TEXT	SEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT	ENDS
_DATA	SEGMENT DWORD USE32 PUBLIC 'DATA'
_DATA	ENDS
CONST	SEGMENT DWORD USE32 PUBLIC 'CONST'
CONST	ENDS
_BSS	SEGMENT DWORD USE32 PUBLIC 'BSS'
_BSS	ENDS
_TLS	SEGMENT DWORD USE32 PUBLIC 'TLS'
_TLS	ENDS
FLAT	GROUP _DATA, CONST, _BSS
	ASSUME	CS: FLAT, DS: FLAT, SS: FLAT
endif
PUBLIC	_main
PUBLIC	__real@4@4006ff00000000000000
PUBLIC	__real@4@00000000000000000000
PUBLIC	__real@4@40008000000000000000
EXTRN	_printf:NEAR
EXTRN	_scanf:NEAR
EXTRN	__fltused:NEAR
_DATA	SEGMENT
$SG782	DB	'Enter three integers (to demonstrate the problem, enter '
	DB	'255 three times)', 0aH, 00H
	ORG $+2
$SG783	DB	'%i %i %i', 00H
	ORG $+3
$SG789	DB	0aH, 'fx == %f, fy == %f , fc == %f', 0aH, 0aH, 00H
	ORG $+3
$SG790	DB	'delta == %f', 0aH, 0aH, 00H
_DATA	ENDS
;	COMDAT __real@4@4006ff00000000000000
; File E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
CONST	SEGMENT
__real@4@4006ff00000000000000 DD 0437f0000r	; 255
CONST	ENDS
;	COMDAT __real@4@00000000000000000000
CONST	SEGMENT
__real@4@00000000000000000000 DD 000000000r	; 0
CONST	ENDS
;	COMDAT __real@4@40008000000000000000
CONST	SEGMENT
__real@4@40008000000000000000 DD 040000000r	; 2
CONST	ENDS
_TEXT	SEGMENT
_ia$ = -28
_ib$ = -36
_ic$ = -40
_fa$ = -44
_fb$ = -4
_fc$ = -8
_fx$ = -16
_fy$ = -20
_max$ = -12
_min$ = -32
_delta$ = -24
_main	PROC NEAR
; File E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
; Line 14
  00000 55                   push ebp
  00001 8b ec                mov ebp, esp
  00003 83 ec 44             sub esp, 68 ; 00000044H
; Line 21
  00006 68 00 00 00 00       push OFFSET FLAT:$SG782
  0000b e8 00 00 00 00       call _printf
  00010 83 c4 04             add esp, 4
; Line 22
  00013 8d 45 d8             lea eax, DWORD PTR _ic$[ebp]
  00016 50                   push eax
  00017 8d 4d dc             lea ecx, DWORD PTR _ib$[ebp]
  0001a 51                   push ecx
  0001b 8d 55 e4             lea edx, DWORD PTR _ia$[ebp]
  0001e 52                   push edx
  0001f 68 00 00 00 00       push OFFSET FLAT:$SG783
  00024 e8 00 00 00 00       call _scanf
  00029 83 c4 10             add esp, 16 ; 00000010H
; Line 25
  0002c db 45 e4             fild DWORD PTR _ia$[ebp]
  0002f d8 35 00 00 00 00    fdiv DWORD PTR __real@4@4006ff00000000000000
  00035 d9 5d d4             fstp DWORD PTR _fa$[ebp]
; Line 26
  00038 db 45 dc             fild DWORD PTR _ib$[ebp]
  0003b d8 35 00 00 00 00    fdiv DWORD PTR __real@4@4006ff00000000000000
  00041 d9 5d fc             fstp DWORD PTR _fb$[ebp]
; Line 27
  00044 db 45 d8             fild DWORD PTR _ic$[ebp]
  00047 d8 35 00 00 00 00    fdiv DWORD PTR __real@4@4006ff00000000000000
  0004d d9 5d f8             fstp DWORD PTR _fc$[ebp]
; Line 29
  00050 d9 45 fc             fld DWORD PTR _fb$[ebp]
  00053 d8 5d f8             fcomp DWORD PTR _fc$[ebp]
  00056 df e0                fnstsw ax
  00058 f6 c4 41             test ah, 65 ; 00000041H
  0005b 75 08                jne SHORT $L794
  0005d 8b 45 fc             mov eax, DWORD PTR _fb$[ebp]
  00060 89 45 d0             mov DWORD PTR -48+[ebp], eax
  00063 eb 06                jmp SHORT $L795
$L794:
  00065 8b 4d f8             mov ecx, DWORD PTR _fc$[ebp]
  00068 89 4d d0             mov DWORD PTR -48+[ebp], ecx
$L795:
  0006b d9 45 d4             fld DWORD PTR _fa$[ebp]
  0006e d8 5d d0             fcomp DWORD PTR -48+[ebp]
  00071 df e0                fnstsw ax
  00073 f6 c4 41             test ah, 65 ; 00000041H
  00076 75 08                jne SHORT $L798
  00078 8b 55 d4             mov edx, DWORD PTR _fa$[ebp]
  0007b 89 55 cc             mov DWORD PTR -52+[ebp], edx
  0007e eb 21                jmp SHORT $L799
$L798:
  00080 d9 45 fc             fld DWORD PTR _fb$[ebp]
  00083 d8 5d f8             fcomp DWORD PTR _fc$[ebp]
  00086 df e0                fnstsw ax
  00088 f6 c4 41             test ah, 65 ; 00000041H
  0008b 75 08                jne SHORT $L796
  0008d 8b 45 fc             mov eax, DWORD PTR _fb$[ebp]
  00090 89 45 c8             mov DWORD PTR -56+[ebp], eax
  00093 eb 06                jmp SHORT $L797
$L796:
  00095 8b 4d f8             mov ecx, DWORD PTR _fc$[ebp]
  00098 89 4d c8             mov DWORD PTR -56+[ebp], ecx
$L797:
  0009b 8b 55 c8             mov edx, DWORD PTR -56+[ebp]
  0009e 89 55 cc             mov DWORD PTR -52+[ebp], edx
$L799:
  000a1 8b 45 cc             mov eax, DWORD PTR -52+[ebp]
  000a4 89 45 f4             mov DWORD PTR _max$[ebp], eax
; Line 30
  000a7 d9 45 fc             fld DWORD PTR _fb$[ebp]
  000aa d8 5d f8             fcomp DWORD PTR _fc$[ebp]
  000ad df e0                fnstsw ax
  000af f6 c4 01             test ah, 1
  000b2 74 08                je SHORT $L800
  000b4 8b 4d fc             mov ecx, DWORD PTR _fb$[ebp]
  000b7 89 4d c4             mov DWORD PTR -60+[ebp], ecx
  000ba eb 06                jmp SHORT $L801
$L800:
  000bc 8b 55 f8             mov edx, DWORD PTR _fc$[ebp]
  000bf 89 55 c4             mov DWORD PTR -60+[ebp], edx
$L801:
  000c2 d9 45 d4             fld DWORD PTR _fa$[ebp]
  000c5 d8 5d c4             fcomp DWORD PTR -60+[ebp]
  000c8 df e0                fnstsw ax
  000ca f6 c4 01             test ah, 1
  000cd 74 08                je SHORT $L804
  000cf 8b 45 d4             mov eax, DWORD PTR _fa$[ebp]
  000d2 89 45 c0             mov DWORD PTR -64+[ebp], eax
  000d5 eb 21                jmp SHORT $L805
$L804:
  000d7 d9 45 fc             fld DWORD PTR _fb$[ebp]
  000da d8 5d f8             fcomp DWORD PTR _fc$[ebp]
  000dd df e0                fnstsw ax
  000df f6 c4 01             test ah, 1
  000e2 74 08                je SHORT $L802
  000e4 8b 4d fc             mov ecx, DWORD PTR _fb$[ebp]
  000e7 89 4d bc             mov DWORD PTR -68+[ebp], ecx
  000ea eb 06                jmp SHORT $L803
$L802:
  000ec 8b 55 f8             mov edx, DWORD PTR _fc$[ebp]
  000ef 89 55 bc             mov DWORD PTR -68+[ebp], edx
$L803:
  000f2 8b 45 bc             mov eax, DWORD PTR -68+[ebp]
  000f5 89 45 c0             mov DWORD PTR -64+[ebp], eax
$L805:
  000f8 8b 4d c0             mov ecx, DWORD PTR -64+[ebp]
  000fb 89 4d e0             mov DWORD PTR _min$[ebp], ecx
; Line 33
  000fe d9 45 f4             fld DWORD PTR _max$[ebp]
  00101 d8 45 e0             fadd DWORD PTR _min$[ebp]
  00104 d9 5d f0             fstp DWORD PTR _fx$[ebp]
; Line 35
  00107 d9 45 f4             fld DWORD PTR _max$[ebp]
  0010a d8 65 e0             fsub DWORD PTR _min$[ebp]
  0010d d9 55 e8             fst DWORD PTR _delta$[ebp]
; Line 37
  00110 d8 1d 00 00 00 00    fcomp DWORD PTR __real@4@00000000000000000000
  00116 df e0                fnstsw ax
  00118 f6 c4 40             test ah, 64 ; 00000040H
  0011b 74 09                je SHORT $L787
; Line 38
  0011d c7 45 ec 00 00 00 00 mov DWORD PTR _fy$[ebp], 0
; Line 39
  00124 eb 0f                jmp SHORT $L788
$L787:
; Line 40
  00126 d9 05 00 00 00 00    fld DWORD PTR __real@4@40008000000000000000
  0012c d8 65 f0             fsub DWORD PTR _fx$[ebp]
  0012f d8 7d e8             fdivr DWORD PTR _delta$[ebp]
  00132 d9 5d ec             fstp DWORD PTR _fy$[ebp]
$L788:
; Line 43
  00135 d9 45 f8             fld DWORD PTR _fc$[ebp]
  00138 83 ec 08             sub esp, 8
  0013b dd 1c 24             fstp QWORD PTR [esp]
  0013e d9 45 ec             fld DWORD PTR _fy$[ebp]
  00141 83 ec 08             sub esp, 8
  00144 dd 1c 24             fstp QWORD PTR [esp]
  00147 d9 45 f0             fld DWORD PTR _fx$[ebp]
  0014a 83 ec 08             sub esp, 8
  0014d dd 1c 24             fstp QWORD PTR [esp]
  00150 68 00 00 00 00       push OFFSET FLAT:$SG789
  00155 e8 00 00 00 00       call _printf
  0015a 83 c4 1c             add esp, 28 ; 0000001cH
; Line 44
  0015d d9 45 e8             fld DWORD PTR _delta$[ebp]
  00160 83 ec 08             sub esp, 8
  00163 dd 1c 24             fstp QWORD PTR [esp]
  00166 68 00 00 00 00       push OFFSET FLAT:$SG790
  0016b e8 00 00 00 00       call _printf
  00170 83 c4 0c             add esp, 12 ; 0000000cH
; Line 48
  00173 33 c0                xor eax, eax
; Line 49
  00175 8b e5                mov esp, ebp
  00177 5d                   pop ebp
  00178 c3                   ret 0
_main	ENDP
_TEXT	ENDS
END
[/code]
MSVC version with only the "global optimisations" optimisation (this is where the bug rears its ugly head).
[code]
	TITLE	E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
	.386P
include listing.inc
if @Version gt 510
.model FLAT
else
_TEXT	SEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT	ENDS
_DATA	SEGMENT DWORD USE32 PUBLIC 'DATA'
_DATA	ENDS
CONST	SEGMENT DWORD USE32 PUBLIC 'CONST'
CONST	ENDS
_BSS	SEGMENT DWORD USE32 PUBLIC 'BSS'
_BSS	ENDS
_TLS	SEGMENT DWORD USE32 PUBLIC 'TLS'
_TLS	ENDS
FLAT	GROUP _DATA, CONST, _BSS
	ASSUME	CS: FLAT, DS: FLAT, SS: FLAT
endif
PUBLIC	_main
PUBLIC	__real@4@3ff78080808080808000
PUBLIC	__real@4@00000000000000000000
PUBLIC	__real@4@40008000000000000000
EXTRN	_printf:NEAR
EXTRN	_scanf:NEAR
EXTRN	__fltused:NEAR
_DATA	SEGMENT
$SG782	DB	'Enter three integers (to demonstrate the problem, enter '
	DB	'255 three times)', 0aH, 00H
	ORG $+2
$SG783	DB	'%i %i %i', 00H
	ORG $+3
$SG789	DB	0aH, 'fx == %f, fy == %f , fc == %f', 0aH, 0aH, 00H
	ORG $+3
$SG790	DB	'delta == %f', 0aH, 0aH, 00H
_DATA	ENDS
;	COMDAT __real@4@3ff78080808080808000
; File E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
CONST	SEGMENT
__real@4@3ff78080808080808000 DD 03b808081r	; 0.00392157
CONST	ENDS
;	COMDAT __real@4@00000000000000000000
CONST	SEGMENT
__real@4@00000000000000000000 DD 000000000r	; 0
CONST	ENDS
;	COMDAT __real@4@40008000000000000000
CONST	SEGMENT
__real@4@40008000000000000000 DD 040000000r	; 2
CONST	ENDS
_TEXT	SEGMENT
_ia$ = -12
_ib$ = -16
_ic$ = -20
_fa$ = -8
_fb$ = -4
_fx$ = -4
_delta$ = -8
_main	PROC NEAR
; File E:\programi\wackodev\testmsvcfloatbug\testmsvcfloatbug.c
; Line 14
  00000 55                   push ebp
  00001 8b ec                mov ebp, esp
  00003 83 ec 14             sub esp, 20 ; 00000014H
; Line 21
  00006 68 00 00 00 00       push OFFSET FLAT:$SG782
  0000b e8 00 00 00 00       call _printf
; Line 22
  00010 8d 45 ec             lea eax, DWORD PTR _ic$[ebp]
  00013 8d 4d f0             lea ecx, DWORD PTR _ib$[ebp]
  00016 50                   push eax
  00017 8d 55 f4             lea edx, DWORD PTR _ia$[ebp]
  0001a 51                   push ecx
  0001b 52                   push edx
  0001c 68 00 00 00 00       push OFFSET FLAT:$SG783
  00021 e8 00 00 00 00       call _scanf
; Line 25
  00026 db 45 f4             fild DWORD PTR _ia$[ebp]
  00029 83 c4 14             add esp, 20 ; 00000014H
  0002c d8 0d 00 00 00 00    fmul DWORD PTR __real@4@3ff78080808080808000
  00032 d9 5d f8             fstp DWORD PTR _fa$[ebp]
; Line 26
  00035 db 45 f0             fild DWORD PTR _ib$[ebp]
  00038 d8 0d 00 00 00 00    fmul DWORD PTR __real@4@3ff78080808080808000
  0003e d9 5d fc             fstp DWORD PTR _fb$[ebp]
; Line 27
  00041 db 45 ec             fild DWORD PTR _ic$[ebp]
  00044 d8 0d 00 00 00 00    fmul DWORD PTR __real@4@3ff78080808080808000
; Line 29
  0004a d9 45 fc             fld DWORD PTR _fb$[ebp]
  0004d d8 d9                fcomp ST(1)
  0004f df e0                fnstsw ax
  00051 f6 c4 41             test ah, 65 ; 00000041H
  00054 75 05                jne SHORT $L794
  00056 d9 45 fc             fld DWORD PTR _fb$[ebp]
  00059 eb 02                jmp SHORT $L795
$L794:
  0005b d9 c0                fld ST(0)
$L795:
  0005d d9 45 f8             fld DWORD PTR _fa$[ebp]
  00060 d8 d9                fcomp ST(1)
  00062 df e0                fnstsw ax
  00064 f6 c4 41             test ah, 65 ; 00000041H
  00067 dd d8                fstp ST(0)
  00069 75 05                jne SHORT $L798
  0006b d9 45 f8             fld DWORD PTR _fa$[ebp]
  0006e eb 13                jmp SHORT $L797
$L798:
  00070 d9 45 fc             fld DWORD PTR _fb$[ebp]
  00073 d8 d9                fcomp ST(1)
  00075 df e0                fnstsw ax
  00077 f6 c4 41             test ah, 65 ; 00000041H
  0007a 75 05                jne SHORT $L796
  0007c d9 45 fc             fld DWORD PTR _fb$[ebp]
  0007f eb 02                jmp SHORT $L797
$L796:
  00081 d9 c0                fld ST(0)
$L797:
; Line 30
  00083 d9 45 fc             fld DWORD PTR _fb$[ebp]
  00086 d8 da                fcomp ST(2)
  00088 df e0                fnstsw ax
  0008a f6 c4 01             test ah, 1
  0008d 74 05                je SHORT $L800
  0008f d9 45 fc             fld DWORD PTR _fb$[ebp]
  00092 eb 02                jmp SHORT $L801
$L800:
  00094 d9 c1                fld ST(1)
$L801:
  00096 d9 45 f8             fld DWORD PTR _fa$[ebp]
  00099 d8 d9                fcomp ST(1)
  0009b df e0                fnstsw ax
  0009d f6 c4 01             test ah, 1
  000a0 dd d8                fstp ST(0)
  000a2 74 05                je SHORT $L804
  000a4 d9 45 f8             fld DWORD PTR _fa$[ebp]
  000a7 eb 13                jmp SHORT $L803
$L804:
  000a9 d9 45 fc             fld DWORD PTR _fb$[ebp]
  000ac d8 da                fcomp ST(2)
  000ae df e0                fnstsw ax
  000b0 f6 c4 01             test ah, 1
  000b3 74 05                je SHORT $L802
  000b5 d9 45 fc             fld DWORD PTR _fb$[ebp]
  000b8 eb 02                jmp SHORT $L803
$L802:
  000ba d9 c1                fld ST(1)
$L803:
; Line 33
  000bc d9 c0                fld ST(0)
  000be d8 c2                fadd ST(0), ST(2)
  000c0 d9 5d fc             fstp DWORD PTR _fx$[ebp]
; Line 35
  000c3 d9 c9                fxch ST(1)
  000c5 d8 e1                fsub ST(0), ST(1)
  000c7 d9 5d f8             fstp DWORD PTR _delta$[ebp]
  000ca dd d8                fstp ST(0)
; Line 37
  000cc d9 45 f8             fld DWORD PTR _delta$[ebp]
  000cf d8 1d 00 00 00 00    fcomp DWORD PTR __real@4@00000000000000000000
  000d5 df e0                fnstsw ax
  000d7 f6 c4 40             test ah, 64 ; 00000040H
  000da 74 08                je SHORT $L787
; Line 38
  000dc d9 05 00 00 00 00    fld DWORD PTR __real@4@00000000000000000000
; Line 39
  000e2 eb 0c                jmp SHORT $L788
$L787:
; Line 40
  000e4 d9 05 00 00 00 00    fld DWORD PTR __real@4@40008000000000000000
  000ea d8 65 fc             fsub DWORD PTR _fx$[ebp]
  000ed d8 7d f8             fdivr DWORD PTR _delta$[ebp]
$L788:
; Line 43
  000f0 83 ec 08             sub esp, 8
  000f3 d9 c9                fxch ST(1)
  000f5 dd 1c 24             fstp QWORD PTR [esp]
  000f8 83 ec 08             sub esp, 8
  000fb dd 1c 24             fstp QWORD PTR [esp]
  000fe d9 45 fc             fld DWORD PTR _fx$[ebp]
  00101 83 ec 08             sub esp, 8
  00104 dd 1c 24             fstp QWORD PTR [esp]
  00107 68 00 00 00 00       push OFFSET FLAT:$SG789
  0010c e8 00 00 00 00       call _printf
; Line 44
  00111 d9 45 f8             fld DWORD PTR _delta$[ebp]
  00114 83 c4 14             add esp, 20 ; 00000014H
  00117 dd 1c 24             fstp QWORD PTR [esp]
  0011a 68 00 00 00 00       push OFFSET FLAT:$SG790
  0011f e8 00 00 00 00       call _printf
  00124 83 c4 0c             add esp, 12 ; 0000000cH
; Line 48
  00127 33 c0                xor eax, eax
; Line 49
  00129 8b e5                mov esp, ebp
  0012b 5d                   pop ebp
  0012c c3                   ret 0
_main	ENDP
_TEXT	ENDS
END
[/code]
Optimised GCC version (-O3 -ffast-math -fomit-frame-pointer)
[code]
	.file	"testmsvcfloatbug.c"
	.def	___main;	.scl	2;	.type	32;	.endef
	.text
	.align 32
LC0:
	.ascii "Enter three integers (to demonstrate the problem, enter 255 three times)\0"
LC1:
	.ascii "%i %i %i\0"
	.align 32
LC6:
	.ascii "\12fx == %f, fy == %f , fc == %f\12\12\0"
LC7:
	.ascii "delta == %f\12\12\0"
	.align 4
LC2:
	.long	998277249
	.align 4
LC4:
	.long	1073741824
	.align 2
	.align 16
.globl _main
	.def	_main;	.scl	2;	.type	32;	.endef
_main:
	pushl	%ebp
	movl	%esp, %ebp
	subl	$56, %esp
	xorl	%eax, %eax
	andl	$-16, %esp
	call	__alloca
	call	___main
	movl	$LC0, (%esp)
	call	_puts
	movl	$LC1, (%esp)
	leal	-8(%ebp), %edx
	leal	-4(%ebp), %ecx
	movl	%edx, 8(%esp)
	leal	-12(%ebp), %edx
	movl	%ecx, 4(%esp)
	movl	%edx, 12(%esp)
	call	_scanf
	fildl	-12(%ebp)
	fildl	-8(%ebp)
	flds	LC2
	fildl	-4(%ebp)
	fxch	%st(3)
	fmul	%st(1), %st
	fxch	%st(2)
	fmul	%st(1), %st
	fxch	%st(3)
	fmulp	%st, %st(1)
	fld	%st(1)
	fcom	%st(3)
	fnstsw	%ax
	sahf
	jae	L2
	fstp	%st(2)
	fld	%st(2)
	fxch	%st(2)
L2:
	fxch	%st(2)
	fcom	%st(1)
	fnstsw	%ax
	sahf
	jae	L3
	fstp	%st(0)
	fld	%st(0)
L3:
	fxch	%st(2)
	fcom	%st(3)
	fnstsw	%ax
	sahf
	fld	%st(0)
	jbe	L9
	fstp	%st(0)
L4:
	fxch	%st(3)
	fcom	%st(1)
	fnstsw	%ax
	sahf
	jbe	L10
	fstp	%st(0)
L5:
	fld	%st(1)
	fsub	%st(1), %st
	fxch	%st(2)
	faddp	%st, %st(1)
	fldz
	fxch	%st(2)
	fstps	-16(%ebp)
	fld	%st(1)
	flds	-16(%ebp)
	fcomp	%st(3)
	fnstsw	%ax
	fstp	%st(2)
	sahf
	je	L8
	fstp	%st(1)
	flds	LC4
	flds	-16(%ebp)
	fxch	%st(1)
	fsub	%st(2), %st
	fdivrp	%st, %st(1)
	fxch	%st(1)
L8:
	fstpl	4(%esp)
	fstpl	12(%esp)
	fstpl	20(%esp)
	movl	$LC6, (%esp)
	call	_printf
	flds	-16(%ebp)
	movl	$LC7, (%esp)
	fstpl	4(%esp)
	call	_printf
	leave
	xorl	%eax, %eax
	ret
	.align 16
L10:
	fstp	%st(1)
	jmp	L5
	.align 16
L9:
	fstp	%st(4)
	jmp	L4
	.def	_puts;	.scl	2;	.type	32;	.endef
	.def	_scanf;	.scl	2;	.type	32;	.endef
	.def	_printf;	.scl	2;	.type	32;	.endef
[/code]
Perhaps someone could try this out on other versions of MSVC. Does the problem still occur in MSVC NET 2002 (MSVC 7) or MSVC NET 2003 (MSVC 7.1)?
[quote CGamesPlay] Did you try it with %g or %E instead of %f? [/quote]
I'm only used to using MSVC from the IDE, so I'm not sure what these switches do or how to change them from the IDE.
[quote damage] but it takes a lot of work to crash the compiler. [/quote]
Not if the compiler is MSVC. Sometimes when the linker is run from the IDE, it crashes. Re-build the project and the linker does not crash anymore. AE. -- |
Matthew Leverton
Supreme Loser
January 1999
|
Same results in MSVC 7. |
gillius
Member #119
April 2000
|
I've noticed that ever since I've gone to MSVC7, I haven't been able to properly code any complicated project with floats in release mode. In one project I had a vector class and I was spewing particles out in random directions, which worked perfectly in debug mode. For some reason in release mode the particles would only spew out with their x, y, and z values equal to 1, 0, or -1, rather than lots of interesting values in between, so my particles spewing out in all directions turned from a uniformish distribution to only coming out in an exact line in a few directions. It was as if it was rounding all of the numbers as if I had cast them to int. In my second project I wrote a skeletal animation system. I do all of my timing and quaternion/vector interpolations using seconds and using floats. In release mode with some models it misses some keyframes and doesn't find them (I play part of the animations by looking for keyframes within a certain time). With some other things it does weird random stuff. When I turned on "improve floating point consistency" the problems went away. I tried to debug in release mode but it was just way too hard to do that, but it appears that a lot of key numbers were wrong or just off, sometimes by 50% or more from the value at the same time in the debug app, which agrees with values that I know are correct. It makes no sense that the option would help. The option forces the compiler to write the variable to memory (truncating it from 80-bit to 32-bit precision) in between every calculation rather than leaving the number in the register. To me that seems to mean that the option would REDUCE accuracy. It didn't seem to lower my framerate though, so I just leave the option turned on, but all of these events make me wonder if I'm coding something wrong. Gillius |
Andrei Ellman
Member #3,434
April 2003
|
gillius said: I've noticed that ever since I've gone to MSVC7, I haven't been able to properly code any complicated project with floats in release mode. What were you using before MSVC 7? Have you considered installing MinGW32 (or if you're feeling masochistic, Cygwin) so you can build working release builds of your projects? That way, you can use all of the MSVC IDE's wonderful debugging aids, and when it comes to building the release build, you can use GCC under MinGW32. I've noticed GCC 3.2's code is significantly faster than MSVC 6's code. I don't have code-speed comparisons for MSVC 7. AE. -- |
gillius
Member #119
April 2000
|
MSVC7 is much faster than 6. I used to use DJGPP, then I used MSVC6, and now I use MSVC.NET. I know that MSVC6 optimized a lot better than GCC 2.9x and that MSVC.NET is a much better optimizer than MSVC6. What I particularly like about MSVC.NET is that it can inline functions across object file boundaries, which is good because I can write clearer code when I'm not required to put code in headers for optimal performance. I do know that GCC 3 has much better optimization. I don't know how GCC 3.2+ compares to MSVC.NET and I don't know if GCC 3.2+ supports cross-object-file optimizations/inlining. As for mingw and whatnot, yes I do have it, and I use GCC 3.1 to compile in Linux. But I didn't use Allegro on these last 2 projects, and I had to write them using DirectX8.1 and Win32 applications, so I had no real reason to try to set up mingw32 when MSVC.NET works perfectly well. Gillius |
Korval
Member #1,538
September 2001
|
First, I don't think you have a real floating-point error. Or, at least, it isn't something you should ever rely on out of any compiler/hardware. Try this: Change your code to check to see if ((float)ia == 255.0f). Even if you type in the integer 255 and do the typecast, I don't think it is required for these to be equivalent. Also, try using 256 instead of 255. As it can be exactly represented as a float, it may work. |
Andrei Ellman
Member #3,434
April 2003
|
Korval said: Even if you type in the integer 255 and do the typecast, I don't think it is required for these to be equivalent. The only time casting an int to a float causes loss of data is when the number of significant binary digits (bits) in the int is greater than the number of bits used to represent the mantissa in the float (23, but thanks to the way floats work, it's effectively 24). This means that any int with 24 or fewer significant bits can be safely converted to a float without loss of data. 255 only requires 8 bits to be represented (or 9 if it's a signed value), so it can be converted to a float without any loss of accuracy. However, 1/255 cannot be accurately converted into a float. Korval said: Also, try using 256 instead of 255. As it can be exactly represented as a float, it may work. The whole idea of the program is that it demonstrates a bug in the assembler optimisations of the compiler. Examining the source code that I posted earlier shows that both the GCC and the MSVC versions use exactly the same constant to represent 1/255. As for dividing by 256 instead of 255, the reason I'm using 255 is because I want the value to be divided by 255, not 256. I'm trying to convert a value from 0-255 to a value from 0-1, so dividing by 256 won't give me the result I want. So far, I've not been able to reproduce the problem on GCC (although I've nowhere near exhausted its plethora of optimisation switches), but on MSVC, I can reproduce it on demand, which means there's a bug in the compiler and not in the CPU. Matthew Leverton said: Same results in MSVC 7. Just out of interest, did the machine you tried this on have an Intel or AMD CPU? Was it MSVC 7 (NET 2002) or MSVC 7.1 (NET 2003)? AE. -- |
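The 24-bit claim above can be checked with a short sketch. It assumes float is an IEEE-754 single-precision type with a 24-bit significand; the helper name survives_float_round_trip is made up for this illustration. The volatile store forces the value to be rounded to a real float instead of lingering in a wider register.

```c
#include <stdint.h>

/* Returns 1 if converting n to float and back yields n again, 0 otherwise.
   Assumes float is IEEE-754 single precision (24-bit significand). */
static int survives_float_round_trip(int32_t n)
{
    volatile float f = (float)n;    /* force rounding to 32-bit float */
    return (int64_t)f == (int64_t)n;
}
```

With that assumption, 255 and 16777216 (2^24) survive the round trip, while 16777217 (2^24 + 1) needs 25 significant bits and does not.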
Matthew Leverton
Supreme Loser
January 1999
|
AMD XP, and I believe it is 2002 edition of MSVC 7. |
Korval
Member #1,538
September 2001
|
Quote: So far, I've not been able to reproduce the problem on GCC (although I've nowhere near exhausted its plethora of optimisation switches), but on MSVC, I can reproduce it on demand, which means there's a bug in the compiler and not in the CPU. I still fail to see how this is a bug. BTW, what is "delta" in the cases where it produces the "wrong" results? Is it +0.0f, or -0.0f (or something else)? |
orz
Member #565
August 2000
|
Andrei Ellman: Here's my guess as to what's happening: You calculate your three identical values, fa, fb and fc. Then you compare them to find the biggest and smallest. However, while you're doing this, in optimizing mode, the compiler is trying to keep those values in registers, and it can end up writing one or two out to memory instead. While they remain in registers, they have long double precision, because that's the precision of the FP registers, but because they're declared as floats, the ones that get written to memory are truncated to single precision. So when the comparisons are done, it finds that the full-precision one is slightly different from the single-precision one, so max and min end up having a tiny difference. A look at the asm seems to support this theory: notice how in the un-optimized version there are three fstp's for the three divisions, but in the optimized version there are only two, and the third value is kept in a register. Anyway, it's obnoxious behavior, but it's not exactly a bug. Try checking the "improve float consistency" checkbox under project settings -> C/C++ -> optimizations. |
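This theory can be illustrated on any compiler with a small sketch in which a double stands in for the wider x87 register. The constant 0x3b808081 is the float encoding of 1/255 quoted from the listings earlier in the thread; the helper names are invented for this example, float is assumed to be IEEE-754 single precision, and this is not the code MSVC generated.

```c
#include <stdint.h>
#include <string.h>

/* Builds a float from its IEEE-754 bit pattern. */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* 255 * (1/255 rounded to float), kept in a wider format. The double here
   plays the role of the value that stayed in an FPU register. */
static double product_in_register(void)
{
    return 255.0 * (double)float_from_bits(0x3b808081u);
}

/* The same product after being written out to a 32-bit float in memory. */
static float product_in_memory(void)
{
    volatile float stored = (float)product_in_register();
    return stored;
}

/* Difference between the register copy and the stored copy. */
static double register_minus_memory(void)
{
    return product_in_register() - (double)product_in_memory();
}
```

Under these assumptions the stored copy rounds to exactly 1.0f while the wide copy stays slightly above 1, so the difference is a small positive number (127/2^31, roughly 5.9e-8) rather than zero. That is the kind of mismatch between max and min the post describes.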
Andrei Ellman
Member #3,434
April 2003
|
I've done yet more research. I tried checking all of MSVC's optimisation checkboxes that I expect it to use when using "Maximize Speed" (why oh why oh why does MSVC have to be awkward and hide which optimisation checkboxes are checked when choosing one of the pre-defined optimisation profiles?). Now, I was able to get the problem to appear depending on whether or not the "Improve Float Consistency" checkbox was checked (if it was checked, the code behaved as it should). So it appears orz's theory of what's going on is likely to be what's causing the problem. I also changed the printf format specifier for floats from %f to %e and as a result, was able to see that there was in fact a very small value in delta that was too insignificant to notice if I printf'd floats with %f. When fy was 1.#INF00e+000, delta was always 5.913898e-008, and when the code was behaving as it should, both values were 0.000000e+000. I also tried changing all 255s to 256s and the problem disappeared. Meanwhile in GCC, I discovered the -ffloat-store flag and wondered if I could reproduce the problem by switching it on and off. I tried the four combinations of the flags -ffloat-store and -ffast-math (while using -O2) and still couldn't reproduce the problem in GCC. Perhaps GCC and MSVC initialise the control word of the FPU to a slightly different value. So the moral of this story is: Don't ever use '==' or '!=' with floats, even when it looks obvious that they will be equal. Instead, test to see if a range of values within the limits of an error value (epsilon) overlaps with the other number's error range. See this thread for suggestions on how to implement such comparisons. When using GCC, compile with the -Wfloat-equal warning flag to get warnings when using == and != with floats. This is one of the flags not turned on by default when using -W and -Wall, so you must always explicitly pass -Wfloat-equal. AE. -- |
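One minimal way to write such a comparison is sketched below. The function name nearly_equal is hypothetical, and the tolerance is an absolute one that the caller must choose to suit the scale of the data; the 1.0f/1024.0f in the usage note is the threshold from the workaround in the first post, not a recommended general-purpose value.

```c
/* Returns 1 if a and b differ by no more than epsilon, 0 otherwise.
   Absolute tolerance only; epsilon must be picked for the data's scale. */
static int nearly_equal(float a, float b, float epsilon)
{
    float diff = a - b;
    if (diff < 0.0f)
        diff = -diff;       /* absolute value without needing math.h */
    return diff <= epsilon;
}
```

With this helper, a test such as nearly_equal(delta, 0.0f, 1.0f / 1024.0f) would treat the stray 5.913898e-008 as zero, whereas delta == 0.0f does not.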
|