Allegro.cc - Online Community

Allegro.cc Forums » Programming Questions » AGAIN floats

This thread is locked; no one can reply to it.
AGAIN floats
damage
Member #3,438
April 2003

What I am sure of is that on a 32-bit processor (IA32) such as the Pentium and many of its successors, there is only a 32-bit bus between the CPU and memory, so the CPU can read/write a 32-bit float in one memory fetch, while it takes two fetches to read/write a 64-bit double, and three to read an 80-bit number.

That's true; I was wrong. I should have qualified my original statement, I suppose.

But I guess with a lot of FP code the actual operations take longer than the memory accesses... I'm just confused because I have tried a lot of my code with both doubles and floats and not noticed any significant difference.

(Question: is this what a 'long double' is, or does it have even more bits?)

I don't think so; I thought that x86 supports three float types: single precision (32-bit), double precision (64-bit), and double extended precision (80-bit, which is the same as the so-called "long double").

____
Don't have anything private. Don't do anything silly like having a hidden name and address field with get_name and set_address and get_name and set_name functions. - Bjarne Stroustrup, creator of C++

Korval
Member #1,538
September 2001
avatar

Quote:

What I am sure of is that on a 32-bit processor (IA32) such as the Pentium and many of its successors

As an aside, IA32 doesn't exist. Intel has chips that use the IA64 instruction set, but this is not based on a pre-existing IA32 instruction set. If you want to refer to the instruction set of 32-bit Pentiums, the correct terminology would be x86-32.

Thomas Fjellstrom
Member #476
June 2000
avatar

Really? I hear a lot of people call the "x86-32" platform IA32. All it stands for is "Intel Architecture 32"... Now how can that be wrong?

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Bob
Free Market Evangelist
September 2000
avatar

Quote:

As an aside, IA32 doesn't exist.

Make sure you tell Intel about this ;D ;D ;D

Quote:

What I am sure of is that on a 32-bit processor (IA32) such as the Pentium and many of its successors, there is only a 32-bit bus between the CPU and memory, so the CPU can read/write a 32-bit float in one memory fetch, while it takes two fetches to read/write a 64-bit double, and three to read an 80-bit number.

That's a very, very simplistic view of what's going on. On the Pentium Pro and up, the memory bus is 64 bits wide(*). However, the CPU's MMU doesn't talk to memory directly, but goes through the cache first, and cache is read in whole lines of 32, 64 or 128 bytes (Pentium through P4).

What happens is that when you use doubles instead of floats, your memory usage (roughly) doubles. This means the CPU can keep fewer variables in cache, which forces it to spill more of them out to main RAM, which is slower.

(*) multiple interleaved RDRAM channels are combined to form a 64-bit bus, but RDRAM itself is only 8 or 16-bits wide.

--
- Bob
[ -- All my signature links are 404 -- ]

Andrei Ellman
Member #3,434
April 2003

damage said:

(Question: is this what a 'long double' is, or does it have even more bits?)

I don't think so; I thought that x86 supports three float types: single precision (32-bit), double precision (64-bit), and double extended precision (80-bit, which is the same as the so-called "long double").

I don't think the x87 (the part of the x86 that does FPU ops, which was a separate chip on 386 and 486SX based PCs) can natively support 64-bit precision. It only does 32- and 80-bit precision (this is true of the 387, 487 and early Pentiums; I'm not sure about modern Pentiums, Athlons, etc.), whereas C only supports floats and doubles (I'm not sure how well 'long doubles' are supported in standard C, or if they only appear as language extensions). This means there's a possibility that operations which would overflow or underflow in 64 bits work fine in 80 bits. So if code that uses doubles is ported from the x87 to a platform that handles 64-bit maths as easily as (or better than) 80-bit maths, it will produce slightly different results, or in the worst case, underflow or overflow where the x87 version won't. Is there something in the ANSI C standard that says calculations involving doubles always have to be done with 64-bit precision? And if the x87 only natively supports 80 bits, how would the compiler generate code that complies with the standard? And do compilers comply with the standard and sacrifice efficiency as a result, or do they cheat, do the maths in 80-bit mode, and return the result truncated to 64 bits?

Float tip of the day:
If preserving accuracy is really important, it is best to write "c*a + c*b" instead of "c*(a+b)". This is because additions and subtractions are prone to losing accuracy when the exponents of a and b differ (the low-order bits of the smaller value become insignificant when its exponent is aligned to the larger one's, so it's best to postpone additions and subtractions until as late as possible in the calculation). Of course this is slower, as it uses an extra multiply, but as usual it depends on whether you want speed or accuracy.

AE.

--
Don't let the illegitimates turn you into carbon.

Frank Drebin
Member #2,987
December 2002
avatar

One closing question:
0 can be represented exactly by a float!?
(so that if (float_var == 0) works)

gnolam
Member #2,030
March 2002
avatar

Yes, 0 can be represented exactly (in more than one way, even). But if you had read the previous threads, you would have noticed that it isn't so much the fact that some numbers can't be represented exactly that should make you wary of what you're doing; it's that you will have problems reaching that exact value. So the only time I would use "if (float_var == 0)" would be if I had set float_var to 0 myself.

--
Move to the Democratic People's Republic of Vivendi Universal (formerly known as Sweden) - officially democracy- and privacy-free since 2008-06-18!

Evert
Member #794
November 2000
avatar

To be more precise, there is -0.0f and +0.0f, which behave identically under computations.
But as a rule of thumb, you should never use == when dealing with floats. Say x and y tend to the same value a. At some point, x = a-e1 and y = a-e2. In that case x and y will be close, but you can't count on e1 == e2 being true, even though it would be in exact arithmetic.
But that's just what everyone has been saying, worded differently. :)

Korval
Member #1,538
September 2001
avatar

Quote:

Make sure you tell Intel about this

Have I mentioned before that I hate Intel? Can't these people stop making up names for things? The whole point of calling their 64-bit architecture IA64 is so that it is immediately differentiated from x86 chips. Calling an x86 chip an IA32 chip undermines this rather important distinction.

Thomas Fjellstrom
Member #476
June 2000
avatar

The IA32 name has been around for years ::)

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Zaphos
Member #1,468
August 2001

A.E. said:

Float tip of the day:
If preserving accuracy is really important, it is best to write "c*a + c*b" instead of "c*(a+b)".

Will an optimizing compiler change "a*b+a*c" to "a*(b+c)" automatically? It seems like such a switch would be logical to make ???

Andrei Ellman
Member #3,434
April 2003

Zaphos said:

Will an optimizing compiler change "a*b+a*c" to "a*(b+c)" automatically? It seems like such a switch would be logical to make

Good point. I picked up the tip from an e-book about x86 machine code. An assembler doesn't normally optimise, so there was no risk of the expression being rearranged. As for writing the code in C, the compiler has no means of taking a hint that accuracy is more important than speed. In GCC you can switch off individual optimisations; it might be possible to switch off the ones that rearrange equations for speed, but that switch might also prevent other optimisations from being done, so you may have to order your mathematical equations manually in the form that gives the most speed (or the most accuracy, depending on what you want).

Korval said:

Have I mentioned before that I hate Intel? Can't these people stop making up names for things?

They've had a long history of being awkward with their naming conventions.

The Intel 80386 came in two flavours, the SX and the DX, and the same goes for the 80486, except that SX and DX mean different things on each. I'm not quite sure what the distinction is on the 80386 (it might have something to do with a 16-bit memory bus on the SX), but on the 80486, the DX has a built-in maths co-processor (FPU), while the SX means a separate FPU (the 80487) must be installed. The '80' bit was dropped because people just pronounced the last three digits, so they were re-named the i386 and i486. Somehow "i586" wasn't good enough for the Pentium, so the name "Pentium" was used ('pent' meaning five). This meant that the next generation of CPUs would have had to be called the "Hexium" or, even worse, the "Sexium". Red-faced Intel marketing staff instead came up with the name "Pentium Pro" for the i686 line of CPUs.

The problem with the i686 was that although it greatly improved the execution of 32-bit instructions, it was actually slower at executing 16-bit instructions. Intel had made a boo-boo: they had assumed most people would be running mostly 32-bit code by then (1996), but Windows 95 - although a 32-bit OS - still had huge chunks of 16-bit code in it, so it actually ran slower on a Pentium Pro than on a Pentium. A year later, they created the Pentium MMX. As Intel were worried about Windows 95 and its 16-bit code, they based the Pentium MMX on the i586 instead of the i686 (though I'm not really sure about this). Finally, in 1998, the Pentium Pro and MMX branches were merged and the Pentium II came into being - either they had sped up 16-bit instruction execution, or they realised that the soon-to-be-released Windows 98 had a greater proportion of 32-bit code than Windows 95. By now the ix86 naming convention had been abandoned, and they just kept adding numbers to the end of their Pentiums, each new Pentium also getting a codename (e.g. PIII = "Katmai"). Since the Pentium II, Intel have also been releasing cheaper versions of their CPUs with smaller caches, known as Celerons.

Rival chip-maker AMD are also guilty of producing misleading names. IIRC, they re-named their early Athlons to Durons and released a new, faster batch of Athlons.

My god, I'd hate to see what these guys call their kids.

AE.

--
Don't let the illegitimates turn you into carbon.

Evert
Member #794
November 2000
avatar

Quote:

(might have something to do with a 16-bit memory-bus on the SX's)

IIRC, that's exactly the difference between a 386SX and a DX, yes.
Interestingly enough, the 8086 and 8088 (which we should probably call the 86 and 88 if we call the 80386 the 386, but whatever) have the same sort of difference: the 8086 was a 16-bit processor with a 16-bit bus, while the 8088 was a 16-bit processor with an 8-bit bus. The difference between the 80186 and 80188 (which weren't around for long) is similar.

Bob
Free Market Evangelist
September 2000
avatar

Quote:

As for writing the code in C, the compiler has no means of taking a hint that accuracy is more important than speed.

GCC has -ffast-math and -fieee-fp (or something like that). -ffast-math implies -fno-ieee-fp.

Quote:

it was actually slower at executing 16-bit instructions.

Not really. The Pentium Pro had two Achilles heels:
- Mixing operations of different sizes on the same registers would cause partial register stalls of 6 cycles (the stall penalty was reduced on the P2, but I'm not sure to how much).
- Prefixes would cost one additional cycle to process (most prefixes became 'free' in the P2).

There's nothing 16-bit specific about either; it's just that the code people wrote (or that compilers emitted) wasn't adapted to the CPU (or vice versa, depending on your point of view :) )

Quote:

The IA32 name has been around for years

Yeah, around the same time Itanic was announced, back in 1995 or so.

--
- Bob
[ -- All my signature links are 404 -- ]

Thomas Fjellstrom
Member #476
June 2000
avatar

:o I could swear.. Oh well :D Hmm... what to buy: an Itanic, or a quad Opteron with 6GB of memory? *drool*

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Mars
Member #971
February 2001
avatar

BTW I just found the GCC warning option that always warns if you try to use == with floats. :)

--
This posting is a natural product. The slight variations in spelling and grammar enhance its individual character and beauty and in no way are to be considered flaws or defects.

Andrei Ellman
Member #3,434
April 2003

Mars said:

BTW I just found the GCC warning option that always warns if you try to use == with floats.

It's -Wfloat-equal

--
Don't let the illegitimates turn you into carbon.
