Allegro.cc - Online Community

Strange bug in transmission of float values over TCP/IP
bamccaig
Member #7,536
July 2006

I get the same results as axilmar with both programs in both 32-bit Linux and 32-bit Windows (compiled with GCC and MinGW).

Arthur Kalliokoski
Second in Command
February 2005

t.cpp:17: warning: this decimal constant is unsigned only in ISO C90
t.cpp: In function 'int main()':
t.cpp:18: warning: dereferencing type-punned pointer will break strict-aliasing rules

I didn't investigate further

“Throughout history, poverty is the normal condition of man. Advances which permit this norm to be exceeded — here and there, now and then — are the work of an extremely small minority, frequently despised, often condemned, and almost always opposed by all right-thinking people. Whenever this tiny minority is kept from creating, or (as sometimes happens) is driven out of a society, the people then slip back into abject poverty. This is known as "bad luck.”

― Robert A. Heinlein

axilmar
Member #1,204
April 2001

The problem is that the float value that has its bytes swapped is pushed to the floating point stack. When popped from the stack, it is rounded by the hardware, but the value has swapped bytes, and therefore the rounding is wrong.

Arthur Kalliokoski
Second in Command
February 2005

The rounding, if any, would only occur on the least significant bits.


Thomas Fjellstrom
Member #476
June 2000

Arthur Kalliokoski said:

The rounding, if any, would only occur on the least significant bits.

The problem is his endian swapping function is passing the "swapped" value as a float, and it shouldn't. At that point it still has some of the data swapped. Which is bad. Instead it should be passed as an unsigned int, or something.
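
A minimal sketch of that suggestion, assuming a 32-bit IEEE 754 float and the POSIX htonl()/ntohl() functions. The helper names float_to_net and net_to_float are made up for illustration and are not from the original program. The swapped value only ever exists as an integer, so it is never loaded as a float:

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */

/* Hypothetical helper: the float's bit pattern in network byte order.
   Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_to_net(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copies the bits, no FP conversion */
    return htonl(bits);              /* the swapped value stays an integer */
}

/* Hypothetical helper: the inverse of float_to_net. */
float net_to_float(uint32_t net)
{
    uint32_t bits = ntohl(net);      /* back to host order, still an integer */
    float f;
    memcpy(&f, &bits, sizeof f);     /* only now is it treated as a float */
    return f;
}
```

The sender would put the result of float_to_net(f) into the packet, and the receiver would call net_to_float() on the 32-bit value it read.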

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Arthur Kalliokoski
Second in Command
February 2005

Thomas Fjellstrom said:

it should be passed as an unsigned int

Ah, ok. As long as it's not the other way around (nan's)


Thomas Fjellstrom
Member #476
June 2000

Or heck, an array of unsigned char might be best. That way nothing should muck with the data before the swapper.
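
Roughly what that could look like, as a sketch only. It assumes a 32-bit IEEE 754 float; put_float_be and get_float_be are illustrative names. Because the bytes are produced with shifts, the result does not depend on the host's byte order:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write f into out[] most significant byte first. */
void put_float_be(unsigned char out[4], float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);      /* assumes a 32-bit float */
    out[0] = (unsigned char)(bits >> 24);
    out[1] = (unsigned char)(bits >> 16);
    out[2] = (unsigned char)(bits >> 8);
    out[3] = (unsigned char)bits;
}

/* Hypothetical helper: rebuild the float from four big-endian bytes. */
float get_float_be(const unsigned char in[4])
{
    uint32_t bits = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
                  | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```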


bamccaig
Member #7,536
July 2006

Arthur Kalliokoski said:

t.cpp:17: warning: this decimal constant is unsigned only in ISO C90
t.cpp: In function 'int main()':
t.cpp:18: warning: dereferencing type-punned pointer will break strict-aliasing rules

I didn't investigate further

I saw this as well with both programs (-Wall). Interestingly, on my server running Gentoo in a XEN VM, I don't get this warning and I get the intended results from both programs.

It's referring to this line, IIRC: unsigned long l = 3206974079;

Thomas Fjellstrom
Member #476
June 2000

It probably depends on the optimization level, and the version of GCC.


Evert
Member #794
November 2000

bamccaig said:

I get the same results as axilmar with both programs in both 32-bit Linux and 32-bit Windows (compiled with GCC and MinGW).

Interesting:

eglebbk@morgaine:~/tmp>gcc -Wall -m64 -O0 test.c 
eglebbk@morgaine:~/tmp>./a.out 
7F 8A 26 BF
7F 8A 26 BF
eglebbk@morgaine:~/tmp>gcc -Wall -m64 -O2 test.c 
eglebbk@morgaine:~/tmp>./a.out 
7F 8A 26 BF
7F 8A 26 BF
eglebbk@morgaine:~/tmp>gcc -Wall -m32 -O0 test.c 
test.c: In function ‘main’:
test.c:16: warning: this decimal constant is unsigned only in ISO C90
eglebbk@morgaine:~/tmp>./a.out 
7F 8A 26 BF
7F CA 26 BF
eglebbk@morgaine:~/tmp>gcc -Wall -m32 -O2 test.c 
test.c: In function ‘main’:
test.c:16: warning: this decimal constant is unsigned only in ISO C90
eglebbk@morgaine:~/tmp>./a.out 
7F 8A 26 BF
7F CA 26 BF
eglebbk@morgaine:~/tmp>gcc -Wall -m32 -O3 test.c 
test.c: In function ‘main’:
test.c:16: warning: this decimal constant is unsigned only in ISO C90
eglebbk@morgaine:~/tmp>./a.out 
7F 8A 26 BF
7F 8A 26 BF
eglebbk@morgaine:~/tmp>

So 32 bit vs 64 bit makes a difference, and compiler flags make a difference. No real surprise there, I guess.
EDIT: I guess that in the last instance the problem doesn't show up because the compiler optimises away the conversion.
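
One possible reading of the output above, sketched with integer arithmetic only. It assumes the test value is the bit pattern 0xBF268A7F (that is, 3206974079) and that the 32-bit builds load the swapped float through the x87 unit, which turns a signaling NaN into a quiet one. The function names are illustrative:

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit pattern. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Nonzero if the pattern is an IEEE 754 single-precision NaN:
   exponent all ones and a nonzero mantissa. */
int is_nan_bits(uint32_t b)
{
    return (b & 0x7F800000u) == 0x7F800000u && (b & 0x007FFFFFu) != 0;
}

/* Nonzero if the NaN is signaling under the x86 convention
   (top mantissa bit clear). */
int is_snan_bits(uint32_t b)
{
    return is_nan_bits(b) && (b & 0x00400000u) == 0;
}

/* What quieting a NaN does to the pattern: set the top mantissa bit. */
uint32_t quiet_bits(uint32_t b)
{
    return b | 0x00400000u;
}
```

With these, swap32(0xBF268A7F) is 0x7F8A26BF, which is_snan_bits() reports as a signaling NaN. Quieting it and swapping back gives 0xBF26CA7F, whose little-endian bytes are 7F CA 26 BF, the same as the second line printed by the failing builds.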

Thomas Fjellstrom said:

The problem is his endian swapping function is passing the "swapped" value as a float, and it shouldn't. At that point it still has some of the data swapped.

Indeed.
I guess the lesson is to use integer datatypes whenever you're dealing with bit patterns directly in any way.

Thomas Fjellstrom said:

Or heck, an array of unsigned char might be best. That way nothing should muck with the data before the swapper.

I'd still use a union. :P
Possibly about one of the few things a union is really useful for.
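
A sketch of the union variant, assuming a 32-bit float. Reading a member other than the one last written is allowed in C (C99 and later) but is formally undefined in C++, so this is meant for a C file. The union and function names are illustrative:

```c
#include <stdint.h>

/* Hypothetical union overlaying a float with its raw representation. */
union float_bits {
    float         f;
    uint32_t      u;
    unsigned char b[4];
};

/* The float's bit pattern as an integer. */
uint32_t float_bits_of(float f)
{
    union float_bits v;
    v.f = f;
    return v.u;
}

/* The float whose bit pattern is u. */
float float_from_bits(uint32_t u)
{
    union float_bits v;
    v.u = u;
    return v.f;
}

/* The float's bit pattern with its bytes reversed, returned as an
   integer so that the swapped value is never treated as a float. */
uint32_t float_swapped_bits(float f)
{
    union float_bits in, out;
    in.f = f;
    out.b[0] = in.b[3];
    out.b[1] = in.b[2];
    out.b[2] = in.b[1];
    out.b[3] = in.b[0];
    return out.u;
}
```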

GullRaDriel
Member #3,861
September 2003

As stated before, I had the problem with floats, but not with doubles. Results may vary from target to target.

I assume that the best way to send floating values and keep them as they are is to send them converted into a string, and recv converting them back to a float.
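
A small sketch of the string approach, assuming IEEE 754 single precision and a C library whose printf/strtof conversions are correctly rounded. Nine significant digits are enough to get the same float back in that case. The function names are illustrative, and NaN and infinity would need separate handling:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format f with 9 significant digits, which is
   enough to round-trip an IEEE 754 single-precision value.
   Returns the number of characters snprintf wanted to write. */
int float_to_text(char *buf, size_t size, float f)
{
    return snprintf(buf, size, "%.9g", f);
}

/* Hypothetical helper: parse the text back into a float. */
float text_to_float(const char *s)
{
    return strtof(s, NULL);
}
```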

Links that could help:
http://codeidol.com/csharp/csharp-network/Using-The-Csharp-Sockets-Helper-Classes/Moving-Data-across-the-Network/

Last search:

http://www.experts-exchange.com/Programming/Languages/C/Q_20266384.html said:

Some remarks from an old sod who's been there and done that. Never convert a float (or double) to its decimal representation because it's so soft for your hands and it transmits so portably over a socket or whatever.

Most floating point numbers cannot be represented exactly in decimal AND binary radix notation. Rounding errors will kill you somewhere in the near future.

I consider the lack of htonf, htond and their counterparts a severe omission from the htonX suite of macros/functions.

And there is more misery showing up here; passing a struct to another hardware platform doesn't make any sense either; how about internal and trailing padding? How about alignment of the individual members? Consider this:

struct _my_thingy {
    int i;
    float f;
};

Maybe on my architecture the sizeof(struct _my_thingy) == 8, what all the Intelians expect. What about 64 bit integers then? what about 8 byte alignment on some machines?

As a general rule of thumb: don't pass structs around from earth to mars. Instead, unravel them into their individual members and pass those around instead, which brings us back again at the question -- how do I send a float over a cable somewhere to something else?
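
To illustrate the "unravel" advice, here is a sketch that writes each member into a fixed 8-byte wire layout rather than sending the struct's memory. It assumes a 32-bit int member and a 32-bit IEEE 754 float. The struct is renamed my_thingy, and the pack/unpack helpers are made up for this example:

```c
#include <stdint.h>
#include <string.h>

struct my_thingy {
    int32_t i;
    float   f;
};

enum { MY_THINGY_WIRE_SIZE = 8 };  /* 4 bytes for i, 4 bytes for f */

static void put_u32_be(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static uint32_t get_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write the members one by one; padding and alignment of the
   in-memory struct never reach the wire. */
void my_thingy_pack(unsigned char out[MY_THINGY_WIRE_SIZE],
                    const struct my_thingy *t)
{
    uint32_t fbits;
    memcpy(&fbits, &t->f, sizeof fbits);
    put_u32_be(out, (uint32_t)t->i);
    put_u32_be(out + 4, fbits);
}

void my_thingy_unpack(struct my_thingy *t,
                      const unsigned char in[MY_THINGY_WIRE_SIZE])
{
    uint32_t ibits = get_u32_be(in);
    uint32_t fbits = get_u32_be(in + 4);
    memcpy(&t->i, &ibits, sizeof t->i);
    memcpy(&t->f, &fbits, sizeof t->f);
}
```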

Here's a possible portable way of doing this. The function:

double frexp(double x, int* exp);

returns a normalized (double) floating point number, and stores the (binary) exponent in *exp, given any number x.

We're halfway there: given the htonl() macro/function, we can send the exponent to the other world safely. What about that normalized mantissa? The mantissa happens to be a number in the range [1/2 ... 1) if x was nonzero; otherwise this normalized number equals zero as well.

A cheap trick (assuming a float number can be stored in at most four bytes, including the exponent) is this:

long mant = f*0x40000000;

Variable mant contains the mantissa, multiplied by a huge number, just enough to keep all binary digits.

This long int number can be transmitted to the other world using htonl() again.

The other world receives the exponent, uses ntohl() to transform it back to its internal format. Next the mantissa is received, ntohl() is applied again, the result is divided by 0x40000000 and finally the function ldexp:

double ldexp(double mant, int exp);

is applied in order to get the original number back again in the alien format.
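
The recipe above could be sketched like this. It assumes finite values only (NaN and infinity are not handled) and a float with at most 24 mantissa bits, so scaling by 0x40000000 loses nothing. The struct and function names are illustrative. In a real exchange each of the two integers would go through htonl() and ntohl():

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical wire form: two integers, each of which would be sent
   through htonl() and read back with ntohl(). */
struct wire_float {
    int32_t exponent;
    int32_t mantissa;
};

/* Split x into a binary exponent and a scaled integer mantissa.
   Finite values only. */
struct wire_float wire_from_float(float x)
{
    struct wire_float w;
    int e;
    double m = frexp((double)x, &e);  /* |m| in [1/2, 1), or 0 if x == 0 */
    w.exponent = (int32_t)e;
    w.mantissa = (int32_t)(m * 0x40000000);
    return w;
}

/* Rebuild the value: undo the scaling, then apply the exponent. */
float float_from_wire(struct wire_float w)
{
    double m = (double)w.mantissa / 0x40000000;
    return (float)ldexp(m, (int)w.exponent);
}
```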

I know it's quite a job to get things 'portable', but all MS dependent assumptions are simply show stoppers here ...

kind regards,

Jos

EDIT:
Last minute gem: You can also use the XDR routines (for example xdr_float()), which are designed to do what you want.

You may also want to have a look at JSON and BSON.

Also, Beej's Guide has been updated and now has a little "how to send data" section ^^

I'm done for it !

"Code is like shit - it only smells if it is not yours"
Allegro Wiki, full of examples and articles !!

Arthur Kalliokoski
Second in Command
February 2005

If doubles aren't causing a problem, I'd say it's just because there are fewer rounding differences; you just haven't found the values that will mess up yet.


axilmar
Member #1,204
April 2001

GullRaDriel said:

I assume that the best way to send floating values and keep them as they are is to send them converted into a string, and recv converting them back to a float.

No, there is no need for that. Just reverse the float in-place and send it over the network. The real problem is when the reversed float is pushed to the FP stack by the callee, and then popped from the FP stack by the caller. Then it gets messed up.

BAF
Member #2,981
December 2002

Quote:

float f2 = ntohf(ntohf(f1));

Why the hell are you calling ntohf twice? Shouldn't that be ntohf(htonf(f1))?

GullRaDriel
Member #3,861
September 2003

I'm OK with what Arthur said about the double, but I'm not totally convinced with your quote, axilmar.

Nothing's more portable than the converted to ascii trick.

Bwah ! I don't care.

;D

EDIT: You should really go and read the Beej's Guide link I posted before. It's a lot more complete than the previous versions.

Sevalecan
Member #4,686
June 2004

BAF said:

Why the hell are you calling ntohf twice? Shouldn't that be ntohf(htonf(f1))?

Yeah, that'll fuck it up.

TeamTerradactyl: SevalecanDragon: I should shoot you for even CONSIDERING coding like that, but I was ROFLing too hard to stand up. I love it!
My blog about computer nonsense, etc.

Arthur Kalliokoski
Second in Command
February 2005

How would it fuck it up? Both ntohl() and htonl() simply swap the bytes on little-endian machines but leave them intact on big-endian machines, and if you do it twice you have the original again.

#include <stdio.h>
#include <netinet/in.h>

int a = 0x12345678;
int b;

int main(void)
{
    printf("Original number is 0x%X\n",a);
    printf("Converting with ntohl() gives ");
    b = ntohl(a);
    printf("0x%X\n",b);
    printf("Convert back with ntohl() gives ");
    a = ntohl(b);
    printf("0x%X\n",a);

    printf("\nNow doing it the \"right\" way with ntohl() and htonl()\n");
    printf("Original number is 0x%X\n",a);
    printf("Converting with ntohl() gives ");
    b = ntohl(a);
    printf("0x%X\n",b);
    printf("Convert back with htonl() gives ");
    a = htonl(b);
    printf("0x%X\n",a);

    return 0;
}


GullRaDriel
Member #3,861
September 2003

And so, in which way calling it twice IS useful ?

axilmar
Member #1,204
April 2001

BAF said:

Why the hell are you calling ntohf twice? Shouldn't that be ntohf(htonf(f1))?

Both ntohf and htonf do the exact same job: they reverse the bytes of the variable.

GullRaDriel said:

but I'm not totally convinced with your quote, axilmar.

Nothing's more portable than the converted to ascii trick.

IEEE 754 floats have a specific representation, so there is no need to convert it to ASCII.

GullRaDriel said:

And so, in which way calling it twice IS useful ?

It's not useful in an application, I just put it up to demonstrate the problem.

When transmitting float values over the network, the transmitter application does 'htonf(f)' and the receiver does 'ntohf(f)'.

Evert
Member #794
November 2000

So, did you solve the problem by not interpreting the value as a float until the bits had been properly unscrambled?

axilmar
Member #1,204
April 2001

Evert said:

So, did you solve the problem by not interpreting the value as a float until the bits had been properly unscrambled?

I solved the problem by doing in-place reversal of bytes in the packet to be transmitted.
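
The code for that is not shown in the thread, so the following is only a guess at what in-place reversal inside a packet buffer might look like. The function names and the offset parameter are hypothetical. The point is that the reversed bytes live only in the unsigned char buffer and are never read as a float:

```c
#include <stddef.h>
#include <string.h>

/* Reverse n bytes starting at p, in place. */
void reverse_bytes_in_place(unsigned char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n / 2; ++i) {
        unsigned char tmp = p[i];
        p[i] = p[n - 1 - i];
        p[n - 1 - i] = tmp;
    }
}

/* Hypothetical helper: copy f into the packet at the given offset,
   then reverse those bytes inside the packet itself. */
void packet_put_float_reversed(unsigned char *packet, size_t offset, float f)
{
    memcpy(packet + offset, &f, sizeof f);
    reverse_bytes_in_place(packet + offset, sizeof f);
}
```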

bamccaig
Member #7,536
July 2006

Interesting that the ntohf functions written in this thread don't actually account for the host system's endianness. :P

axilmar
Member #1,204
April 2001

bamccaig said:

Interesting that the ntohf functions written in this thread don't actually account for the host system's endianness.

Please elaborate? The endianness-swapping function I posted converts a float value from little endian to big endian, which is useful on 80x86 systems; it's not cross-platform.
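
For comparison, a sketch of a swap that does check the host's byte order at run time, so that it is a no-op on a big-endian machine. The names are illustrative, and this is essentially what htonl() already does for 32-bit integers:

```c
#include <stdint.h>
#include <string.h>

/* Nonzero if the least significant byte is stored first. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Unconditional byte reversal of a 32-bit value. */
uint32_t reverse32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Hypothetical htonl() look-alike: swap only on little-endian hosts. */
uint32_t host_to_net32(uint32_t v)
{
    return host_is_little_endian() ? reverse32(v) : v;
}
```

A float would be converted to its uint32_t bit pattern first (for example with memcpy) and only then passed to host_to_net32().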

GullRaDriel
Member #3,861
September 2003

axilmar said:

IEEE 754 floats have a specific representation, so there is no need to convert it to ASCII.

The mainframes in use at our office do not support IEEE 754. Plus, they are EBCDIC ^^

Oscar Giner
Member #2,207
April 2002

If you only support x86 systems, what's the point in converting the values to big endian? Just send everything in little endian.
