Strange bug in transmission of float values over TCP/IP

bamccaig

Member #7,536

July 2006

I get the same results as axilmar with both programs in both 32-bit Linux and 32-bit Windows (compiled with GCC and MinGW).

--
I mean the best with what I say. It doesn't always sound that way.

Arthur Kalliokoski

Second in Command

February 2005

t.cpp:17: warning: this decimal constant is unsigned only in ISO C90
t.cpp: In function 'int main()':
t.cpp:18: warning: dereferencing type-punned pointer will break strict-aliasing rules

I didn't investigate further

They all watch too much MSNBC... they get ideas.

axilmar

Member #1,204

April 2001

The problem is that the float value that has its bytes swapped is pushed to the floating point stack. When popped from the stack, it is rounded by the hardware, but the value has swapped bytes, and therefore the rounding is wrong.

ALGUI: c++11 A5 GUI library.

Arthur Kalliokoski

Second in Command

February 2005

The rounding, if any, would only occur on the least significant bits.

They all watch too much MSNBC... they get ideas.

Thomas Fjellstrom

Member #476

June 2000

Arthur Kalliokoski said:

The rounding, if any, would only occur on the least significant bits.

The problem is his endian swapping function is passing the "swapped" value as a float, and it shouldn't. At that point it still has some of the data swapped. Which is bad. Instead it should be passed as an unsigned int, or something.
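
A minimal sketch of that suggestion, assuming a 32-bit IEEE 754 float and a POSIX system that provides htonl() and ntohl(). The names float_to_net and net_to_float are made up for illustration and are not from the code posted in this thread. The swapped pattern is only ever held in a uint32_t, so it should never be loaded as a float:

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdint.h>
#include <string.h>     /* memcpy */
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */

static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 32-bit float");

/* Hypothetical helper: return the float's bits in network byte order.
   The return type is an integer, so the swapped pattern is never
   handed back to the caller as a floating-point value. */
static uint32_t float_to_net(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copy the bit pattern, no conversion */
    return htonl(bits);
}

/* Hypothetical helper: undo float_to_net(). The bits are put back in
   host order first and only then reinterpreted as a float. */
static float net_to_float(uint32_t net)
{
    uint32_t bits = ntohl(net);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The sender would put the result of float_to_net(f) into the packet and the receiver would call net_to_float() on the 32-bit field it reads back.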

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Arthur Kalliokoski

Second in Command

February 2005

Thomas Fjellstrom said:

it should be passed as an unsigned int

Ah, ok. As long as it's not the other way around (nan's)

They all watch too much MSNBC... they get ideas.

Thomas Fjellstrom

Member #476

June 2000

Or heck, an array of unsigned char might be best. That way nothing should muck with the data before the swapper.

bamccaig

Member #7,536

July 2006

Arthur Kalliokoski said:

t.cpp:17: warning: this decimal constant is unsigned only in ISO C90
t.cpp: In function 'int main()':
t.cpp:18: warning: dereferencing type-punned pointer will break strict-aliasing rules

I didn't investigate further

I saw this as well with both programs (-Wall). Interestingly, on my server running Gentoo in a XEN VM, I don't get this warning and I get the intended results from both programs.

It's referring to this line, IIRC: unsigned long l = 3206974079;

--
I mean the best with what I say. It doesn't always sound that way.

Thomas Fjellstrom

Member #476

June 2000

It probably depends on the optimization level, and the version of GCC.

Evert

Member #794

November 2000

bamccaig said:

I get the same results as axilmar with both programs in both 32-bit Linux and 32-bit Windows (compiled with GCC and MinGW).

Interesting:

eglebbk@morgaine:~/tmp>gcc -Wall -m64 -O0 test.c 
eglebbk@morgaine:~/tmp>./a.out 
7F 8A 26 BF
7F 8A 26 BF
eglebbk@morgaine:~/tmp>gcc -Wall -m64 -O2 test.c 
eglebbk@morgaine:~/tmp>./a.out 
7F 8A 26 BF
7F 8A 26 BF
eglebbk@morgaine:~/tmp>gcc -Wall -m32 -O0 test.c 
test.c: In function ‘main’:
test.c:16: warning: this decimal constant is unsigned only in ISO C90
eglebbk@morgaine:~/tmp>./a.out 
7F 8A 26 BF
7F CA 26 BF
eglebbk@morgaine:~/tmp>gcc -Wall -m32 -O2 test.c 
test.c: In function ‘main’:
test.c:16: warning: this decimal constant is unsigned only in ISO C90
eglebbk@morgaine:~/tmp>./a.out 
7F 8A 26 BF
7F CA 26 BF
eglebbk@morgaine:~/tmp>gcc -Wall -m32 -O3 test.c 
test.c: In function ‘main’:
test.c:16: warning: this decimal constant is unsigned only in ISO C90
eglebbk@morgaine:~/tmp>./a.out 
7F 8A 26 BF
7F 8A 26 BF
eglebbk@morgaine:~/tmp>

So 32 bit vs 64 bit makes a difference, and compiler flags make a difference. No real surprise there, I guess.
EDIT: I guess that in the last instance the problem doesn't show up because the compiler optimises away the conversion.
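
One reading that is consistent with the bytes above, though nobody in the thread confirms it: the byte-swapped pattern happens to be a signaling NaN, and an x87 load/store is expected to turn a signaling NaN into a quiet one by setting the top fraction bit. The sketch below only does integer arithmetic on the bit patterns (the helper names are made up for illustration), so it behaves the same with any compiler flags:

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of a 32-bit pattern. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* IEEE 754 single precision: exponent all ones plus a nonzero fraction
   is a NaN; on x86 a clear top fraction bit marks it as signaling. */
static int is_signaling_nan(uint32_t bits)
{
    return (bits & 0x7F800000u) == 0x7F800000u &&
           (bits & 0x007FFFFFu) != 0 &&
           (bits & 0x00400000u) == 0;
}

/* Model of the assumed hardware behaviour: quieting a signaling NaN
   sets the top fraction bit and leaves the other bits alone. */
static uint32_t quieted(uint32_t bits)
{
    assert(is_signaling_nan(bits));
    return bits | 0x00400000u;
}
```

With the constant from the test program, 3206974079 is 0xBF268A7F, swap32() gives 0x7F8A26BF, is_signaling_nan() is true for it, quieted() gives 0x7FCA26BF, and swapping that back gives 0xBF26CA7F, which is stored in memory on a little-endian machine as 7F CA 26 BF, the same bytes as the bad output above.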

Thomas Fjellstrom said:

The problem is his endian swapping function is passing the "swapped" value as a float, and it shouldn't. At that point it still has some of the data swapped.

Indeed.
I guess the lesson is to use integer datatypes whenever you're dealing with bit patterns directly in any way.

Thomas Fjellstrom said:

Or heck, an array of unsigned char might be best. That way nothing should muck with the data before the swapper.

I'd still use a union.
Possibly about one of the few things a union is really useful for.
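
A rough sketch of the union approach, assuming a 32-bit float. The union and function names are invented for illustration. Note that reading a union member other than the one last written is fine in C, but it is formally undefined in C++ (and the test file in this thread was t.cpp), where memcpy is the safe equivalent:

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdint.h>

static_assert(sizeof(float) == 4, "assumes a 32-bit float");

/* The same four bytes seen as a float, as an integer, or as raw bytes. */
union float_bits {
    float    f;
    uint32_t u;
    uint8_t  b[4];
};

/* Reverse the float's bytes and hand the result back as an integer,
   so the reversed pattern is never treated as a float. */
static uint32_t reversed_bits_of(float f)
{
    union float_bits in, out;
    in.f = f;
    out.b[0] = in.b[3];
    out.b[1] = in.b[2];
    out.b[2] = in.b[1];
    out.b[3] = in.b[0];
    return out.u;
}

/* Reverse the bytes again and only then read them as a float. */
static float float_from_reversed(uint32_t r)
{
    union float_bits in, out;
    in.u = r;
    out.b[0] = in.b[3];
    out.b[1] = in.b[2];
    out.b[2] = in.b[1];
    out.b[3] = in.b[0];
    return out.f;
}
```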

GullRaDriel

Member #3,861

September 2003

As stated before, I had the problem with floats, but not with doubles. Results may vary from target to target.

I assume that the best way to send floating values and keep them as they are is to send them converted into a string, and recv converting them back to a float.

Links that could help:
http://codeidol.com/csharp/csharp-network/Using-The-Csharp-Sockets-Helper-Classes/Moving-Data-across-the-Network/

Last search:

http://www.experts-exchange.com/Programming/Languages/C/Q_20266384.html said:

Some remarks from an old sod who's been there and done that. Never convert a float (or double) to its decimal representation because it's so soft for your hands and it transmits so portably over a socket or whatever.

Most floating point numbers cannot be represented exactly in decimal AND binary radix notation. Rounding errors
will kill you somewhere in the near future.

I consider the lack of htonf, htond and their counterparts
a severe omission from the htonX suite of macros/functions.

And there is more misery showing up here; passing a struct
to another hardware platform doesn't make any sense either; how about internal and trailing padding? How about alignment of the individual members? Consider this:

struct _my_thingy {
int i;
float f;
};

Maybe on my architecture the sizeof(struct _my_thingy) == 8, what all the Intelians expect. What about 64 bit integers then? what about 8 byte alignment on some machines?

As a general rule of thumb: don't pass structs around from earth to mars. Instead, unravel them into their individual members and pass those around instead, which brings us back again at the question -- how do I send a float over a cable somewhere to something else?

Here's a possible portable way of doing this. The function:

double frexp(double x, int* exp);

returns a normalized (double) floating point number, and stores the (binary) exponent in *exp, given any number x.

We're halfway there, given the htonl() macro/function, we can send the exponent to the other world safely. What about that normalized mantissa? The mantissa happens to be a number in the range [1/2 ... 1) if x was non zero, otherwise this normalized number equals zero also.

A cheap trick (assuming a float number can be stored in
at most four bytes, including the exponent) is this:

long mant = f*0x40000000;

Variable mant contains the mantissa, multiplied by a
huge number, just enough to keep all binary digits.

This long int number can be transmitted to the other
world using htonl() again.

The other world receives the exponent, uses ntohl() to
transform it back to its internal format. Next the mantissa is received, ntohl() is applied again, the result is divided by 0x40000000 and finally the function ldexp:

double ldexp(double mant, int exp);

is applied in order to get the original number back again in the alien format.

I know it's quite a job to get things 'portable', but all MS dependent assumptions are simply show stoppers here ...

kind regards,

Jos
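
The frexp()/ldexp() scheme quoted above could be sketched roughly as follows. This is only an illustration under some assumptions: the 8-byte layout (mantissa first, then exponent) and the names encode_float and decode_float are made up, htonl()/ntohl() come from POSIX, and infinities and NaNs are not handled.

```c
#include <assert.h>
#include <math.h>       /* frexp, ldexp */
#include <stdint.h>
#include <string.h>     /* memcpy */
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */

/* Hypothetical wire format: 4 bytes of scaled mantissa followed by
   4 bytes of binary exponent, both in network byte order. */
static void encode_float(float x, unsigned char out[8])
{
    int e;
    double m = frexp(x, &e);   /* x == m * 2^e, with 0.5 <= |m| < 1 or m == 0 */
    /* Scaling by 2^30 is exact and keeps all 24 significant bits of a float. */
    uint32_t mant = htonl((uint32_t)(int32_t)(m * 0x40000000));
    uint32_t expo = htonl((uint32_t)(int32_t)e);
    memcpy(out, &mant, 4);
    memcpy(out + 4, &expo, 4);
}

static float decode_float(const unsigned char in[8])
{
    uint32_t mant, expo;
    memcpy(&mant, in, 4);
    memcpy(&expo, in + 4, 4);
    /* Converting back to a signed type assumes two's complement. */
    int32_t m = (int32_t)ntohl(mant);
    int32_t e = (int32_t)ntohl(expo);
    return (float)ldexp((double)m / 0x40000000, e);
}
```

The two fields are copied into a byte buffer one by one rather than sent as a struct, in line with the advice above about padding and alignment. Some toolchains need -lm at link time for frexp() and ldexp().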

EDIT:
Last minute gem: You can also use the xdr() function which is given to do what you want.

You may also want to have a look at JSON and BSON.

Also, TheBeejGuide has been updated and now has a little "how to send data" section ^^

I'm done for it !

"Code is like shit - it only smells if it is not yours"
Allegro Wiki, full of examples and articles !!

Arthur Kalliokoski

Second in Command

February 2005

If doubles aren't causing a problem, I'd say it's just because there are fewer rounding differences; you just haven't found the values that will mess up yet.

They all watch too much MSNBC... they get ideas.

axilmar

Member #1,204

April 2001

GullRaDriel said:

I assume that the best way to send floating values and keep them as they are is to send them converted into a string, and recv converting them back to a float.

No, there is no need for that. Just reverse the float in-place and send it over the network. The real problem is when the reversed float is pushed to the FP stack by the callee, and then popped from the FP stack by the caller. Then it gets messed up.

ALGUI: c++11 A5 GUI library.

BAF

Member #2,981

December 2002

Quote:

float f2 = ntohf(ntohf(f1));

Why the hell are you calling ntohf twice? Shouldn't that be ntohf(htonf(f1))?

BAF.zone | SantaHack!

GullRaDriel

Member #3,861

September 2003

I'm OK with what Arthur said about the double, but I'm not totally convinced with your quote, axilmar.

Nothing's more portable than the converted to ascii trick.

Bwah ! I don't care.

EDIT: You should really go and read the beej's link I posted before. It's a lot more complete than the previous versions.

"Code is like shit - it only smells if it is not yours"
Allegro Wiki, full of examples and articles !!

Sevalecan

Member #4,686

June 2004

BAF said:

Why the hell are you calling ntohf twice? Shouldn't that be ntohf(htonf(f1))?

Yeah, that'll fuck it up.

TeamTerradactyl: SevalecanDragon: I should shoot you for even CONSIDERING coding like that, but I was ROFLing too hard to stand up. I love it!
My blog about computer nonsense, etc.

Arthur Kalliokoski

Second in Command

February 2005

How would it fuck it up? Both ntohl() and htonl() simply swap the bytes on little-endian machines but leave them intact on big-endian machines, and if you do it twice you have the original again.

#include <stdio.h>
#include <netinet/in.h>

int a = 0x12345678;
int b;

int main(void)
{
  printf("Original number is 0x%X\n",a);
  printf("Converting with ntohl() gives ");
  b = ntohl(a);
  printf("0x%X\n",b);
  printf("Convert back with ntohl() gives ");
  a = ntohl(b);
  printf("0x%X\n",a);

  printf("\nNow doing it the \"right\" way with ntohl() and htonl()\n");
  printf("Original number is 0x%X\n",a);
  printf("Converting with ntohl() gives ");
  b = ntohl(a);
  printf("0x%X\n",b);
  printf("Convert back with htonl() gives ");
  a = htonl(b);
  printf("0x%X\n",a);

  return 0;
}

They all watch too much MSNBC... they get ideas.

GullRaDriel

Member #3,861

September 2003

And so, in what way is calling it twice useful?

"Code is like shit - it only smells if it is not yours"
Allegro Wiki, full of examples and articles !!

axilmar

Member #1,204

April 2001

BAF said:

Why the hell are you calling ntohf twice? Shouldn't that be ntohf(htonf(f1))?

Both ntohf and htonf do the exact same job: they reverse the bytes of the variable.

GullRaDriel said:

but I'm not totally convinced with your quote, axilmar.

Nothing's more portable than the converted to ascii trick.

IEEE 754 floats have a specific representation, so there is no need to convert it to ASCII.

GullRaDriel said:

And so, in which way calling it twice IS useful ?

It's not useful in an application, I just put it up to demonstrate the problem.

When transmitting float values over the network, the transmitter application does 'htonf(f)' and the receiver does 'ntohf(f)'.

ALGUI: c++11 A5 GUI library.

Evert

Member #794

November 2000

So, did you solve the problem by not interpreting the value as a float until the bits had been properly unscrambled?

axilmar

Member #1,204

April 2001

Evert said:

So, did you solve the problem by not interpreting the value as a float until the bits had been properly unscrambled?

I solved the problem by doing in-place reversal of bytes in the packet to be transmitted.
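
The actual fix was not posted, but an in-place reversal inside the packet buffer might look something like this sketch. The function names, the offset parameter and the buffer layout are assumptions made for illustration. The reversed bytes only ever live in an unsigned char buffer, never in a float variable:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>     /* memcpy */

/* Reverse len bytes in place. */
static void reverse_in_place(unsigned char *p, size_t len)
{
    assert(p != NULL || len == 0);
    for (size_t i = 0; i < len / 2; ++i) {
        unsigned char tmp = p[i];
        p[i] = p[len - 1 - i];
        p[len - 1 - i] = tmp;
    }
}

/* Hypothetical helper: copy the float into the packet at the given
   offset, then reverse those bytes where they sit. */
static void put_float_reversed(unsigned char *packet, size_t offset, float f)
{
    memcpy(packet + offset, &f, sizeof f);
    reverse_in_place(packet + offset, sizeof f);
}

/* Hypothetical helper: copy the bytes out, reverse them back, and only
   then reinterpret them as a float. */
static float get_float_reversed(const unsigned char *packet, size_t offset)
{
    unsigned char tmp[sizeof(float)];
    float f;
    memcpy(tmp, packet + offset, sizeof tmp);
    reverse_in_place(tmp, sizeof tmp);
    memcpy(&f, tmp, sizeof f);
    return f;
}
```

Like the swapping function discussed in this thread, this reverses unconditionally, so it only produces big-endian output on a little-endian host.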

ALGUI: c++11 A5 GUI library.

bamccaig

Member #7,536

July 2006

Interesting that the ntohf functions written in this thread don't actually account for the host system's endianness.
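
For comparison, a version that does not depend on the host's byte order can be written with shifts, since shifts work on the value rather than on its layout in memory. This is a sketch only: the names are invented, and it assumes a 32-bit IEEE 754 float that is stored with the same byte order as a 32-bit integer on the host.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdint.h>
#include <string.h>     /* memcpy */

static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 32-bit float");

/* Write the float's bits most significant byte first (network order),
   whatever the byte order of the host is. */
static void put_float_be(unsigned char out[4], float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    out[0] = (unsigned char)(bits >> 24);
    out[1] = (unsigned char)(bits >> 16);
    out[2] = (unsigned char)(bits >> 8);
    out[3] = (unsigned char)bits;
}

/* Rebuild the bit pattern from four big-endian bytes, then reinterpret. */
static float get_float_be(const unsigned char in[4])
{
    uint32_t bits = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                    ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For 1.0f, whose bit pattern is 0x3F800000, put_float_be() writes the bytes 3F 80 00 00 on both little-endian and big-endian hosts.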

--
I mean the best with what I say. It doesn't always sound that way.

axilmar

Member #1,204

April 2001

bamccaig said:

Interesting that the ntohf functions written in this thread don't actually account for the host system's endianness.

Could you elaborate? The endianness-swapping function I posted converts a float value from little endian to big endian, which is useful on 80x86 systems; it's not cross-platform.

ALGUI: c++11 A5 GUI library.

GullRaDriel

Member #3,861

September 2003

axilmar said:

IEEE 754 floats have a specific representation, so there is no need to convert it to ASCII.

The mainframes in use at our office do not support IEEE 754. Plus, they use EBCDIC ^^

"Code is like shit - it only smells if it is not yours"
Allegro Wiki, full of examples and articles !!

Oscar Giner

Member #2,207

April 2002

If you only support x86 systems, what's the point in converting the values to big endian? Just send everything in little endian.

--
[Website | e-mail]
[Tetris Unlimited] [AllegAVI | AlText]