Allegro.cc - Online Community

Allegro.cc Forums » Programming Questions » bool faster than bitwise?

This thread is locked; no one can reply to it. rss feed Print
 1   2 
bool faster than bitwise?
Mark Oates
Member #1,146
March 2001
avatar

is it faster in practice to list your boolean values individually, or pack them into a bit and get their values with bitwise operators?

In magic land, it seems like bitwise would be faster to get because it's more 'low level'. But I imagine it would actually be slower since you need to operate on the bit before doing the actual true/false comparison.

--
Visit CLUBCATT.com for cat shirts, cat mugs, puzzles, art and more <-- coupon code ALLEGRO4LIFE at checkout and get $3 off any order of 3 or more items!

AllegroFlareAllegroFlare DocsAllegroFlare GitHub

Arthur Kalliokoski
Second in Command
February 2005
avatar

There are single assembly language instructions to test individual bits, and you can test multiple bits at once with immediate operands, although IIRC those are slightly slower than comparing entire integers to zero (boolean). As always, time it both ways (with rdtsc, if possible) and go with the faster version. If the times are so close as to be inconclusive, go with the "easy to understand" version.

Nota bene: If you have a huge amount of items to test, using the bitfields may be faster since they won't blow out the L1 cache.

They all watch too much MSNBC... they get ideas.

Milan Mimica
Member #3,877
September 2003
avatar

I challenge you to write a test program which proves one technique is faster than the other.

ImLeftFooted
Member #3,935
October 2003
avatar

bool isBoolFasterThanBitwise()
{
    return rand() & 1;
}

(It uses both bitwise and bool!)

Jonatan Hedborg
Member #4,886
July 2004
avatar

I'd bet my bamccaig's left nut this is never going to be a performance bottleneck in any real application.

That being said; I would imagine the bool version is faster, since it deals with the CPU's native word size and has fewer operations. Cache issues might come into effect at some point... If you have gazillions of these checks, it might be a memory usage issue more than a performance issue.

James Stanley
Member #7,275
May 2006
avatar

I wrote an eratosthenes sieve prime calculator, and I assumed that using bools (actually it was C so I used chars...) would be faster than using a bit vector. At one point I started running out of memory so I switched to using a bit vector and was astounded by how much faster it was.

But as Jonatan says, in any real program the bit vector is unlikely to be the performance bottleneck.

Shravan
Member #10,724
February 2009
avatar

Unless limited by the amount of memory, I would prefer to use bool.

Trent Gamblin
Member #261
April 2000
avatar

I would bet that boolean is faster, since at a minimum, a bit comes from a byte, and there is the extra bitwise operation to get at the bit you want plus there's the division at a higher level to get which byte you want.

I would use bitfields where you need a lot/have low memory requirements and booleans elsewhere. I doubt it'll make a big difference, except memory wise.

Oscar Giner
Member #2,207
April 2002
avatar

Bitwise will usually be faster. But it depends on how you access those bits. If you access them in a random order it may get slower, but accessing them in order will be a lot faster than using bools. The operators needed to access a single bit are very simple, most of them requiring just 1 CPU cycle, so they're not really a problem. Packing the data in bits will greatly reduce the memory bandwidth usage and will help cache a lot too. Of course the difference becomes noticeable when you access a lot of these bits (for a perfect pixel collision algortithm for example).

As a note, std::vector<bool> is specialized to use a single bit to store each bool.

James Stanley
Member #7,275
May 2006
avatar

I would bet that boolean is faster, since at a minimum, a bit comes from a byte, and there is the extra bitwise operation to get at the bit you want plus there's the division at a higher level to get which byte you want.

This is what I was saying in my post about prime calculation. If the array is big enough, then the bit vector wins on performance (as well as memory usage) because you get 8 times as much information in the cache.

Evert
Member #794
November 2000
avatar

In magic land, it seems like bitwise would be faster to get because it's more 'low level'. But I imagine it would actually be slower since you need to operate on the bit before doing the actual true/false comparison.

Correct.
You can argue both sides of this argument, in the end, it'll depend on the particular application. If you're using enough bools that you're hammering the processor cache, then using bit fields may be faster because you can simply fit more data in the cache. If you're not filling up the CPU cache, then a plain bool (int) will probably be faster, since you don't have to do the extra work of extracting the bits.

Trent Gamblin
Member #261
April 2000
avatar

Ok, you made me do it:

#SelectExpand
1// test1.c 2#include <stdio.h> 3#include <stdlib.h> 4#include <time.h> 5 6#define NUM 1024*1024*100 7 8char bools[NUM]; 9 10int main(void) 11{ 12 int i; 13 int count = 0; 14 clock_t start, end; 15 16 srand(time(NULL)); 17 18 for (i = 0; i < NUM; i++) { 19 bools[i] = rand() % 2; 20 } 21 22 start = clock(); 23 24 for (i = 0; i < NUM; i++) { 25 if (bools[i]) { 26 count++; 27 } 28 } 29 30 end = clock(); 31 32 printf("count = %d\n", count); 33 34 printf("time = %g seconds\n", (float)(end-start)/CLOCKS_PER_SEC); 35}

#SelectExpand
1// test2.c 2 3#include <stdio.h> 4#include <stdlib.h> 5#include <time.h> 6 7#define NUM 1024*1024*100 8 9char bools[NUM/8]; 10 11int main(void) 12{ 13 int i; 14 int count = 0; 15 clock_t start, end; 16 int bit = 0; 17 int byte = 0; 18 19 srand(time(NULL)); 20 21 for (i = 0; i < NUM/8; i++) { 22 bools[i] = rand() % 256; 23 } 24 25 start = clock(); 26 27 for (i = 0; i < NUM; i++) { 28 if ((bools[byte] >> bit) & 1) { 29 count++; 30 } 31 bit++; 32 if (bit == 8) { 33 byte++; 34 bit = 0; 35 } 36 } 37 38 end = clock(); 39 40 printf("count = %d\n", count); 41 42 printf("time = %g seconds\n", (float)(end-start)/CLOCKS_PER_SEC); 43}

test1 results:

count = 52424828
time = 0.990623 seconds
count = 52432056
time = 0.990727 seconds
count = 52430310
time = 0.991179 seconds

test2 results:

count = 52423770
time = 1.33507 seconds
count = 52422236
time = 1.33386 seconds
count = 52423791
time = 1.33561 seconds

Jonatan Hedborg
Member #4,886
July 2004
avatar

Could you try it with

for (i = 0; i < NUM; i++) {
    if ((bools[byte] >> bit) & 1) {
      count++;
    }
    bit++;
    if (bit == 8) {
      byte++;
      bit = 0;
    }
  }

Replaced by

for (i = 0; i < NUM; i++) {
    int b = i/8;
    if ((bools[b] >> (i - b*8)) & 1) {
      count++;
    }
  }

?

Just curious if "if" and increments is faster than a "%".

EDIT: err, my bad. Needs some more modification ;)

EDIT2: That..might work?

Trent Gamblin
Member #261
April 2000
avatar

That won't even work, since it never increases byte.

Jonatan Hedborg
Member #4,886
July 2004
avatar

Updated my code. That should work?

Trent Gamblin
Member #261
April 2000
avatar

count = 52426098
time = 1.5705 seconds

[edit] Offtopic but interesting just how random rand() is on my system :)

Jonatan Hedborg
Member #4,886
July 2004
avatar

I wonder what the difference would be for random access of bits, with such a large data set.

Edit: I missed one obvious optimization (that's what you get for only using high-level languages :( )

  for (i = 0; i < NUM; i++) {
      int b = i>>3;
      if ((bools[b] >> (i - b*8)) & 1) {
        count++;
      }
    }

Elias
Member #358
May 2000

What about:

for (i = 0; i < NUM; i++) {
    if ((bools[i] & 1) {
        count++;
    }
}

Yes, it will only test one of the possible 8 bits, but in your test case above you're mostly measuring the speed of changing your bit variable and not what we are interested in which is the boolean check. In actual code you would also test against a fixed bit (like 1) so it's the speed we're interested in.

[edit:}

Or alternatively, have another loop where you do just your bit manipulation (but not the check) then subtract that time from the result.

--
"Either help out or stop whining" - Evert

Trent Gamblin
Member #261
April 2000
avatar

First test machine was Intel C2D. I tested on a single core amd and got:

1.3 seconds
1.5 seconds
2.0 seconds

For test1, test2, and jonathans test.

[edit]

Elias, Actually, what we're interested in is testing a bunch of bits. Timing == vs & 1 is meaningless.

[edit2]

Jonathan, with that shift it's 1.4 seconds.

Elias
Member #358
May 2000

Another test would be, for actual bitfields:

#SelectExpand
1#include <time.h> 2 3#define NUM 1024*1024*100 4 5struct Bools 6{ 7 int bool1 : 1; 8}; 9 10struct Bools bools[NUM]; 11 12int main(void) 13{ 14 int i; 15 int count = 0; 16 clock_t start, end; 17 18 srand(time(NULL)); 19 20 for (i = 0; i < NUM; i++) { 21 bools[i].bool1 = rand() % 2; 22 } 23 24 start = clock(); 25 26 for (i = 0; i < NUM; i++) { 27 if (bools[i].bool1) { 28 count++; 29 } 30 } 31 32 end = clock(); 33 34 printf("count = %d\n", count); 35 36 printf("time = %g seconds\n", (float)(end-start)/CLOCKS_PER_SEC); 37}

[edit:]

Elias, Actually, what we're interested in is testing a bunch of bits. Timing == vs & 1 is meaningless.

No, it makes no difference which bit you test. But you can use this instead to test all 8, but without the time comsuming code to change a variable:

#SelectExpand
1for (i = 0; i < NUM; i++) { 2 if ((bools[i] & 1) { 3 count++; 4 } 5 if ((bools[i] & 2) { 6 count++; 7 } 8 if ((bools[i] & 4) { 9 count++; 10 } 11 if ((bools[i] & 8) { 12 count++; 13 } 14 if ((bools[i] & 16) { 15 count++; 16 } 17 if ((bools[i] & 32) { 18 count++; 19 } 20 if ((bools[i] & 64) { 21 count++; 22 } 23 if ((bools[i] & 128) { 24 count++; 25 } 26}

--
"Either help out or stop whining" - Evert

Trent Gamblin
Member #261
April 2000
avatar

That test is only slightly slower than the bool test. I'll bet the compiler is using bytes there anyhow, and the compiler adds a bit of overhead

count = 52431537
time = 1.08432 seconds

Mark Oates
Member #1,146
March 2001
avatar

the compiler adds a bit of overhead

You couldn't ask for less overhead! :D

--
Visit CLUBCATT.com for cat shirts, cat mugs, puzzles, art and more <-- coupon code ALLEGRO4LIFE at checkout and get $3 off any order of 3 or more items!

AllegroFlareAllegroFlare DocsAllegroFlare GitHub

Trent Gamblin
Member #261
April 2000
avatar

I think the tests so far aren't very realistic. I'll whip up something better...

Oscar Giner
Member #2,207
April 2002
avatar

Of course if you write suboptimal code it's going to be slower :P

Why don't you try this instead:

#SelectExpand
1#include <stdio.h> 2#include <stdlib.h> 3#include <time.h> 4 5#define NUM 1024*1024*100 6 7char bools[NUM/8]; 8 9int main(void) 10{ 11 int i; 12 int count = 0; 13 clock_t start, end; 14 int bit = 0; 15 16 srand(time(NULL)); 17 18 for (i = 0; i < NUM/8; i++) { 19 bools[i] = rand() % 256; 20 } 21 22 start = clock(); 23 24 for (i = 0; i < NUM; i++) { 25 if (bools[i>>3] & (0x80 >> (i&0x7))) count++; 26 } 27 28 end = clock(); 29 30 printf("count = %d\n", count); 31 32 printf("time = %g seconds\n", (float)(end-start)/CLOCKS_PER_SEC); 33}

Test1: 0.34 sec
Test2: 0.145 sec

I don't have much time, maybe I've done some error with all that bitwise operations, so correct me if there's a misstake.

<edit>
Corrected small error: it was i&0x7, no i&0x8. (I want a mask of the last 3 bits).

bamccaig
Member #7,536
July 2006
avatar

I'd bet my bamccaig's left nut this is never going to be a performance bottleneck in any real application.

You stay away from my left nut. >:(

 1   2 


Go to: