![]() |
|
This thread is locked; no one can reply to it.
![]() ![]() |
1
2
|
bool faster than bitwise? |
Jonatan Hedborg
Member #4,886
July 2004
![]() |
That's some fancy bitwise magic you got there Oscar, nice EDIT: count = 52440724 vs. count = 52432916 Still, it's better than the other bitfield-tests.
|
Trent Gamblin
Member #261
April 2000
![]() |
It's still about 1/10 of a second slower here... what machine do you have that it runs in 0.1 seconds? O_O. Also, it tests the bytes in reverse order of the other program. Here's some random access tests, in these the bool method is a bit faster: 1// randtest1.c
2#include <stdio.h>
3#include <stdlib.h>
4#include <time.h>
5
6#define NUM 1024*1024*100
7#define SQRT_NUM 10240
8#define SIZE 50
9#define LOOPS 50000
10#define TESTS 5
11
12char bools[NUM];
13int r[LOOPS*2];
14
15int main(void)
16{
17 clock_t start, end;
18 int test;
19 int loop;
20 int i, j;
21 int count = 0;
22
23 srand(time(NULL));
24
25 for (i = 0; i < NUM; i++) {
26 bools[i] = rand() % 2;
27 if (i < LOOPS*2)
28 r[i] = ((float)(rand()%RAND_MAX)/RAND_MAX)*(SQRT_NUM-SIZE);
29 }
30
31 for (test = 0; test < TESTS; test++) {
32 start = clock();
33
34 for (loop = 0; loop < LOOPS; loop++) {
35 int x = r[loop*2];
36 int y = r[loop*2+1];
37 for (j = y; j < y+SIZE; j++) {
38 for (i = x; i < x+SIZE; i++) {
39 if (bools[j*SQRT_NUM+i]) {
40 count++;
41 }
42 }
43 }
44 }
45
46 end = clock();
47
48 printf("count = %d\n", count);
49
50 printf("time = %g seconds\n", (float)(end-start)/CLOCKS_PER_SEC);
51 }
52}
1// randtest2.c
2#include <stdio.h>
3#include <stdlib.h>
4#include <time.h>
5
6#define NUM 1024*1024*100
7#define SQRT_NUM 10240
8#define LOOPS 50000
9#define SIZE 50
10#define TESTS 5
11
12char bools[NUM/8];
13int r[LOOPS*2];
14
15int main(void)
16{
17 int count = 0;
18 clock_t start, end;
19 int bit = 0;
20 int byte = 0;
21 int test;
22 int loop;
23 int i, j;
24
25 srand(time(NULL));
26
27 for (i = 0; i < NUM/8; i++) {
28 bools[i] = rand() % 256;
29 if (i < LOOPS*2)
30 r[i] = ((float)(rand() % RAND_MAX)/RAND_MAX)*(SQRT_NUM-SIZE);
31 }
32
33 for (test = 0; test < TESTS; test++) {
34 start = clock();
35
36 for (loop = 0; loop < LOOPS; loop++) {
37 int x = r[loop*2];
38 int y = r[loop*2+1];
39 for (j = y; j < y+SIZE; j++) {
40 for (i = x; i < x+SIZE; i++) {
41 int byte = (j*SQRT_NUM+i)>>3;
42 if ((bools[byte] >> (x%8)) & 1) {
43 count++;
44 }
45 }
46 }
47 }
48
49 end = clock();
50
51 printf("count = %d\n", count);
52
53 printf("time = %g seconds\n", (float)(end-start)/CLOCKS_PER_SEC);
54 }
55}
Results: count = 62508973 time = 1.71414 seconds count = 125017946 time = 1.71485 seconds count = 187526919 time = 1.71329 seconds count = 250035892 time = 1.71355 seconds count = 312544865 time = 1.71301 seconds
count = 62495742 time = 1.47988 seconds count = 124991484 time = 1.48007 seconds count = 187487226 time = 1.48014 seconds count = 249982968 time = 1.47874 seconds count = 312478710 time = 1.47809 seconds
|
Elias
Member #358
May 2000
|
My test: 1#include <stdio.h>
2#include <stdlib.h>
3#include <time.h>
4
5#define BITS 1000
6#define NUM 1000000
7
8volatile bool bools[BITS];
9volatile unsigned char bits[BITS / 8];
10volatile int count1, count2;
11
12int main(void) {
13
14 for (int j = 0; j < BITS; j++) {
15 bools[j] = rand() % 2;
16 }
17
18 for (int j = 0; j < BITS / 8; j++) {
19 bits[j] = rand() % 256;
20 }
21
22 clock_t t1 = clock();
23 count1 = 0;
24 for (int i = 0; i < NUM; i++) {
25 for (int j = 0; j < BITS; j++) {
26 if (bools[j] & 1) count1++;
27 }
28 }
29 clock_t t2 = clock();
30
31 clock_t t3 = clock();
32 count2 = 0;
33 for (int i = 0; i < NUM; i++) {
34 for (int j = 0; j < BITS / 8; j++) {
35 if (bits[j] & 1) count2++;
36 if (bits[j] & 2) count2++;
37 if (bits[j] & 4) count2++;
38 if (bits[j] & 8) count2++;
39 if (bits[j] & 16) count2++;
40 if (bits[j] & 32) count2++;
41 if (bits[j] & 64) count2++;
42 if (bits[j] & 128) count2++;
43 }
44 }
45 clock_t t4 = clock();
46
47 printf("bool: count: %d seconds: %f\n", count1, (t2 - t1) / (double)CLOCKS_PER_SEC);
48 printf("bits: count: %d seconds: %f\n", count2, (t4 - t3) / (double)CLOCKS_PER_SEC);
49
50 return 0;
51}
Output: bool: count: 493000000 seconds: 5.320000 bits: count: 487000000 seconds: 1.900000 So, bits are more than twice as fast as bool here. If I change these two lines: #define BITS 1000 #define NUM 1000000 to #define BITS 1000000 #define NUM 1000 The result is: bool: count: 499845000 seconds: 5.450000 bits: count: 498304000 seconds: 4.370000 So bool is only slightly slower now, which can be explained by having to do more loop iterations to test all the bools. How I would interpret this is that the only difference is the used memory - the actual if() is not relevant. But with bits you use only 1/8th of the memory so you may end up using the cache more efficiently which can make it much faster. -- |
Trent Gamblin
Member #261
April 2000
![]() |
The numbers aren't that far off... unroll the first loop and it's almost as fast as the second. [edit] I think my random access tests were the best as far as actual usage goes... they test 2d blocks of bits inside a 2d array of bits... what you'd do for collision detection. They may not be written optimally though. I would be interested in how those tests perform for other people and on what hardware... and I think enabling optimizing in the compiler is not really fair since these tests are so simple and can be optimized away to nothing while in a larger program they couldn't.
|
Evert
Member #794
November 2000
![]() |
What have I done! |
Trent Gamblin
Member #261
April 2000
![]() |
Some more numbers from randtest1 and 2: Intel A100 (atom predecessor) | Core2 Quad 2.4Ghz test 1 5.3 seconds | 1.3 seconds test 2 5.3 seconds | 1.1 seconds The A100 is about the same on both tests, while the older still AMD I have is faster on the 1 byte bool test. Everything else I've tried is faster on bitfields. Seems like newer CPUs are better at bitfields than older ones.
|
Oscar Giner
Member #2,207
April 2002
![]() |
Oh, when I changed the code to fix the mistake I didn't rerun it (I was in a hurry, and I didn't expect it to change the results). The change was this: if (bools[i>>3] & (0x80 >> (i&0x8))) count++; fixed to: if (bools[i>>3] & (0x80 >> (i&0x7))) count++;
That simple change made the time go from 0.14 to 0.42 Trent Gamblin said: I think enabling optimizing in the compiler is not really fair since these tests are so simple and can be optimized away to nothing while in a larger program they couldn't. The optimizer can't really remove anything from these tests, they're ok. I think optimizations need to be on to be able to compare how it performs in a real scenario (the bitfield version can benefit from optimizations, unless you do the implementation yourself in assembler). -- |
Jonatan Hedborg
Member #4,886
July 2004
![]() |
Oscar Giner said: I guess the first one was optimized to a shift :/ I think it evaluated to "true" a lot less, resulting in fewer "count++" being executed.
|
|
1
2
|