Velocity Reviews > bitwise & operator and counting set bits

# bitwise & operator and counting set bits

sathyashrayan
Guest
Posts: n/a

 04-09-2005
Group,
Following function will check weather a bit
is set in the given variouble x.

int bit_count(long x)
{
int n = 0;
/*
** The loop will execute once for each bit of x
set,
*/
if (x)
do
n++;
while (0 != (x = x&(x-1)));
return (n);
}

But the same action in the following function is
confusing me a lot.

int bitcount(long i)
{
i = ((i & 0xAAAAAAAA) >> 1) + (i &
0x55555555);
i = ((i & 0xCCCCCCCC) >> 2) + (i &
0x33333333);
i = ((i & 0xF0F0F0F0) >> 4) + (i &
0x0F0F0F0F);
i = ((i & 0xFF00FF00) >> + (i &
0x00FF00FF);
i = ((i & 0xFFFF0000) >> 16) + (i &
0x0000FFFF);
return (int)i;
}

All those hex number are magic numbers? Is the
above an implementation defined?
Please clear my doubt. I am posting all the two
codes which I have taken from C snippet
archives.

-------------------code-1------------------

#include <stdio.h>
#include <stdlib.h>
#define plural_text(n) &"s"[(1 == (n))]
int bitcount(long i)
{
i = ((i & 0xAAAAAAAA) >> 1) + (i &
0x55555555);
i = ((i & 0xCCCCCCCC) >> 2) + (i &
0x33333333);
i = ((i & 0xF0F0F0F0) >> 4) + (i &
0x0F0F0F0F);
i = ((i & 0xFF00FF00) >> + (i &
0x00FF00FF);
i = ((i & 0xFFFF0000) >> 16) + (i &
0x0000FFFF);
return (int)i;
}
int main(int argc, char *argv[])
{
long n;
while(--argc)
{
int i;
n = atol(*++argv);
i = bitcount(n);
printf("%ld contains %d bit%s
set\n",n, i, plural_text(i));
}
return 0;
}

----------------code-2----------------------------
-
#include <stdio.h>
#include <stdlib.h>

#define plural_text(n) &"s"[(1 == (n))]

int bit_count(long x)
{
int n = 0;
/*
** The loop will execute once for each bit of x
set, this is in average
** twice as fast as the shift/test method.
*/
if (x)
do
n++;
while (0 != (x = x&(x-1)));
return (n);
}
int main(int argc, char *argv[])
{
long n;
while(--argc)
{
int i;
n = atol(*++argv);
i = bit_count(n);
printf("%ld contains %d bit%s
set\n",n, i, plural_text(i));
}
return 0;
}

Walter Roberson
Guest
Posts: n/a

 04-09-2005
In article <425778fe\$0\$13276\$(E-Mail Removed) ws.net>,
sathyashrayan <(E-Mail Removed)> wrote:
>But the same action in the following function is
>confusing me a lot.

>int bitcount(long i)
>{
> i = ((i & 0xAAAAAAAA) >> 1) + (i &
>0x55555555);
> i = ((i & 0xCCCCCCCC) >> 2) + (i &
>0x33333333);

>All those hex number are magic numbers?

Yes, they assume 32 bit longs, which is implementation dependant.
--
Are we *there* yet??

Chris Torek
Guest
Posts: n/a

 04-09-2005
In article <425778fe\$0\$13276\$(E-Mail Removed) ws.net>,
sathyashrayan <(E-Mail Removed)> wrote:
> Group,
> Following function will check weather a bit
>is set in the given variouble x.
> int bit_count(long x)

[snippage]

Actually, it (non-portably -- x should have type "unsigned long")
counts the *number* of bits set.

I think it is time to re-post this Moldy Oldie The stuff at
the top is not Standard C, and the timing code as well, but the
bit-counting functions are at least commented. Note that I wrote
this shortly after the 1989 C Standard came out, before it was
widely available -- we were not even using GCC yet on most machines.

#ifndef lint
static char rcsid[] = "\$Id: bct.c,v 1.5 90/10/13 08:44:12 chris Exp \$";
#endif

/*
* bct - bitcount timing
*
* Runs a bunch of different functions-to-count-bits-in-a-longword
* (many assume 32 bits per long) with a timer around each loop, and
* tries to calcuate the time used.
*/
#include <stdio.h>
#include <sys/types.h>

#ifdef FD_SETSIZE
# define USE_GETRUSAGE
# include <sys/time.h>
# include <sys/resource.h>
#else
# include <sys/times.h>
#endif

#define SIZEOF(a) (sizeof(a)/sizeof(a[0]))

/*
* This function is used to calibrate the timing mechanism.
* This way we can subtract the loop and call overheads.
*/
int
calibrate(n)
register unsigned long n;
{
return 0;
}

/*
* This function counts the number of bits in a long.
* It is limited to 63 bit longs, but a minor mod can cope with 511 bits.
*
* It is so magic, an explanation is required:
* Consider a 3 bit number as being
* 4a+2b+c
* if we shift it right 1 bit, we have
* 2a+b
* subtracting this from the original gives
* 2a+b+c
* if we shift the original 2 bits right we get
* a
* and so with another subtraction we have
* a+b+c
* which is the number of bits in the original number.
* Suitable masking allows the sums of the octal digits in a
* 32 bit number to appear in each octal digit. This isn't much help
* unless we can get all of them summed together.
* This can be done by modulo arithmetic (sum the digits in a number by molulo
* the base of the number minus one) the old "casting out nines" trick they
* taught in school before calculators were invented.
* Now, using mod 7 wont help us, because our number will very likely have
* more than 7 bits set. So add the octal digits together to get base64
* digits, and use modulo 63.
* (Those of you with 64 bit machines need to add 3 octal digits together to
* get base512 digits, and use mod 511.)
*
* This is HACKMEM 169, as used in X11 sources.
*/
int
t0_hackmemmod(n)
register unsigned long n;
{
register unsigned long tmp;

tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111);
return ((tmp + (tmp >> 3)) & 030707070707) % 63;
}

/*
* This is the same as the above, but does not use the % operator.
* Most modern machines have clockless division, and so the modulo is as
* expensive as, say, an addition.
*/
int
t1_hackmemloop(n)
register unsigned long n;
{
register unsigned long tmp;

tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111);
tmp = (tmp + (tmp >> 3)) & 030707070707;
while (tmp > 63)
tmp = (tmp & 63) + (tmp >> 6);
return tmp;
}

/*
* Above, without using while loop. It takes at most 5 iterations of the
* loop, so we just do all 5 in-line. The final result is never 63
* (this is assumed above as well).
*/
int
t2_hackmemunrolled(n)
register unsigned long n;
{
register unsigned long tmp;

tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111);
tmp = (tmp + (tmp >> 3)) & 030707070707;
tmp = (tmp & 63) + (tmp >> 6);
tmp = (tmp & 63) + (tmp >> 6);
tmp = (tmp & 63) + (tmp >> 6);
tmp = (tmp & 63) + (tmp >> 6);
tmp = (tmp & 63) + (tmp >> 6);
return (tmp);
}

/*
* This function counts the bits in a long.
*
* It removes the lsb and counting the number of times round the loop.
* The expression (n & -n) yields the lsb of a number,
* but it only works on 2's compliment machines.
*/
int
t3_rmlsbsub(n)
register unsigned long n;
{
register int count;

for (count = 0; n; n -= (n & -n))
count++;
return count;
}

int
register unsigned long n;
{
register int count;

for (count = 0; n; count++)
n &= n - 1; /* take away lsb */
return (count);
}

/*
* This function counts the bits in a long.
*
* It works by shifting the number down and testing the bottom bit.
*/
int
t5_testlsb(n)
register unsigned long n;
{
register int count;

for (count = 0; n; n >>= 1)
if (n & 1)
count++;
return count;
}

/*
* This function counts the bits in a long.
*
* It works by shifting the number left and testing the top bit.
* On many machines shift is expensive, so it uses a cheap addition instead.
*/
int
t6_testmsb(n)
register unsigned long n;
{
register int count;

for (count = 0; n; n += n)
if (n & ~(~(unsigned long)0 >> 1))
count++;
return count;
}

int
t7_testsignandshift(n)
register unsigned long n;
{
register int count;

for (count = 0; n; n <<= 1)
if ((long)n < 0)
count++;
return (count);
}

/*
* This function counts the bits in a long.
*
* It works by masking each bit.
* This is the second most intuitively obvious method,
* and is independent of the number of bits in the long.
*/
int
t8_testeachbit(n)
register unsigned long n;
{
register int count;

count = 0;
count++;
return count;
}

/*
* This function counts the bits in a long.
*
* It works by masking each bit.
* This is the most intuitively obvious method,
* but how do you a priori know how many bits in the long?
* (except for ''sizeof(long) * CHAR_BITS'' expression)
*/
int
t9_testeachbit1shl(n)
register unsigned long n;
{
register int count;
register int bit;

count = 0;
for (bit = 0; bit < 32; ++bit)
if (n & ((unsigned long)1 << bit))
count++;
return count;
}

static char nbits[256] = {
0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
};

static int inbits[256] = {
0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
};

int
tA_tableshift(n)
register unsigned long n;
{

return (nbits[n & 0xff] + nbits[(n >> & 0xff] +
nbits[(n >> 16) & 0xff] + nbits[n >> 24]);
}

int
tB_tableuchar(n)
unsigned long n;
{
register unsigned char *p = (unsigned char *)&n;

return (nbits[p[0]] + nbits[p[1]] + nbits[p[2]] + nbits[p[3]]);
}

int
tC_tableshiftcast(n)
register unsigned long n;
{

return nbits[(unsigned char) n] +
nbits[(unsigned char) (n >> ] +
nbits[(unsigned char) (n >> 16)] +
nbits[(unsigned char) (n >> 24)];
}

int
tD_itableshift(n)
register unsigned long n;
{

return (inbits[n & 0xff] + inbits[(n >> & 0xff] +
inbits[(n >> 16) & 0xff] + inbits[n >> 24]);
}

int
tE_itableuchar(n)
unsigned long n;
{
register unsigned char *p = (unsigned char *)&n;

return (inbits[p[0]] + inbits[p[1]] + inbits[p[2]] + inbits[p[3]]);
}

int
tF_itableshiftcast(n)
register unsigned long n;
{

return inbits[(unsigned char) n] +
inbits[(unsigned char) (n >> ] +
inbits[(unsigned char) (n >> 16)] +
inbits[(unsigned char) (n >> 24)];
}

/*
* Explanation:
* First we add 32 1-bit fields to get 16 2-bit fields.
* Each 2-bit field is one of 00, 01, or 10 (binary).
* We then add all the two-bit fields to get 8 4-bit fields.
* These are all one of 0000, 0001, 0010, 0011, or 0100.
*
* Now we can do something different, becuase for the first
* time the value in each k-bit field (k now being 4) is small
* enough that adding two k-bit fields results in a value that
* still fits in the k-bit field. The result is four 4-bit
* fields containing one of {0000,0001,...,0111,1000} and four
* more 4-bit fields containing junk (sums that are uninteresting).
* Pictorially:
* n = 0aaa0bbb0ccc0ddd0eee0fff0ggg0hhh
* n>>4 = 00000aaa0bbb0ccc0ddd0eee0fff0ggg
* sum = 0aaaWWWWiiiiXXXXjjjjYYYYkkkkZZZZ
* where W, X, Y, and Z are the interesting sums (each at most 1000,
* or 8 decimal). Masking with 0x0f0f0f0f extracts these.
*
* Now we can change tactics yet again, because now we have:
* n = 0000WWWW0000XXXX0000YYYY0000ZZZZ
* n>>8 = 000000000000WWWW0000XXXX0000YYYY
* so sum = 0000WWWW000ppppp000qqqqq000rrrrr
* where p and r are the interesting sums (and each is at most
* 10000, or 16 decimal). The sum `q' is junk, like i, j, and
* k above; but it is not necessarry to discard it this time.
* One more fold, this time by sixteen bits, gives
* n = 0000WWWW000ppppp000qqqqq000rrrrr
* n>>16 = 00000000000000000000WWWW000ppppp
* so sum = 0000WWWW000ppppp000sssss00tttttt
* where s is at most 11000 and t is it most 100000 (32 decimal).
*
* Now we have t = r+p = (Z+Y)+(X+W) = ((h+g)+(f+e))+((d+c)+(b+a)),
* or in other words, t is the number of bits set in the original
* 32-bit longword. So all we have to do is return the low byte
* (or low 6 bits, but `low byte' is typically just as easy if not
* easier).
*
* This technique is also applicable to 64 and 128 bit words, but
* 256 bit or larger word sizes require at least one more masking
* step.
*/
int
tG_sumbits(n)
register unsigned long n;
{

n = (n & 0x55555555) + ((n >> 1) & 0x55555555);
n = (n & 0x33333333) + ((n >> 2) & 0x33333333);
n = (n + (n >> 4)) & 0x0f0f0f0f;
n += n >> 8;
n += n >> 16;
return (n & 0xff);
}

/*
* build a long random number.
* The standard rand() returns at least a 15 bit number.
* We use the top 9 of 15 (since the lower N bits of the usual rand()
* repeat with a period of 2^N).
*/
unsigned long
bigrand()
{
#define randbits() ((unsigned long)((rand() >> 6) & 0777))
register unsigned long r;

r = randbits() << 27;
r |= randbits() << 18;
r |= randbits() << 9;
r |= randbits();
return (r);
}

/*
* Run the test many times.
* You will need a running time in the 10s of seconds
* for an accurate result.
*
* To give a fair comparison, the random number generator
* is seeded the same for each measurement.
*
* Return value is seconds per iteration.
*/
#ifndef REPEAT
#if defined(mips) || defined(sparc)
#define REPEAT (1L<<20)
#else
#define REPEAT (1L<<1
#endif
#endif

double
measure(func)
register int (*func)();
{
#ifdef USE_GETRUSAGE
struct rusage ru0, ru1;
register long j;

srand(1);
(void) getrusage(RUSAGE_SELF, &ru0);
for (j = 0; j < REPEAT; ++j)
func(bigrand());
(void) getrusage(RUSAGE_SELF, &ru1);
ru1.ru_utime.tv_sec -= ru0.ru_utime.tv_sec;
if ((ru1.ru_utime.tv_usec -= ru0.ru_utime.tv_usec) < 0) {
ru1.ru_utime.tv_usec += 1000000;
ru1.ru_utime.tv_sec--;
}
return ((ru1.ru_utime.tv_sec + (ru1.ru_utime.tv_usec / 1000000.0)) /
(double)REPEAT);
#else
register long j;
struct tms start;
struct tms finish;

srand(1);
times(&start);
for (j = 0; j < REPEAT; ++j)
func(bigrand());
times(&finish);
return ((finish.tms_utime - start.tms_utime) / (double)REPEAT);
#endif
}

struct table {
char *name;
int (*func)();
double measurement;
int trial;
};

struct table table[] =
{
{ "hackmemmod", t0_hackmemmod },
{ "hackmemloop", t1_hackmemloop },
{ "hackmemunrolled", t2_hackmemunrolled },
{ "rmlsbsub", t3_rmlsbsub },
{ "testlsb", t5_testlsb },
{ "testmsb", t6_testmsb },
{ "testsignandshift", t7_testsignandshift },
{ "testeachbit", t8_testeachbit },
{ "testeachbit1shl", t9_testeachbit1shl },
{ "tableshift", tA_tableshift },
{ "tableuchar", tB_tableuchar },
{ "tableshiftcast", tC_tableshiftcast },
{ "itableshift", tD_itableshift },
{ "itableuchar", tE_itableuchar },
{ "itableshiftcast", tF_itableshiftcast },
{ "sumbits", tG_sumbits },
};

main(argc, argv)
int argc;
char **argv;
{
double calibration, v, best;
int j, k, m, verbose;

verbose = argc > 1;
/*
* run a few tests to make sure they all agree
*/
srand(getpid());
for (j = 0; j < 10; ++j) {
unsigned long n;

n = bigrand();
for (k = 0; k < SIZEOF(table); ++k)
table[k].trial = table[k].func(n);
for (k = 0; k < SIZEOF(table); ++k)
for (m = 0; m < SIZEOF(table); ++m)
if (table[k].trial != table[m].trial)
printf("wrong: %08lX", n);
for (k = 0; k < SIZEOF(table); ++k)
printf(" %3d", table[k].trial);
printf("\n");
}
}

/*
* calibrate the timing mechanism
*/
calibration = measure(calibrate);
if (verbose)
printf("calibration: %g\n", calibration);

/*
* time them all, keeping track of the best (smallest)
*/
for (j = 0; j < SIZEOF(table); ++j) {
v = measure(table[j].func) - calibration;
if (verbose)
printf("%s: %g\n", table[j].name, v);
table[j].measurement = v;
if (!j || v < best)
best = v;
}
(void) printf("%-24s %-14sratio\n", "function", "time");
for (j = 0; j < SIZEOF(table); ++j) {
(void) printf("%-24s %#10.8g %6.3f\n",
table[j].name,
table[j].measurement,
table[j].measurement / best);
}
exit(0);
}
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
Reading email is like searching for food in the garbage, thanks to spammers.

ranveerkunal@gmail.com
Guest
Posts: n/a

 04-10-2005
lets understand the magic numbers first
AAAAAAAA = 1010 1010 1010 1010
55555555 = 0101 0101 0101 0101
alternative 1s and zeros
CCCCCCCC = 1100 1100 1100 1100
33333333 = 0011 0011 0011 0011
2 ones and 2 zeros
then 4 ones 4 zeros
then 8 ones 8 zeros
then 16 ones 16 zeros
so lets go with
dec 99 -> hex 63 -> bin 0110 0011

now we can see that number of high bits is the sum of all the bit
fields
0+1+1+0+0+0+1+1 = 4
this is same as adding consecutive bits and then addin them together
(0+1)+(1+0)+(0+0)+(1+1)=0101 0010 and this is wat you get after first
step.
0110 0011 & 1010 1010 1010 1010 = 0010 0010 > 1= 0001 0001
0110 0011 & 0101 0101 0101 0101 = 0100 0001
add 0101 0010 (2 bits can have maximum two high bits and their sum
is 2(10) that can be accomodated in 2 bits)

now take 4 bits together
((0+1)+(1+0))+((0+0)+(1+1))=(01+01)+(00+10)=0010 0010 (4 bits can have
maximum four high bits and their sum is 4(100) that can be accomodated
in 4 bits)
same as wat you get after step 2
now take 8 bits together
(((0+1)+(1+0))+((0+0)+(1+1)))=((01+01)+(00+10))=00 10+0010=0000 0100
wat you get after step 3
this is the answer 0000 0100 = 4

hope that clears yr doubt

sathyashrayan
Guest
Posts: n/a

 04-10-2005

<(E-Mail Removed)> wrote in message news:(E-Mail Removed) oups.com...
> lets understand the magic numbers first
> AAAAAAAA = 1010 1010 1010 1010
> 55555555 = 0101 0101 0101 0101
> alternative 1s and zeros
> CCCCCCCC = 1100 1100 1100 1100
> 33333333 = 0011 0011 0011 0011
> 2 ones and 2 zeros
> then 4 ones 4 zeros
> then 8 ones 8 zeros
> then 16 ones 16 zeros
> so lets go with
> dec 99 -> hex 63 -> bin 0110 0011
>
>
> now we can see that number of high bits is the sum of all the bit
> fields
> 0+1+1+0+0+0+1+1 = 4
> this is same as adding consecutive bits and then addin them together
> (0+1)+(1+0)+(0+0)+(1+1)=0101 0010 and this is wat you get after first
> step.
> 0110 0011 & 1010 1010 1010 1010 = 0010 0010 > 1= 0001 0001
> 0110 0011 & 0101 0101 0101 0101 = 0100 0001
> add 0101 0010 (2 bits can have maximum two high bits and their sum
> is 2(10) that can be accomodated in 2 bits)
>
> now take 4 bits together
> ((0+1)+(1+0))+((0+0)+(1+1))=(01+01)+(00+10)=0010 0010 (4 bits can have
> maximum four high bits and their sum is 4(100) that can be accomodated
> in 4 bits)
> same as wat you get after step 2
> now take 8 bits together
> (((0+1)+(1+0))+((0+0)+(1+1)))=((01+01)+(00+10))=00 10+0010=0000 0100
> wat you get after step 3
> this is the answer 0000 0100 = 4
>
> hope that clears yr doubt
>

Thanks a lot. It does help

--
"combination is the heart of chess"

A.Alekhine

Mail to:
sathyashrayan AT gmail DOT com

Rufus V. Smith
Guest
Posts: n/a

 04-12-2005

"sathyashrayan" <(E-Mail Removed)> wrote in message
news:425778fe\$0\$13276\$(E-Mail Removed) eenews.net...
> Group,
> Following function will check weather a bit
> is set in the given variouble x.
>
> int bit_count(long x)
> {
> int n = 0;
> /*
> ** The loop will execute once for each bit of x
> set,
> */
> if (x)
> do
> n++;
> while (0 != (x = x&(x-1)));
> return (n);
> }
>
> But the same action in the following function is
> confusing me a lot.
>
> int bitcount(long i)
> {
> i = ((i & 0xAAAAAAAA) >> 1) + (i &
> 0x55555555);
> i = ((i & 0xCCCCCCCC) >> 2) + (i &
> 0x33333333);
> i = ((i & 0xF0F0F0F0) >> 4) + (i &
> 0x0F0F0F0F);
> i = ((i & 0xFF00FF00) >> + (i &
> 0x00FF00FF);
> i = ((i & 0xFFFF0000) >> 16) + (i &
> 0x0000FFFF);
> return (int)i;
> }
>
> All those hex number are magic numbers? Is the
> above an implementation defined?
> Please clear my doubt. I am posting all the two
> codes which I have taken from C snippet
> archives.

off consecutive bits, 1, 2, 4, 8, 16.

It may have been clearer had it been written:

i = ((i >> 1) & 0x55555555)
+ (i & 0x55555555);

i = ((i >> 2) & 0x33333333)
+ (i & 0x33333333);

i = ((i >> 4) & 0x0F0F0F0F)
+ (i & 0x0F0F0F0F);

i = ((i >> & 0x00FF00FF)
+ (i & 0x00FF00FF);

i = ((i >> 16)& 0x0000FFFF) >> 16)
+ (i & 0x0000FFFF);
return (int)i;

This way you can see that the first expression is 16 parallel adds of single
bits into 2-bit result fields. (The bit counts of 2 bit fields)

The next is 8 parallel adds of 2 bits into a 4 bit result field.
(the sums of adjacent prior bit counts)

etc...

The final result being the sum of sums (total bit count for a 32 bit number)

Rufus

Rufus V. Smith
Guest
Posts: n/a

 04-12-2005

"Rufus V. Smith" <(E-Mail Removed)> wrote in message
news:1113314507.697052106a46bcd7df71daf9003ca3cf@t eranews...
>

<snip>
> off consecutive bits, 1, 2, 4, 8, 16.
>
> It may have been clearer had it been written:
>

<snip>
>
> i = ((i >> 16)& 0x0000FFFF) >> 16)
> + (i & 0x0000FFFF);
> return (int)i;

Oops, forgot to get rid of that second shift during cut and paste.
Should be:

> i = ((i >> 16)& 0x0000FFFF)
> + (i & 0x0000FFFF);
> return (int)i;

Rufus

Rufus V. Smith
Guest
Posts: n/a

 04-14-2005

"Chris Torek" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> In article <425778fe\$0\$13276\$(E-Mail Removed) ws.net>,
> /*
> * Explanation:
> * First we add 32 1-bit fields to get 16 2-bit fields.
> * Each 2-bit field is one of 00, 01, or 10 (binary).
> * We then add all the two-bit fields to get 8 4-bit fields.
> * These are all one of 0000, 0001, 0010, 0011, or 0100.
> *
> * Now we can do something different, becuase for the first
> * time the value in each k-bit field (k now being 4) is small
> * enough that adding two k-bit fields results in a value that
> * still fits in the k-bit field. The result is four 4-bit
> * fields containing one of {0000,0001,...,0111,1000} and four
> * more 4-bit fields containing junk (sums that are uninteresting).
> * Pictorially:
> * n = 0aaa0bbb0ccc0ddd0eee0fff0ggg0hhh
> * n>>4 = 00000aaa0bbb0ccc0ddd0eee0fff0ggg
> * sum = 0aaaWWWWiiiiXXXXjjjjYYYYkkkkZZZZ
> * where W, X, Y, and Z are the interesting sums (each at most 1000,
> * or 8 decimal). Masking with 0x0f0f0f0f extracts these.
> *
> * Now we can change tactics yet again, because now we have:
> * n = 0000WWWW0000XXXX0000YYYY0000ZZZZ
> * n>>8 = 000000000000WWWW0000XXXX0000YYYY
> * so sum = 0000WWWW000ppppp000qqqqq000rrrrr
> * where p and r are the interesting sums (and each is at most
> * 10000, or 16 decimal). The sum `q' is junk, like i, j, and
> * k above; but it is not necessarry to discard it this time.
> * One more fold, this time by sixteen bits, gives
> * n = 0000WWWW000ppppp000qqqqq000rrrrr
> * n>>16 = 00000000000000000000WWWW000ppppp
> * so sum = 0000WWWW000ppppp000sssss00tttttt
> * where s is at most 11000 and t is it most 100000 (32 decimal).
> *
> * Now we have t = r+p = (Z+Y)+(X+W) = ((h+g)+(f+e))+((d+c)+(b+a)),
> * or in other words, t is the number of bits set in the original
> * 32-bit longword. So all we have to do is return the low byte
> * (or low 6 bits, but `low byte' is typically just as easy if not
> * easier).
> *
> * This technique is also applicable to 64 and 128 bit words, but
> * 256 bit or larger word sizes require at least one more masking
> * step.
> */
> int
> tG_sumbits(n)
> register unsigned long n;
> {
>
> n = (n & 0x55555555) + ((n >> 1) & 0x55555555);
> n = (n & 0x33333333) + ((n >> 2) & 0x33333333);
> n = (n + (n >> 4)) & 0x0f0f0f0f;
> n += n >> 8;
> n += n >> 16;
> return (n & 0xff);
> }
>

Nice optimization job on that parallel adder, Chris!

I'm sure I'll find other gems as I read that posting!

For myself, and on behalf of others, thanks for
that post!

Rufus