Velocity Reviews > C++ > testing if just one bit is set...

# testing if just one bit is set...

James Kanze
Guest
Posts: n/a

 11-07-2008
On Nov 6, 10:55 pm, Jeff Schwab <(E-Mail Removed)> wrote:
> Juha Nieminen wrote:
> > Jeff Schwab wrote:
> >> I've only considered it for two's complement.

> > Exactly how does two's complement representation kick in
> > with unsigned values?

> Are you asking why the representation is relevant? As far as
> I know, all of the representations allowed by the Standard are
> equivalent for purposes of this discussion.

Yes and no. For signed values, the bit pattern of -n will
depend on the representation. For unsigned values, the standard
defines -i to be 2^n-i (where n is the number of value bits in
the unsigned). Which by a curious bit of chance() just
happens to correspond to what you'd get for 2's complement
signed values.

> The only one I've really considered is two's complement,
> though. That doesn't mean I think there's any particular
> problem with the other allowed representations, just that I
> don't know enough about them to know whether there are any
> gotchas.

On thing is immediately certain, the bit pattern of -n will
depend on the representation if the type is signed. Which means
that it probably will not work.

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze
Guest
Posts: n/a

 11-07-2008
On Nov 6, 11:47 pm, Jeff Schwab <(E-Mail Removed)> wrote:
> Andrey Tarasevich wrote:
> > Well, unsigned values in C++ have one and only one [allowed]
> > representation. Which is why some people might find any
> > mention of "other representations" to be confusing
> > (equivalent or not).

> We're not only discussing unsigned values.

> The OP was specifically asking about the number of bits set in
> a 32-bit int; by definition, that depends on the
> representation.

Of course, the obvious solution is to cast to unsigned, and then
procede. Except that if he really is concerned with signed
values, casting to unsigned may change the bit pattern (and thus
the number of bits). A reinterpret_cast involving references
might do the trick (to make the compiler access the value as if
it were unsigned), although even there, on at least one machine,
unsigned is implemented by simply masking out the sign bit (i.e.
UINT_MAX == INT_MAX).

Given that the original poster specified 32 bits, however, I
rather suspect that total portability wasn't a concern; all of
the 32 bit machines I know today use 2's complement, where
converting an int to an unsigned doesn't change the bit pattern.

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Kai-Uwe Bux
Guest
Posts: n/a

 11-07-2008
blargg wrote:

> In article <phHQk.18329\$(E-Mail Removed)>,
> "Andrew Koenig" <(E-Mail Removed)> wrote:
>
>> ".rhavin grobert" <(E-Mail Removed)> wrote in message
>> news:(E-Mail Removed)...
>>
>> > guess you have a processor that can handle 32bit natively and you have
>> > a 32-bit-int. im now looking for some *ultrafast* way to determine if
>> > an int has more than one bit set. any ideas?

>>
>> If n has an unsigned type (i.e. unsigned int or unsigned long), then
>> (n&-n) is equal to n unless n has more than one bit set.
>> So the expression you're looking for is n!=(n&-n)

>
>
> (n-1)&n
>
> which probably involves fewer operations (negate+and+compare versus
> subtract+and)? Non-zero if more than one bit is set.

For the OP, that would work too. Which one is faster, only measurement can
show.

As for the two methods, n & -n has the nice feature of having all but the
lowest order bit in n equal to 0. Thus, it provides essentially a way to
extract the lowest order bit.

Best

Kai-Uwe Bux

James Kanze
Guest
Posts: n/a

 11-07-2008
On Nov 7, 12:59 pm, Jeff Schwab <(E-Mail Removed)> wrote:
> James Kanze wrote:
> > On Nov 6, 11:47 pm, Jeff Schwab <(E-Mail Removed)> wrote:
> >> Andrey Tarasevich wrote:
> >>> Well, unsigned values in C++ have one and only one [allowed]
> >>> representation. Which is why some people might find any
> >>> mention of "other representations" to be confusing
> >>> (equivalent or not).

> >> We're not only discussing unsigned values.

> > Of course, the obvious solution is to cast to unsigned, and
> > then procede. Except that if he really is concerned with
> > signed values, casting to unsigned may change the bit
> > pattern (and thus the number of bits). A reinterpret_cast
> > involving references might do the trick (to make the
> > compiler access the value as if it were unsigned),

> Is that UB? It reminds me of the old "access the other member
> of the union" trick.

Yes and no. Casting a pointer or a reference is the
"officially" sanctionned way of type punning, but of course, you
don't get any guarantees with regards to the meaning of the
results in general. In this one particular case, you are
guaranteed that 1) the size of a signed integral type and its
corresponding unsigned is the same, and 2) that non-negative
values will have the same representation. So unless the sign
bit acts funny, you should be OK. (The C standard more or less
sanctions it, too, in that it allows you to pass signed integers
to match a "%x" format specifier.)

> > although even there, on at least one machine, unsigned is
> > implemented by simply masking out the sign bit (i.e.
> > UINT_MAX == INT_MAX).

> What machine is that? I don't doubt that such platforms
> exist, but that sounds more exotic than anything I've
> targeted.

Unisys MCP. It's an interesting machine; it descends from the
old Burroughs line of machines, uses a stack instead of
registers, and has a tagged architecture; there's only one add,
sub, etc. instruction, which works for floating point or
integral values. I've not got access to one to do any testing,
but given the way it works, I suspect that an overflow in
integral arithmetic could lead to some very strange behavior.

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze
Guest
Posts: n/a

 11-07-2008
On Nov 7, 12:01 pm, (E-Mail Removed) (blargg) wrote:
> James Kanze wrote:

> [...]
> but because two's complement is the most
> natural representation.

What does "the most natural representation" mean? I would have
thought that the most natural representation would be base 10
(which is forbidden for ints).

2's complement has one very big problem: - INT_MIN isn't
representable. 1's complement and signed magnitude have to
consider signed 0, but IMHO, that's a lesser problem; in
addition, it can be handled by hardware in a transparent manner.

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze
Guest
Posts: n/a

 11-09-2008
On Nov 8, 12:02*pm, (E-Mail Removed) (blargg) wrote:
> James Kanze wrote:
> > On Nov 7, 12:01 pm, (E-Mail Removed) (blargg) wrote:
> > > James Kanze wrote:

> > > [...]
> > > but because two's complement is the most natural
> > > representation.

> > What does "the most natural representation" mean?

> Two's complement is the natural way to represent negative
> values, since things like add, subtract, multiply (both by a
> power of two using a shift, and by an arbitrary value) can use
> the same hardware as the unsigned equivalents.

In other words, it's not the most natural, it's the cheapest.

> [...]> 2's complement has one very big problem: - INT_MIN isn't
> > representable.

> [...]

> How is this a big problem? Since trying to generate a value
> outside the range of a signed type is UB, one could just treat
> the most negative as UB by making INT_MIN == -INT_MAX.

To begin with, you can't write the value directly as an integral
literal. And you have to special case it in a lot of cases.

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Andrew Koenig
Guest
Posts: n/a

 11-14-2008
"Victor Bazarov" <(E-Mail Removed)> wrote in message
news:gevfrd\$q4n\$(E-Mail Removed)...

>> If n has an unsigned type (i.e. unsigned int or unsigned long), then
>> (n&-n) is equal to n unless n has more than one bit set.
>> So the expression you're looking for is n!=(n&-n)

> Wow... Does it work for any representation (two's complement, one's
> complement, signed magnitude)?

Just 2's complement -- but that's what everyone uses these days anyway.

Andrew Koenig
Guest
Posts: n/a

 11-14-2008
"Andrey Tarasevich" <(E-Mail Removed)> wrote in message
news:gevisn\$hss\$(E-Mail Removed)...

> Unsigned values don't vary in representation. "Two's complement, one's
> complement, signed magnitude" come into play with signed representations
> only.

The question is whether the meaning of -n depends on the representation of
signed values.

Andrey Tarasevich
Guest
Posts: n/a

 11-14-2008
Andrew Koenig wrote:
>
>> Unsigned values don't vary in representation. "Two's complement, one's
>> complement, signed magnitude" come into play with signed representations
>> only.

>
> The question is whether the meaning of -n depends on the representation of
> signed values.

And the answer is no. Within the pre-condition of 'n' being of type
'unsigned int' or 'unsigned long' (set in your original message) it
doesn't. The meaning of unary '-' for these types is defined by the
language specification without any dependency on signed representations,
meaning that the result is always the same regardless of signed
representation used by the implementation.

--
Best regards,
Andrey Tarasevich

James Kanze
Guest
Posts: n/a

 11-14-2008
On Nov 14, 6:48*pm, "Andrew Koenig" <(E-Mail Removed)> wrote:
> "Victor Bazarov" <(E-Mail Removed)> wrote in message

> news:gevfrd\$q4n\$(E-Mail Removed)...

> >> If n has an unsigned type (i.e. unsigned int or unsigned
> >> long), then (n&-n) is equal to n unless n has more than one
> >> bit set. So the expression you're looking for is n!=(n&-n)

> > Wow... *Does it work for any representation (two's
> > complement, one's complement, signed magnitude)?

> Just 2's complement -- but that's what everyone uses these
> days anyway.

Andy, what's up? In your posting, you specifically said "If n
has an unsigned type". And you know very well that the standard
defines - for unsigned types exactly.

And not everyone uses 2's complement. There are at least two
architectures being sold today that don't: one with 36 bit 1's
complement, and another with 48 bit signed magnitude. (Since
they're mainframes, a lot of programmers don't see them, but
they are there.)

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34