# How to extract bytes from long?

Jirka Klaue

 10-17-2003
Jirka Klaue wrote:
...
> (uc)-1 == (uc)(0 - 1) == (uc)(UINT_MAX + 1 - 1) == (uc)(UINT_MAX) == UCHAR_MAX

s/UI/I/g

Jirka

Samuel Barber

 10-17-2003
(E-Mail Removed) (Samuel Barber) wrote in message news:<(E-Mail Removed)>...
> pete <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> > -1 is a value, not a bit pattern.
> > The value of negative one, cast to unsigned char, is UCHAR_MAX.

>
> Hello? The point is that -1 is ***being used as*** a bit pattern. The
> intent is to get "all 1s", which is true if the integer representation
> is 2's complement; that's an implicit assumption of the code. (This is
> the best reason not to use -1: the intent is not perfectly clear).

(Quoting myself)

Please disregard this part of my reply. I misinterpreted what Pete was saying.

Sam

Samuel Barber

 10-17-2003
"Arthur J. O'Dwyer" <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> > > Samuel Barber wrote:

> > How can a bitwise operation trap? It can't.

>
> Of course it can! (Why wouldn't it? And do modern digital computers
> perform any operations that *aren't* bitwise, anyway?)

The C bitwise operators are &, |, ^, and ~ (>> and << are also
included in this category, although bitwise is a misnomer in the case
of shifts). "Trap" in the context of this discussion seems to mean
"detect an illegal value"; well, there are only two values of interest
to bitwise operators (0 and 1), and they are both legal. So how can it
trap?

Sam

Samuel Barber

 10-17-2003
Sheldon Simms <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> 6.2.6.2 Integer types
> 2 ... (It is implementation-defined) whether the value with ...
> sign bit 1 and all value bits 1 (for one's complement), is a trap
> representation or a normal value. In the case of ... one's
> complement, if this representation is a normal value it is called
> a negative zero.
> ...
> 3 If the implementation supports negative zeros, they shall be
> generated only by: the &, |, ^, ~, <<, and >> operators with
> arguments that produce such a value;
> ...
> 4 If the implementation does not support negative zeros, the
> behavior of the &, |, ^, ~, <<, and >> operators with arguments
> that would produce such a value is undefined.

Okay, but if we are to believe this hocus pocus, there is no way to
avoid the hypothetical trapping. It makes no difference whether you
use (unsigned char)-1, (unsigned char)~0, or UCHAR_MAX, since they all
evaluate to the same thing. All are equally right or equally wrong.

Sam

Irrwahn Grausewitz

 10-17-2003
(E-Mail Removed) (Samuel Barber) wrote:

>"Arthur J. O'Dwyer" <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
>> > > Samuel Barber wrote:
>> > How can a bitwise operation trap? It can't.

>>
>> Of course it can! (Why wouldn't it? And do modern digital computers
>> perform any operations that *aren't* bitwise, anyway?)

>
>The C bitwise operators are &, |, ^, and ~ (>> and << are also
>included in this category, although bitwise is a misnomer in the case
>of shifts).

Not from the standard's POV. In fact, they are referred to as "bitwise
shift operators" explicitly.

>"Trap" in the context of this discussion seems to mean
>"detect an illegal value"; well, there are only two values of interest
>to bitwise operators (0 and 1), and they are both legal. So how can it
>trap?

Read C99 6.2.6.2, Sheldon Simms already quoted the relevant parts in

Regards
--
Irrwahn
((E-Mail Removed))

Chris Torek

 10-17-2003
>Sheldon Simms <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
>> 6.2.6.2 Integer types
>> 2 ... (It is implementation-defined) whether the value with ...
>> sign bit 1 and all value bits 1 (for one's complement), is a trap
>> representation or a normal value. In the case of ... one's
>> complement, if this representation is a normal value it is called
>> a negative zero. ...

In article <(E-Mail Removed) >,
Samuel Barber <(E-Mail Removed)> wrote:
>Okay, but if we are to believe this hocus pocus, there is no way to
>avoid the hypothetical trapping. It makes no difference whether you
>use (unsigned char)-1, (unsigned char)~0, or UCHAR_MAX, since they all
>evaluate to the same thing. All are equally right or equally wrong.

You appear to have an incorrect "mental model" of how C works.
This is not surprising; I suspect most people do. (One needs to
have worked with those oddball ones'-complement machines to really
have a feel for this stuff.)

The language is not defined in terms of "what happens on a PDP-11",
nor "what happens on a VAX", nor even "what happens on an x86 or
other CPU produced within the last few years". Rather, it is defined
in terms of an "abstract machine". A C compiler writer must map
from "abstract machine" to "real machine" in some way.

The section quoted above (along with others) defines how the abstract
machine is to work. In the abstract machine, writing:

~0

means:

- make an int with the value 0
- now, flip all the bits

This process *can* give rise to a "trap representation" on a ones'
complement machine.

On the other hand, writing:

-1

means:

- make an int with the value 1
- now, negate it

This process *must* produce the (ordinary signed int) value -1.
On a ones' complement machine, this value in binary is a sequence
of 1 bits followed by a zero, e.g., 111111111111111110 -- 17 1 bits
and then a 0 -- on an 18-bit-int ones' complement CPU. (The CPU
I am using as a model here is the Univac 11xx, which has 9, 18,
and 36 bit integers and *does* use ones' complement.)

Converting any ordinary signed int to type "unsigned char" *must*
produce a valid unsigned char bit pattern and value -- these are
defined as more or less the same thing in the abstract machine --
and the process by which a negative signed int is transformed into
a (positive) unsigned char is defined mathematically. If the
signed int has value -1, the result must be UCHAR_MAX, which is
a valid bit pattern that consists of all-1-bits, e.g., 111111111
(9 ones) on a 9-bit-byte ones' complement CPU.

At this point you are probably ready to hit your "post follow-up"
key or mouseable button or whatnot, saying: "What?! HOLD ON! JUST
A CONSARNED MINUTE! That all-1-bits pattern, you just said it's
a trap representation, now you say it's a valid value?!?" Yep.
How can it be both?

The answer lies in the *type* of the value. When the *type* of
the value is "signed int", an all-one-bits pattern is allowed to
be a "trap representation". When the type is "unsigned int", this
is *not* allowed. If the target CPU makes this a royal pain in
the butt, well, too bad for the C compiler implementor and/or user
-- "unsigned"s are going to be difficult and/or slow. But if you
*need* all-one-bits patterns, you -- as a C programmer -- should
use "unsigned" arithmetic, which is well-behaved and avoids all
these "trap representation" things. Moreover, given:

unsigned int ui = UINT_MAX;

the sequence:

ui++;

is *guaranteed* to cause ui to "roll over" to zero, without trapping
at runtime with an overflow error. With ordinary signed ints there
is no guarantee -- they may "roll over" (from positive to negative
or vice versa) or they may trap at runtime, whichever the implementor
finds easier or "better".

At the edges, the rules for C can get pretty complicated, but
there *are* simple answers for the common cases:

- If you need an ordinary signed integer and do not believe
you will overflow it, use an ordinary signed integer. (Use
"long" if your range is -2 billion to +2 billion; in C99, use
"long long" if your range is -9 quintillion to +9 quintillion.
Numerically these are 2147483647 and 9223372036854775807
respectively, in case you-the-reader are someone who uses
"milliard". Ordinary "int" is only guaranteed to handle
[-32767..+32767], even though it often handles the 2 billion
number.)

- If you need modular "clock arithmetic", use an unsigned integer.

- If you need to do bitwise operations, use an unsigned integer.

- If you need exact, precisely defined behavior in *all* cases,
use an unsigned integer, synthesizing your own signed values
from these if desired. (In other words, build your own ones'
or two's complement or sign-and-magnitude system.)

Incidentally, one trick proposed (but not actually used on the
Univac) for unsigned integers vs. trap representations is, e.g.,
to have "unsigned int" be only 17 bits, while ordinary signed int
is 18 bits. Then UINT_MAX and INT_MAX are the same number (!),
and "unsigned"ness is achieved mainly by forcing the sign bit to
stay off. This appears to be allowed by the C standard. It is
therefore possible that the "simple rules" *still* do not achieve
the desired effect, depending on what that desired effect might
be.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://67.40.109.61/torek/index.html (for the moment)
Reading email is like searching for food in the garbage, thanks to spammers.

cody

 10-17-2003
"Arthur J. O'Dwyer" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> However, it *does* produce the right answer, which is a point in
> its favor. ~0 might trap, and in any case I think (unsigned char)-1
> has a bit more aesthetic value to it (YMMV, of course).

Why should ~0 trap??? It results in the ones' complement of 0, which means all
bits are 1s. Padding bits are not affected by this operation; in any case, the
values of padding bits should never be of interest to you.

A trap representation can *only* be generated when manipulating the value
through pointers that aren't of the value's type or don't start at the
exact address of the value. Wrong usage of a union can also result in a
trap representation.

But arithmetic and bitwise operations can never result in a
trap representation.

--
cody

[Freeware, Games and Humor]
www.deutronium.de.vu || www.deutronium.tk

CBFalconer

 10-17-2003
pete wrote:
> CBFalconer wrote:
> > pete wrote:
> > > Nils Petter Vaskinn wrote:
> > > > On Wed, 15 Oct 2003 01:59:04 -0700, RB wrote:
> > > >
> > > > > How to extract bytes from long, starting from the last byte?
> > >
> > > #include <limits.h>
> > > > #include <stdio.h>
> > > >
> > > > int main() {
> > > >
> > > > unsigned long value = 0x12345678;
> > > > int i;
> > > >
> > > > printf("%#lx\n",value);
> > > >
> > > > for (i = sizeof value; i > 0; --i) {
> > > > printf("%#x\n", value & 0xff);
> > > > value >>= 8;
> > >
> > > /*
> > > ** You realise that you don't know the size of value,
> > > ** so you might as well go all the way.
> > > */
> > > printf("%#x\n", value & (unsigned char)-1);
> > > value >>= CHAR_BIT;
> > >
> > > > }
> > > > return 0;
> > > > }

> >
> > You need neither CHAR_BIT nor shifts nor limits.h nor sizeof:
> >
> > for (i = 8; i > 0; --i) {
> > printf("%x ", value % 256);
> > value /= 256;
> > }
> > putchar('\n'); /* <--AND HERE is where the \n goes */
> > return 0;
> > }
> >
> > and the result is portable.

>
> I was addressing the more general subject,
> in the subject line of this thread: "How to extract bytes from long?",
> rather than how to extract bytes from 0x12345678
> or any other number which doesn't require more than 32 bits.

You were not extracting bytes. You were extracting 8 bit
quantities, least significant part first. In other words you are
expressing the _value_ in base 256, so why not say so in the code?

If the compiler knows that it can improve the code by using
shifts, it may do so.

--
Chuck F ((E-Mail Removed)) ((E-Mail Removed))
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Arthur J. O'Dwyer

 10-17-2003

On Fri, 17 Oct 2003, Jirka Klaue wrote:
>
> Arthur J. O'Dwyer wrote:
> ...
> > (uchar)-1 == (uchar)-((int)1) == (-(int)1)+UCHAR_MAX == UCHAR_MAX-1

>
> You must have miscalculated somewhere. (uc)-1 should be UCHAR_MAX.

Augh! I do that every time! Thanks.

(uc)-1 == (uc)(0 - 1) == (-1+UCHAR_MAX+1) == UCHAR_MAX

-Arthur

Arthur J. O'Dwyer

 10-17-2003

On Fri, 17 Oct 2003, cody wrote:
>
> "Arthur J. O'Dwyer" <(E-Mail Removed)> wrote...
> > However, it *does* produce the right answer, which is a point in
> > its favor. ~0 might trap, and in any case I think (unsigned char)-1
> > has a bit more aesthetic value to it (YMMV, of course).

>
> Why should ~0 trap??? it results in the 1's complement of 0 which
> means all bits are 1's

...which may be a trap representation on a ones'-complement
architecture.

> Padding bits are not affected by this operation, however the
> values of padding bits should never be of your interest.

Well, technically padding bits *might* be affected by the ~
operation, but the effect on the padding bits alone cannot create
a trap representation -- the system has to remember to do the
Right Thing with them in this case.

> A trap representation can *only* be generated when manipulating the
> value using pointers which aren't the type of the value or doesn't start
> at the exact address of the value.

Wrong. Signed integer overflow may create a trap value, for instance.

> Wrong usage of an union can also result in a trap representation.
>
> But all arithmetic or bitwise operations cannot result in a
> trap representation.

Wrong.

-Arthur