# Convert binary char array to integer with reordering

Dave
Guest
Posts: n/a

 10-25-2005
Hi all,

I have a 4 byte char array with the binary data for two 16-bit signed
integers in it like this:

Index 3 2 1 0
Data Bh Bl Ah Al

Where Bh is the high byte of signed 16-bit integer B and so on.

I want to create 32-bit integers A and B with the data in the char
array.

I have tried things like (and various other permutations):

A = (data[1] << 8) | (unsigned int)data[0];
B = (data[3] << 8) | (unsigned int)data[2];

and this works except for when say data[1] = 0x00 and data[0] = 0x80.
In this case, Al gets sign extended all the way to the top of A giving
0xffffff80 which is wrong of course.

I read about the arithmetic conversions and I believe it is these that
are converting the right operand to signed and causing the sign
extension.
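
To make the effect concrete, here is a minimal sketch in C. The helper names are made up for illustration, and 8-bit chars are assumed. Converting a negative signed char straight to unsigned int wraps it to a large value, while going through unsigned char first keeps only the byte's bit pattern:

```c
/* Hypothetical helper: converts the low byte directly from signed char.
   A negative input wraps modulo UINT_MAX + 1, which looks like sign
   extension in the result. */
unsigned int low_byte_naive(signed char lo)
{
    return (unsigned int)lo;
}

/* Hypothetical helper: converts through unsigned char first, so only
   the low 8 bits survive (assuming CHAR_BIT == 8). */
unsigned int low_byte_masked(signed char lo)
{
    return (unsigned int)(unsigned char)lo;
}
```

With a 32-bit unsigned int, low_byte_naive(-128) gives 0xffffff80, whereas low_byte_masked(-128) gives 0x80.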

At the moment, I am getting things right like this:

int A;
int B;
char buildA[4], buildB[4], data[4];

// data[] gets filled here

buildA[0] = data[0];
buildA[1] = data[1];

if (data[1] >> 7)
{
buildA[2] = (char)0xff;
buildA[3] = (char)0xff;
}
else
{
buildA[2] = 0x00;
buildA[3] = 0x00;
}

buildB[0] = data[2];
buildB[1] = data[3];

if (data[3] >> 7)
{
buildB[2] = (char)0xff;
buildB[3] = (char)0xff;
}
else
{
buildB[2] = 0x00;
buildB[3] = 0x00;
}

A = *((int*)buildA);
B = *((int*)buildB);

Surely there is a cleaner way?

Dave

Kenny McCormack
Guest
Posts: n/a

 10-25-2005
In article <(E-Mail Removed). com>,
Dave <(E-Mail Removed)> wrote:
>Hi all,
>
>I have a 4 byte char array with the binary data for two 16-bit signed
>integers in it like this:
>
>Index 3 2 1 0
>Data Bh Bl Ah Al
>
>Where Bh is the high byte of signed 16-bit integer B and so on.

Not portable. Can't discuss it here. Blah, blah, blah.

Dick de Boer
Guest
Posts: n/a

 10-25-2005
If you convert a signed char to an unsigned int, the result is
sign-extended, because the char is signed. Try:
A = (data[1] << 8) | (unsigned char)data[0];
B = (data[3] << 8) | (unsigned char)data[2];
(Or make the array of type unsigned char)

Now, the unsigned char is default promoted to signed int, and the signed int
should have the same value as the unsigned char...
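
As a rough sketch of the same idea wrapped in a function (the name combine_s16 is hypothetical, and 8-bit chars are assumed): the high byte keeps its sign and the low byte contributes only its 8-bit pattern. This version multiplies by 256 instead of shifting, because left-shifting a negative int is undefined in C99; on a two's complement machine the result matches the shift-and-OR form.

```c
/* Hypothetical helper: builds a signed value from a signed high byte
   and the raw bit pattern of the low byte. */
int combine_s16(signed char hi, signed char lo)
{
    /* hi * 256 keeps the sign of the high byte; the cast makes the low
       byte contribute 0..255 rather than -128..127. */
    return hi * 256 + (unsigned char)lo;
}
```

For example, combine_s16(0, -128) is 128 rather than -128, and combine_s16(-1, -1) is -1.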

DickB

"Dave" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) ups.com...
> Hi all,
>
> I have a 4 byte char array with the binary data for two 16-bit signed
> integers in it like this:
>
> Index 3 2 1 0
> Data Bh Bl Ah Al
>
> Where Bh is the high byte of signed 16-bit integer B and so on.
>
> I want to create 32-bit integers A and B with the data in the char
> array.
>
> I have tried things like (and various other permutations):
>
> A = (data[1] << 8) | (unsigned int)data[0];
> B = (data[3] << 8) | (unsigned int)data[2];
>
> and this works except for when say data[1] = 0x00 and data[0] = 0x80.
> In this case, Al gets sign extended all the way to the top of A giving
> 0xffffff80 which is wrong of course.
>
> I read about the arithmetic conversions and I believe it is these that
> are converting the right operand to signed and causing the sign
> extension.
>
> At the moment, I am getting things right like this:
>
> int A;
> int B;
> char buildA[4], buildB[4], data[4];
>
> // data[] gets filled here
>
> buildA[0] = data[0];
> buildA[1] = data[1];
>
> if (data[1] >> 7)
> {
> buildA[2] = (char)0xff;
> buildA[3] = (char)0xff;
> }
> else
> {
> buildA[2] = 0x00;
> buildA[3] = 0x00;
> }
>
> buildB[0] = data[2];
> buildB[1] = data[3];
>
> if (data[3] >> 7)
> {
> buildB[2] = (char)0xff;
> buildB[3] = (char)0xff;
> }
> else
> {
> buildB[2] = 0x00;
> buildB[3] = 0x00;
> }
>
> A = *((int*)buildA);
> B = *((int*)buildB);
>
> Surely there is a cleaner way?
>
> Many thanks for your time,
>
> Dave
>

Jordan Abel
Guest
Posts: n/a

 10-25-2005
On 2005-10-25, Kenny McCormack <(E-Mail Removed)> wrote:
> In article <(E-Mail Removed). com>,
> Dave <(E-Mail Removed)> wrote:
>>Hi all,
>>
>>I have a 4 byte char array with the binary data for two 16-bit signed
>>integers in it like this:
>>
>>Index 3 2 1 0
>>Data Bh Bl Ah Al
>>
>>Where Bh is the high byte of signed 16-bit integer B and so on.

>
> Not portable. Can't discuss it here. Blah, blah, blah.

says who?

int16_t A = (data[1]<<8)+data[0];
int16_t B = (data[3]<<8)+data[2];

looks portable to me. chars have to be at least 8 bits [to represent values
from -127 to 127 signed, 0 to 255 unsigned, they have to be], and int16_t where
present is exactly 16 bits and signed. Now ideally you should be using unsigned
char for this, but I don't think it actually matters in this case [well, I
suppose negative zero could still be a trap representation on non twos
complement systems].

Now, the precise _meaning_ of those bytes with regards to the integer value you
end up with may differ in that negative numbers [in this case, where high bit
of byte 1 or 3 is set] can be represented in precisely three different ways
according to the standard, but assuming that he got the byte values in a
portable way in the first place and is using them on the same machine where he
generated them, he can put them back the same way he got them out, and assuming
he used a type guaranteed to be exactly 16 bits (say, c99 int16_t, <stdint.h>)
he loses no information in doing so.

While this exercise may seem pointless, it could be intended as a method of
serialization [in which case, though, he may wish to guarantee a particular
signed representation of his values as well as a byte order]
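
If serialization is the goal, one possible sketch of pinning down both the byte order and the signed representation looks like this. The function name put_s16_le is made up for illustration, the caller is assumed to pass a value in the range -32768 to 32767, and the output is two's complement with the low byte first:

```c
/* Hypothetical helper: writes v as a 16-bit two's complement value,
   low byte first, regardless of how the host represents negatives. */
void put_s16_le(unsigned char out[2], long v)
{
    /* Conversion to an unsigned type is defined as reduction modulo
       ULONG_MAX + 1, so the low 16 bits are the two's complement
       pattern of v. */
    unsigned long u = (unsigned long)v;

    out[0] = (unsigned char)(u & 0xffu);
    out[1] = (unsigned char)((u >> 8) & 0xffu);
}
```

Because the arithmetic is done on an unsigned type, nothing here depends on the host's byte order or on how it stores negative numbers.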

I can't imagine what (other than the possibility that char may be signed) is
actually the same as) x86 registers that made you think "non-portable!"?

--
How's that for my first post?

Dave
Guest
Posts: n/a

 10-25-2005
Hi Dick,

I used the code you quoted above (left data as char) and it worked
perfectly.

I think data needs to stay char so that the sign extension does happen
with data[1] and [3] as required.

Many thanks,

Dave

Richard Bos
Guest
Posts: n/a

 10-25-2005
Jordan Abel <(E-Mail Removed)> wrote:

> On 2005-10-25, Kenny McCormack <(E-Mail Removed)> wrote:
> > Not portable. Can't discuss it here. Blah, blah, blah.

>
> says who?

Says a loser with a chip on his shoulder who still hasn't got over being
told, once, that he himself posted something off-topic. Just ignore him
when he's in this mode.

Richard

Walter Roberson
Guest
Posts: n/a

 10-25-2005
In article <(E-Mail Removed)>,
Jordan Abel <(E-Mail Removed)> wrote:
>int16_t A = (data[1]<<8)+data[0];
>int16_t B = (data[3]<<8)+data[2];

>looks portable to me.

int16_t does not exist in C89, and in C99 it is optional.
It simply doesn't exist on C99 systems that have (say) 18 bit ints.
That makes it standardized but not portable.
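
One way to cope with that, sketched under the assumption of a C99 <stdint.h>: the limit macro INT16_MAX should only be defined when int16_t itself is provided, so it can serve as a compile-time test. The typedef name s16 is made up for this example.

```c
#include <limits.h>
#include <stdint.h>

#ifdef INT16_MAX
typedef int16_t s16;        /* exact-width type is available */
#else
typedef int_least16_t s16;  /* required by C99, may be wider than 16 bits */
#endif
```

Code using the fallback cannot assume the type is exactly 16 bits wide, only that it holds at least the range -32767 to 32767.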

--
Chocolate is "more than a food but less than a drug" -- RJ Huxtable

Richard Bos
Guest
Posts: n/a

 10-25-2005
"Dave" <(E-Mail Removed)> wrote:

> I want to create 32-bit integers A and B with the data in the char
> array.
>
> I have tried things like (and various other permutations):
>
> A = (data[1] << 8) | (unsigned int)data[0];
> B = (data[3] << 8) | (unsigned int)data[2];
>
> and this works except for when say data[1] = 0x00 and data[0] = 0x80.
> In this case, Al gets sign extended all the way to the top of A giving
> 0xffffff80 which is wrong of course.

> char buildA[4], buildB[4], data[4];

An alternative to the other solutions, perhaps safer because you have
less chance of running into signed integer overflow, is to make data
(and because of this, also buildA and buildB) arrays of unsigned char
instead. They won't get sign-extended then simply because they won't
have any sign.
Note also that, since A and B are signed ints, you do not know for
certain that they are 32 bits - use long or int32_t (or even
int_least32_t, which is guaranteed to exist under C99) to get around
this. What's worse, if you ever get an array that represents a value
that doesn't fit in 31 bits - that is, 32 minus the sign bit - you cause
undefined behaviour. Again, an unsigned type (uint_least32_t?) could be
a good solution.
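
A minimal sketch along those lines, assuming C99, a buffer of unsigned char, and little-endian two's complement data as in the original layout (the function name get_s16_le is hypothetical). The bytes are combined in an unsigned type, and the sign is applied by plain subtraction, so no step shifts a negative value or overflows a signed one:

```c
#include <stdint.h>

/* Hypothetical helper: decodes a 16-bit two's complement value stored
   low byte first, returning it in a type with at least 32 bits. */
int_least32_t get_s16_le(const unsigned char *p)
{
    /* Combine as unsigned; the masks guard against chars wider than 8 bits. */
    uint_least32_t u = ((uint_least32_t)(p[1] & 0xffu) << 8) | (p[0] & 0xffu);
    int_least32_t v = (int_least32_t)u;   /* 0..65535 always fits */

    if (v >= 0x8000)
        v -= 0x10000;                     /* map 32768..65535 to -32768..-1 */
    return v;
}
```

With data declared as unsigned char data[4], A would be get_s16_le(data) and B would be get_s16_le(data + 2).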

Richard

Kenny McCormack
Guest
Posts: n/a

 10-25-2005
In article <djljr4\$cuq\$(E-Mail Removed)>,
Walter Roberson <(E-Mail Removed)-cnrc.gc.ca> wrote:
>In article <(E-Mail Removed)>,
>Jordan Abel <(E-Mail Removed)> wrote:
>>int16_t A = (data[1]<<8)+data[0];
>>int16_t B = (data[3]<<8)+data[2];

>
>>looks portable to me.

>
>int16_t does not exist in C89, and in C99 it is optional.
>It simply doesn't exist on C99 systems that have (say) 18 bit ints.
>That makes it standardized but not portable.

Exactly.

Default User
Guest
Posts: n/a

 10-25-2005
Dick de Boer wrote:

> If you convert a signed char to an unsigned int, the result is
> sign-extended, because the char is signed. Try: A = (data[1] << 8) |
> (unsigned char)data[0]; B = (data[3] << 8) | (unsigned char)data[2];
> (Or make the array of type unsigned char)
>
> Now, the unsigned char is default promoted to signed int, and the
> signed int should have the same value as the unsigned char...
>
> DickB
>
> "Dave" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed) ups.com...
> > Hi all,

Please don't top-post. Your replies belong following or interspersed
with properly trimmed quotes.

Brian