Velocity Reviews > integers to binary

# integers to binary

Old Wolf
Guest
Posts: n/a

 12-13-2005
rayw wrote:

[n is a char]
>>> itoa(n, buffer, 2);

>>
>> Doesn't your library have any documentation?

>
> Yes, it does, but I was hoping to get some theory here as to
> what it's doing: I don't have the source.

The relevant line in the documentation is:

char *itoa(int value, char *string, int radix);

and in particular, that itoa is expecting 'value' to be an int.

> For example, if you run the program 'as posted' and enter
> 255 for the value, the output is:
>
> [compiler 1: gcc]
> 11111111111111111111111111111111
>
> [compiler 2: msvc]
> 11111111111111111111111111112111
>
> If you enter something like 128, you get this
> [compiler 1: gcc]
> 11111111111111111111111110000000
>
> [compiler 2: msvc]
> 11111111111111111111111110001000
>
> but for 127, you get this
> [compiler 1: gcc]
> 1111111
> [compiler 2: msvc]
> 1111111
>
> So, I *think* there's 'weird stuff going on' - and that's about as
> descriptive as I can get.

128 and 255 are not valid values for 'char'. char values are in
the range -128 to 127[*]. You cause your program to be
unreliable when you assign out-of-range values to chars.

Here is a fuller description of what is happening, please bear
in mind that this behaviour is peculiar to some systems only
(eg. IA32), and doesn't hold in the general case.
When you assign the int 128 to your char, the compiler tries
to squish 0x00000080 into eight bits, which results in 0x80,
which is the value -128 when used as the bit-pattern for a char.
So the itoa function receives the value of -128. To find out
what itoa does when you give it a negative number you will
itoa function but it only specifies negative number behaviour

Based on your output I'd say that it is printing the bit pattern
of the negative number, and in binary the int -128 is indeed
11111111111111111111111110000000 .
Since your buffer is less than 33 bytes, anything could
happen really, you're lucky that GCC even managed to
not screw up the end of the buffer, although MSVC clearly
did (as it is entitled to, of course).

A similar explanation applies to the 255 case (which gets
the value -1, and has bit-pattern of all 1s).
[*] In general they can be in a variety of ranges, and the actual
range is specified by CHAR_MIN and CHAR_MAX in limits.h .
But it's clear from your post that your system has this range.

Old Wolf
Guest
Posts: n/a

 12-13-2005
rayw wrote:

> I've been trying to write a program that converts an entered number
> into its binary representation using three different techniques - bit
> shifting, mod 2, and itoa..

I have some other comments on y our code which

> #include <stdio.h >

No space !

> #include <limits.h>
> #include <stdlib.h>
> #include <string.h>
> #define I_TYPE char

Most importantly, you should use an unsigned type
for anything that you are planning to do bit-shifts on,
because signed types cause trouble as you have seen.

It is usually better to use a typedef:
typedef unsigned int I_TYPE;

Also you will find it easier to start off with an int, and
then work down to a char (everything in C is designed
to work best with ints).

>
> #define K sizeof(I_TYPE) * CHAR_BIT
>
> // Forward decs.
> //
> char * binary1(I_TYPE n);
> char * binary2(I_TYPE n);
> char * binary3(I_TYPE n);
> char * strrev (char * str);

Strictly speaking you should not call your functions anything
starting with str followed by a lower case letter, because
it might clash with future updates to string.h .

> int main(void)
> {
> char buffer[100];
> puts("Enter numbers to convert to binary, or hit enter to quit");
>
> while(strlen(fgets(&buffer[0], sizeof(buffer), stdin)) > 1)

fgets() returns NULL on failure.. you'll need to re-arrange this test.
Also you can just write 'buffer' everywhere you have &buffer[0],
it means exactly the same thing in most contexts.

> {
> I_TYPE l = (I_TYPE)strtol(&buffer[0], NULL, 10);

Don't use this cast. It doesn't gain you anything.

> puts(binary1(l));
> puts(binary2(l));
> puts(binary3(l));
> }
>
> return 0;
> }
>
> char * binary1(I_TYPE n)
> {
> int i = 0;
> unsigned long int j = 1;
> static char buffer[K + 1];
>
> buffer[sizeof(buffer) - 1] = '\0';
> for(i = 0; i < K; i++)
> {
> buffer[i] = (n & j) == j ? '1' : '0';
> j = j << 1;

This will give you unexpected results if n is negative -- which is
why I_TYPE should be unsigned.

Also it's possible to be more concise (and clearer IMHO):

buffer[i] = (n & j) ? '1' : '0';
j <<= 1;

> }
>
> return strrev(buffer);
> }
>
> char * binary2(I_TYPE n)
> {
> int i;
> static char buffer[K + 1];
> buffer[sizeof(buffer) - 1] = '\0';
>
> for(i = K - 1; i >= 0; n /= (unsigned long int)2)

You can write 2UL instead of the cast notation.

> buffer[i--] = n % 2 == 0 ? '0' : '1';
> return buffer;
> }
>
> char * strrev(char * s)
> {
> char * p1;
> char * p2;
>
> if (!s || !*s)
> return s;
> else
> for(p1 = s, p2 = s + strlen(s) - 1; p2 > p1; ++p1, --p2)
> {
> *p1 ^= *p2;
> *p2 ^= *p1;
> *p1 ^= *p2;
> }

Please don't do that: it just confuses people who haven't
seen it before. Far clearer is:

char tmp = *p1;
*p1 = *p2;
*p2 = tmp;

>
> return s;
> }

Niklas Norrthon
Guest
Posts: n/a

 12-14-2005
Mark McIntyre <(E-Mail Removed)> writes:

> On Tue, 13 Dec 2005 15:42:49 -0000, in comp.lang.c , "rayw"
> <(E-Mail Removed)> wrote:

> > buffer[i] = (n & j) == j ? '1' : '0';

>
> if ( (n & j) == j )
> buffer[i]='1';
> else
> buffer[i]='0';

buffer[i] = ((n & j) == j) + '0';

Style issue.

/Niklas Norrthon

rayw
Guest
Posts: n/a

 12-14-2005

"Old Wolf" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) oups.com...
> rayw wrote:
>
> [n is a char]
>>>> itoa(n, buffer, 2);
>>>
>>> Doesn't your library have any documentation?

>>
>> Yes, it does, but I was hoping to get some theory here as to
>> what it's doing: I don't have the source.

>
> The relevant line in the documentation is:
>
> char *itoa(int value, char *string, int radix);
>
> and in particular, that itoa is expecting 'value' to be an int.
>
>> For example, if you run the program 'as posted' and enter
>> 255 for the value, the output is:
>>
>> [compiler 1: gcc]
>> 11111111111111111111111111111111
>>
>> [compiler 2: msvc]
>> 11111111111111111111111111112111
>>
>> If you enter something like 128, you get this
>> [compiler 1: gcc]
>> 11111111111111111111111110000000
>>
>> [compiler 2: msvc]
>> 11111111111111111111111110001000
>>
>> but for 127, you get this
>> [compiler 1: gcc]
>> 1111111
>> [compiler 2: msvc]
>> 1111111
>>
>> So, I *think* there's 'weird stuff going on' - and that's about as
>> descriptive as I can get.

>
> 128 and 255 are not valid values for 'char'. char values are in
> the range -128 to 127[*]. You cause your program to be
> unreliable when you assign out-of-range values to chars.
>
> Here is a fuller description of what is happening, please bear
> in mind that this behaviour is peculiar to some systems only
> (eg. IA32), and doesn't hold in the general case.
> When you assign the int 128 to your char, the compiler tries
> to squish 0x00000080 into eight bits, which results in 0x80,
> which is the value -128 when used as the bit-pattern for a char.
> So the itoa function receives the value of -128. To find out
> what itoa does when you give it a negative number you will
> itoa function but it only specifies negative number behaviour
>
> Based on your output I'd say that it is printing the bit pattern
> of the negative number, and in binary the int -128 is indeed
> 11111111111111111111111110000000 .
> Since your buffer is less than 33 bytes, anything could
> happen really, you're lucky that GCC even managed to
> not screw up the end of the buffer, although MSVC clearly
> did (as it is entitled to, of course).
>
> A similar explanation applies to the 255 case (which gets
> the value -1, and has bit-pattern of all 1s).
>
>[*] In general they can be in a variety of ranges, and the actual
> range is specified by CHAR_MIN and CHAR_MAX in limits.h .
> But it's clear from your post that your system has this range.

Many thanks - clear now!

rayw
Guest
Posts: n/a

 12-14-2005

"Old Wolf" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) oups.com...
> rayw wrote:
>
>> I've been trying to write a program that converts an entered number
>> into its binary representation using three different techniques - bit
>> shifting, mod 2, and itoa..

>
> I have some other comments on y our code which
> hopefully you will find helpful.
>
>> #include <stdio.h >

>
> No space !
>
>> #include <limits.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #define I_TYPE char

>
> Most importantly, you should use an unsigned type
> for anything that you are planning to do bit-shifts on,
> because signed types cause trouble as you have seen.
>
> It is usually better to use a typedef:
> typedef unsigned int I_TYPE;
>
> Also you will find it easier to start off with an int, and
> then work down to a char (everything in C is designed
> to work best with ints).
>
>>
>> #define K sizeof(I_TYPE) * CHAR_BIT
>>
>> // Forward decs.
>> //
>> char * binary1(I_TYPE n);
>> char * binary2(I_TYPE n);
>> char * binary3(I_TYPE n);
>> char * strrev (char * str);

>
> Strictly speaking you should not call your functions anything
> starting with str followed by a lower case letter, because
> it might clash with future updates to string.h .
>
>> int main(void)
>> {
>> char buffer[100];
>> puts("Enter numbers to convert to binary, or hit enter to quit");
>>
>> while(strlen(fgets(&buffer[0], sizeof(buffer), stdin)) > 1)

>
> fgets() returns NULL on failure.. you'll need to re-arrange this test.
> Also you can just write 'buffer' everywhere you have &buffer[0],
> it means exactly the same thing in most contexts.
>
>> {
>> I_TYPE l = (I_TYPE)strtol(&buffer[0], NULL, 10);

>
> Don't use this cast. It doesn't gain you anything.
>
>> puts(binary1(l));
>> puts(binary2(l));
>> puts(binary3(l));
>> }
>>
>> return 0;
>> }
>>
>> char * binary1(I_TYPE n)
>> {
>> int i = 0;
>> unsigned long int j = 1;
>> static char buffer[K + 1];
>>
>> buffer[sizeof(buffer) - 1] = '\0';
>> for(i = 0; i < K; i++)
>> {
>> buffer[i] = (n & j) == j ? '1' : '0';
>> j = j << 1;

>
> This will give you unexpected results if n is negative -- which is
> why I_TYPE should be unsigned.
>
> Also it's possible to be more concise (and clearer IMHO):
>
> buffer[i] = (n & j) ? '1' : '0';
> j <<= 1;
>
>> }
>>
>> return strrev(buffer);
>> }
>>
>> char * binary2(I_TYPE n)
>> {
>> int i;
>> static char buffer[K + 1];
>> buffer[sizeof(buffer) - 1] = '\0';
>>
>> for(i = K - 1; i >= 0; n /= (unsigned long int)2)

>
> You can write 2UL instead of the cast notation.
>
>> buffer[i--] = n % 2 == 0 ? '0' : '1';
>> return buffer;
>> }
>>
>> char * strrev(char * s)
>> {
>> char * p1;
>> char * p2;
>>
>> if (!s || !*s)
>> return s;
>> else
>> for(p1 = s, p2 = s + strlen(s) - 1; p2 > p1; ++p1, --p2)
>> {
>> *p1 ^= *p2;
>> *p2 ^= *p1;
>> *p1 ^= *p2;
>> }

>
> Please don't do that: it just confuses people who haven't
> seen it before. Far clearer is:
>
> char tmp = *p1;
> *p1 = *p2;
> *p2 = tmp;
>
>>
>> return s;
>> }

Simon Biber
Guest
Posts: n/a

 12-14-2005
Niklas Norrthon wrote:
> Mark McIntyre <(E-Mail Removed)> writes:
>>On Tue, 13 Dec 2005 15:42:49 -0000, in comp.lang.c , "rayw"
>><(E-Mail Removed)> wrote:
>>> buffer[i] = (n & j) == j ? '1' : '0';

>
>>if ( (n & j) == j )
>> buffer[i]='1';
>>else
>> buffer[i]='0';

>
> buffer[i] = ((n & j) == j) + '0';
>
> Style issue.

Indeed it is a style issue. However, your contribution is too clever for
its own good. It's correct, but it's harder to read and harder to
understand. Remember that C code is written 10% for the compiler and 90%
for the maintenance programmer. Understanding this code requires some
1. a Boolean expression has an integer value of 0 or 1
2. digits are sequential, ie. '0' + 1 == '1'
While any good C programmer knows these two facts, it takes time and
brain cycles to put it all together, which could be avoided by not being
so clever.

--
Simon.

pete
Guest
Posts: n/a

 12-14-2005
Simon Biber wrote:
>
> Niklas Norrthon wrote:
> > Mark McIntyre <(E-Mail Removed)> writes:
> >>On Tue, 13 Dec 2005 15:42:49 -0000, in comp.lang.c , "rayw"
> >><(E-Mail Removed)> wrote:
> >>> buffer[i] = (n & j) == j ? '1' : '0';

> >
> >>if ( (n & j) == j )
> >> buffer[i]='1';
> >>else
> >> buffer[i]='0';

> >
> > buffer[i] = ((n & j) == j) + '0';
> >
> > Style issue.

>
> Indeed it is a style issue. However,
> your contribution is too clever for
> its own good. It's correct, but it's harder to read and harder to
> understand. Remember that C code is written
> 10% for the compiler and 90%
> for the maintenance programmer. Understanding this code requires some
> extra connections to be made:
> 1. a Boolean expression has an integer value of 0 or 1
> 2. digits are sequential, ie. '0' + 1 == '1'
> While any good C programmer knows these two facts, it takes time and
> brain cycles to put it all together,
> which could be avoided by not being so clever.

I liked "rayw"'s original version:

buffer[i] = (n & j) == j ? '1' : '0';

That's just about as simple a use of the conditional operator,
as there can be.
Some people just don't like the conditional operator,
but I'm not one of them.

--
pete

Tim Rentsch
Guest
Posts: n/a

 12-14-2005
Mark McIntyre <(E-Mail Removed)> writes:

> On Tue, 13 Dec 2005 15:42:49 -0000, in comp.lang.c , "rayw"
> <(E-Mail Removed)> wrote:
>
> >
> >"Pieter Droogendijk" <(E-Mail Removed)> wrote in message
> >news(E-Mail Removed) FOR.ukME...
> >> On Tue, 13 Dec 2005 10:21:09 +0000, rayw wrote:
> >>
> >>> buffer[i] = (n & j) == j ? '1' : '0';
> >>
> >> This expression could be nicer. How's this:
> >>
> >> if ( (n & j) == j ) {
> >> buffer[i] = '1';
> >> }
> >> else {
> >> buffer[i] = '0';
> >> }

> >
> >I agree that to some it's more readable, but as it's more lines of code, and
> >duplicates the assignment etc, I think it's more prone to errors later -
> >like if someone changes it.

>
> code. Write for readability first. Consider clever tricks only if
> trying to obfuscate code.
>
> if ( (n & j) == j )
> buffer[i]='1';
> else
> buffer[i]='0';

The original single assignment is precisely the sort of situation
where using the ?: operator makes sense. If Mr. McIntyre says he
finds the if/else form with two assignments more readable, I'm expect
that's so, but his reaction isn't shared by everyone. IMO the
single assignment using ?: is both more readable and more clear
than using if/else and two assignments.

Pieter Droogendijk
Guest
Posts: n/a

 12-14-2005
On Tue, 13 Dec 2005 15:42:49 +0000, rayw wrote:

>
> "Pieter Droogendijk" <(E-Mail Removed)> wrote in message
> news(E-Mail Removed) OR.ukME...
>> On Tue, 13 Dec 2005 10:21:09 +0000, rayw wrote:
>>>

<snap>

>> storage duration because it's been defined in main() already? Because you
>> 'scope'.

>
> Each binary?'()s local buffer is declared static as each of them returns
> &buffer[0] to the caller - so, nothing to do with scope, just don't want to
> return a frame variable.

Whoops.
Forget my comments about it. Erase them from Usenet. It was temporary
insanity. I was tired. It was the dog that done it.

>>> buffer[i] = (n & j) == j ? '1' : '0';

>>
>> This expression could be nicer. How's this:
>>
>> if ( (n & j) == j ) {
>> buffer[i] = '1';
>> }
>> else {
>> buffer[i] = '0';
>> }

>
> I agree that to some it's more readable, but as it's more lines of code, and
> duplicates the assignment etc, I think it's more prone to errors later -
> like if someone changes it.

Sure it's six lines for what could be one assignment, but it's as clear
as it's going to get. That's what I aim for, Unless obfuscation is what I
want.

But each to their own, of course.

>>> // itoa() not a standard ISO function.
>>> //
>>> itoa(n, buffer, 2);

>>
>> Indeed it isn't. I don't have it.
>>
>> Though according to http://www.mkssoftware.com/docs/man3/itoa.3.asp this
>> should work, unless I've gone blind.

>
> Thanks for the ref and the comments.

--
Pieter Droogendijk <pieter at binky dot org dot uk>
PGP/1E92DBBC [ Make way for the Emperor's Finest. ] binky.org.uk

Mark McIntyre
Guest
Posts: n/a

 12-14-2005
On 14 Dec 2005 09:14:23 +0100, in comp.lang.c , Niklas Norrthon
<(E-Mail Removed)> wrote:

>Mark McIntyre <(E-Mail Removed)> writes:
>
>> On Tue, 13 Dec 2005 15:42:49 -0000, in comp.lang.c , "rayw"
>> <(E-Mail Removed)> wrote:

>
>> > buffer[i] = (n & j) == j ? '1' : '0';

>
>>
>> if ( (n & j) == j )
>> buffer[i]='1';
>> else
>> buffer[i]='0';

>
>buffer[i] = ((n & j) == j) + '0';

cute, but less readable, IMHO. Better then the original tho.

>Style issue.

----== Posted via Newsfeeds.Com - Unlimited-Unrestricted-Secure Usenet News==----
http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups
----= East and West-Coast Server Farms - Total Privacy via Encryption =----