Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > How to read an integer one by one?

Reply
Thread Tools

How to read an integer one by one?

 
 
Skarmander
Guest
Posts: n/a
 
      12-08-2005
Richard Heathfield wrote:
> Skarmander said:
>
>> Dag-Erling Smørgrav wrote:
>>> This one is slightly wasteful of memory, but is computed entirely at
>>> compile time:
>>>
>>> #include <limits.h>
>>> char number[1+(sizeof(int)*CHAR_BIT)/3+1];

>
> That looks familiar, but... well, enough of that later.
>
>>> The resulting array will be large enough to store the null-terminated
>>> decimal representation of any integer in the range (INT_MIN,INT_MAX).
>>> Proof of this is left as an exercise for the reader.

>> Actually, please show why this works, especially why this works for
>> INT_MIN. It's easy enough to see N / 3 + 1 can approximate
>> ceil(log10(2^N)), but I can't get the details quite right for signed
>> integers. I'm sure I'm overlooking something obvious.

>
> I can explain it easily enough, for the simple reason that I invented it.
> (I'm quite sure there are other people who've invented it too, and
> long before I did, but at any rate I derived it independently, and so I
> know why it works.)
>
> An int comprises sizeof(int) * CHAR_BIT bits. Since three bits can always
> represent any octal digit, and leaving signs and terminators aside for a
> second, it is clear that (sizeof(int) * CHAR_BIT) / 3 characters are
> sufficient to store the octal representation of the number (but see below).
> Since decimal can represent more efficiently than octal, what's good enough
> for octal is also good enough for decimal.
>
> Now we add in 1 for the sign, and 1 for the null terminator, and that's
> where the above expression comes from.
>

Interesting. Yes, that's probably more straightforward than the brute force
I applied: to represent a number n in a base b, you need ceil(log_b(|n|))
digits (for n != 0). sizeof(int) * CHAR_BIT gives us an upper bound for the
value bits of an integer, and hence an upper bound for (U)INT_MAX: 2^N with
N = sizeof(int) * CHAR_BIT.

Now to represent N-bit integers in decimal, we need at most ceil(log10(2^N))
= ceil(N * log10(2)) = floor(N * log10(2)) + 1 = N / log2(10) + 1 bits. "3"
(log2() is exactly right for octal, and as you say, good enough for
decimal. Better approximations for log2(10) give closer bounds. This does
not take the sign into account.

> BUT: consider the possibility that the integer division truncates (which it
> will do if sizeof(int) * CHAR_BIT is not a multiple of 3). Under such
> circumstances, you could be forgiven for wanting some bodge factor in
> there. That's why I use:
>
> char number[1 + (sizeof(int) * CHAR_BIT + 2) / 3 + 1];
>
> The + 2 is always sufficient to counter the effects of any truncation after
> division by 3, but doesn't inflate the result by more than one character at
> most.
>

<snip>
Yes, and I won't dispute the practicality of this approach as opposed to my
mysterious logarithm approximations, but the challenge is this: prove that
no bodge factor is necessary (it isn't). sizeof(int) * CHAR_BIT / 3 + 1
always gives enough space for all digits and a sign if an integer type is
greater than 11 bits, which it's guaranteed to be (except for char, of
course, and you do need correction for signed char if CHAR_BIT is .

If I'm not mistaken, this is the same as proving that ceil((N - 1) *
log10(2) + 1) <= N / 3 for all N >= 12 (one value bit less for signed
integers, but one character more). Like I said, I'm sure I'm overlooking
something simple, some transformation that will highlight this.

I guess I should have paid better attention in math class. I like to stick
to boolean algebra...

S.
 
Reply With Quote
 
 
 
 
=?iso-8859-1?q?Dag-Erling_Sm=F8rgrav?=
Guest
Posts: n/a
 
      12-08-2005
Skarmander <(E-Mail Removed)> writes:
> Dag-Erling Smørgrav wrote:
> > #include <limits.h>
> > char number[1+(sizeof(int)*CHAR_BIT)/3+1];
> > The resulting array will be large enough to store the null-terminated
> > decimal representation of any integer in the range (INT_MIN,INT_MAX).
> > Proof of this is left as an exercise for the reader.

> Actually, please show why this works, especially why this works for
> INT_MIN. It's easy enough to see N / 3 + 1 can approximate
> ceil(log10(2^N)), but I can't get the details quite right for signed
> integers. I'm sure I'm overlooking something obvious.


There are three terms in the addition.

DES
--
Dag-Erling Smørgrav - http://www.velocityreviews.com/forums/(E-Mail Removed)
 
Reply With Quote
 
 
 
 
Skarmander
Guest
Posts: n/a
 
      12-08-2005
Dag-Erling Smørgrav wrote:
> Skarmander <(E-Mail Removed)> writes:
>> Dag-Erling Smørgrav wrote:
>>> #include <limits.h>
>>> char number[1+(sizeof(int)*CHAR_BIT)/3+1];
>>> The resulting array will be large enough to store the null-terminated
>>> decimal representation of any integer in the range (INT_MIN,INT_MAX).
>>> Proof of this is left as an exercise for the reader.

>> Actually, please show why this works, especially why this works for
>> INT_MIN. It's easy enough to see N / 3 + 1 can approximate
>> ceil(log10(2^N)), but I can't get the details quite right for signed
>> integers. I'm sure I'm overlooking something obvious.

>
> There are three terms in the addition.
>

So sizeof(int) * CHAR_BIT / 3 is the term supposed to approximate the number
of digits, and the +1 is for the sign? Hm. Crude.

S.
 
Reply With Quote
 
=?iso-8859-1?q?Dag-Erling_Sm=F8rgrav?=
Guest
Posts: n/a
 
      12-08-2005
Richard Heathfield <(E-Mail Removed)> writes:
> I can explain it easily enough, for the simple reason that I
> invented it.


That's a bit presumptive, don't you think? It's an obvious solution
to anyone who understands logarithms.

> An int comprises sizeof(int) * CHAR_BIT bits. Since three bits can
> always represent any octal digit, and leaving signs and terminators
> aside for a second, it is clear that (sizeof(int) * CHAR_BIT) / 3
> characters are sufficient to store the octal representation of the
> number (but see below). Since decimal can represent more
> efficiently than octal, what's good enough for octal is also good
> enough for decimal.


Octal never entered my mind; 3 is simply a good enough approximation
of log2(10).

> BUT: consider the possibility that the integer division truncates
> (which it will do if sizeof(int) * CHAR_BIT is not a multiple of
> 3). Under such circumstances, you could be forgiven for wanting some
> bodge factor in there. That's why I use:
>
> char number[1 + (sizeof(int) * CHAR_BIT + 2) / 3 + 1];


Turns out the bodge factor is never necessary on a two's complement
machine.

DES
--
Dag-Erling Smørgrav - (E-Mail Removed)
 
Reply With Quote
 
Jordan Abel
Guest
Posts: n/a
 
      12-08-2005
On 2005-12-08, Dag-Erling Smørgrav <(E-Mail Removed)> wrote:
> Turns out the bodge factor is never necessary on a two's complement
> machine.


two's complement has nothing to do with it.

take 32 bits.

32/3=10

your assertion implies [essentially] that 10 digits is sufficient for
the _octal_ representation of 2^32-1, which is 11 digits: 37777777777.

For n < 9, ceil(log[10](2^n)) exceeds floor(log[8](2^n))

this means if you try it with char rather than int, you'll get screwed.

on an 8-bit char system, CHAR_BIT/3+2=4, which is one fewer byte than
needed for SCHAR_MIN, "-128" (plus null terminator).
 
Reply With Quote
 
Richard Heathfield
Guest
Posts: n/a
 
      12-09-2005
Dag-Erling Smørgrav said:

> Richard Heathfield <(E-Mail Removed)> writes:
>> I can explain it easily enough, for the simple reason that I
>> invented it.

>
> That's a bit presumptive, don't you think?


Yes. Hence the qualification, which you snipped.

> It's an obvious solution to anyone who understands logarithms.


Lots of things are obvious even to people who don't. That doesn't stop
software companies from patenting them.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
 
Reply With Quote
 
=?iso-8859-1?q?Dag-Erling_Sm=F8rgrav?=
Guest
Posts: n/a
 
      12-09-2005
Jordan Abel <(E-Mail Removed)> writes:
> this means if you try it with char rather than int, you'll get screwed.


I never intended it to be used for char. I was relying on the fact
that int cannot be smaller than 16 bits.

DES
--
Dag-Erling Smørgrav - (E-Mail Removed)
 
Reply With Quote
 
Jordan Abel
Guest
Posts: n/a
 
      12-09-2005
On 2005-12-09, Dag-Erling Smørgrav <(E-Mail Removed)> wrote:
> Jordan Abel <(E-Mail Removed)> writes:
>> this means if you try it with char rather than int, you'll get screwed.

>
> I never intended it to be used for char. I was relying on the fact
> that int cannot be smaller than 16 bits.


well, it still has nothing to do with twos-complement. The extra factor
is still useful if you intend to use the same buffer for octal - and
it's practically free since it gets calculated at compile time

>
> DES

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Re: How include a large array? Edward A. Falk C Programming 1 04-04-2013 08:07 PM
How to use read() system call to read an 32bit integer Yin Zhu C Programming 15 09-12-2007 10:17 AM
CType(x,Integer) vs. Integer.Parse(x) =?Utf-8?B?Sm9l?= ASP .Net 7 02-07-2006 02:30 AM
How do I add an Integer to another Integer? Sebastian Stelzer Java 2 10-15-2004 01:17 PM
No Math.min(Integer, Integer)? =?ISO-8859-1?Q?Thomas_Gagn=E9?= Java 0 07-29-2003 07:46 PM



Advertisments