Richard Heathfield wrote:
> Skarmander said:
>
>> Dag-Erling Smørgrav wrote:
>>> This one is slightly wasteful of memory, but is computed entirely at
>>> compile time:
>>>
>>> #include <limits.h>
>>> char number[1+(sizeof(int)*CHAR_BIT)/3+1];
>
> That looks familiar, but... well, enough of that later.
>
>>> The resulting array will be large enough to store the null-terminated
>>> decimal representation of any integer in the range (INT_MIN,INT_MAX).
>>> Proof of this is left as an exercise for the reader.
>> Actually, please show why this works, especially why this works for
>> INT_MIN. It's easy enough to see N / 3 + 1 can approximate
>> ceil(log10(2^N)), but I can't get the details quite right for signed
>> integers. I'm sure I'm overlooking something obvious.
>
> I can explain it easily enough, for the simple reason that I invented it.
>
(I'm quite sure there are other people who've invented it too, and
> long before I did, but at any rate I derived it independently, and so I
> know why it works.)
>
> An int comprises sizeof(int) * CHAR_BIT bits. Since three bits can always
> represent any octal digit, and leaving signs and terminators aside for a
> second, it is clear that (sizeof(int) * CHAR_BIT) / 3 characters are
> sufficient to store the octal representation of the number (but see below).
> Since decimal can represent more efficiently than octal, what's good enough
> for octal is also good enough for decimal.
>
> Now we add in 1 for the sign, and 1 for the null terminator, and that's
> where the above expression comes from.
>
Interesting. Yes, that's probably more straightforward than the brute force
I applied: to represent a number n in a base b, you need ceil(log_b(|n|))
digits (for n != 0). sizeof(int) * CHAR_BIT gives us an upper bound for the
value bits of an integer, and hence an upper bound for (U)INT_MAX: 2^N with
N = sizeof(int) * CHAR_BIT.
Now to represent N-bit integers in decimal, we need at most ceil(log10(2^N))
= ceil(N * log10(2)) = floor(N * log10(2)) + 1 = N / log2(10) + 1 bits. "3"
(log2(

) is exactly right for octal, and as you say, good enough for
decimal. Better approximations for log2(10) give closer bounds. This does
not take the sign into account.
> BUT: consider the possibility that the integer division truncates (which it
> will do if sizeof(int) * CHAR_BIT is not a multiple of 3). Under such
> circumstances, you could be forgiven for wanting some bodge factor in
> there. That's why I use:
>
> char number[1 + (sizeof(int) * CHAR_BIT + 2) / 3 + 1];
>
> The + 2 is always sufficient to counter the effects of any truncation after
> division by 3, but doesn't inflate the result by more than one character at
> most.
>
<snip>
Yes, and I won't dispute the practicality of this approach as opposed to my
mysterious logarithm approximations, but the challenge is this: prove that
no bodge factor is necessary (it isn't). sizeof(int) * CHAR_BIT / 3 + 1
always gives enough space for all digits and a sign if an integer type is
greater than 11 bits, which it's guaranteed to be (except for char, of
course, and you do need correction for signed char if CHAR_BIT is

.
If I'm not mistaken, this is the same as proving that ceil((N - 1) *
log10(2) + 1) <= N / 3 for all N >= 12 (one value bit less for signed
integers, but one character more). Like I said, I'm sure I'm overlooking
something simple, some transformation that will highlight this.
I guess I should have paid better attention in math class. I like to stick
to boolean algebra...
S.