Richard Heathfield wrote:

> Skarmander said:

>

>> Dag-Erling Smørgrav wrote:

>>> This one is slightly wasteful of memory, but is computed entirely at

>>> compile time:

>>>

>>> #include <limits.h>

>>> char number[1+(sizeof(int)*CHAR_BIT)/3+1];

>

> That looks familiar, but... well, enough of that later.

>

>>> The resulting array will be large enough to store the null-terminated

>>> decimal representation of any integer in the range (INT_MIN,INT_MAX).

>>> Proof of this is left as an exercise for the reader.

>> Actually, please show why this works, especially why this works for

>> INT_MIN. It's easy enough to see N / 3 + 1 can approximate

>> ceil(log10(2^N)), but I can't get the details quite right for signed

>> integers. I'm sure I'm overlooking something obvious.

>

> I can explain it easily enough, for the simple reason that I invented it.

> (I'm quite sure there are other people who've invented it too, and

> long before I did, but at any rate I derived it independently, and so I

> know why it works.)

>

> An int comprises sizeof(int) * CHAR_BIT bits. Since three bits can always

> represent any octal digit, and leaving signs and terminators aside for a

> second, it is clear that (sizeof(int) * CHAR_BIT) / 3 characters are

> sufficient to store the octal representation of the number (but see below).

> Since decimal can represent more efficiently than octal, what's good enough

> for octal is also good enough for decimal.

>

> Now we add in 1 for the sign, and 1 for the null terminator, and that's

> where the above expression comes from.

>
Interesting. Yes, that's probably more straightforward than the brute force

I applied: to represent a number n in a base b, you need ceil(log_b(|n|))

digits (for n != 0). sizeof(int) * CHAR_BIT gives us an upper bound for the

value bits of an integer, and hence an upper bound for (U)INT_MAX: 2^N with

N = sizeof(int) * CHAR_BIT.

Now to represent N-bit integers in decimal, we need at most ceil(log10(2^N))

= ceil(N * log10(2)) = floor(N * log10(2)) + 1 = floor(N / log2(10)) + 1

digits (the ceil/floor step is safe because N * log10(2) is never an

integer). "3" (that is, log2(8)) is exactly right for octal, and as you say,

good enough for decimal. Better approximations for log2(10) give closer

bounds. This does not take the sign into account.

> BUT: consider the possibility that the integer division truncates (which it

> will do if sizeof(int) * CHAR_BIT is not a multiple of 3). Under such

> circumstances, you could be forgiven for wanting some bodge factor in

> there. That's why I use:

>

> char number[1 + (sizeof(int) * CHAR_BIT + 2) / 3 + 1];

>

> The + 2 is always sufficient to counter the effects of any truncation after

> division by 3, but doesn't inflate the result by more than one character at

> most.

>
<snip>

Yes, and I won't dispute the practicality of this approach as opposed to my

mysterious logarithm approximations, but the challenge is this: prove that

no bodge factor is necessary (it isn't). sizeof(int) * CHAR_BIT / 3 + 1

always gives enough space for all digits and a sign if an integer type is

greater than 11 bits, which it's guaranteed to be (except for char, of

course, and you do need correction for signed char if CHAR_BIT is

.)

If I'm not mistaken, this is the same as proving that ceil((N - 1) *

log10(2)) + 1 <= N / 3 + 1 for all N >= 12 (one value bit less for signed

integers, but one character more). Like I said, I'm sure I'm overlooking

something simple, some transformation that will highlight this.

I guess I should have paid better attention in math class. I like to stick

to boolean algebra...

S.