
# double cast to int reliable?

Ersek, Laszlo

 06-04-2010
On Thu, 3 Jun 2010, Keith Thompson wrote:

> Seebs writes:
>> For plain float, on the systems I've tried, the boundary seems to be
>> about 2^24; 2^24+1 cannot be represented exactly in a 32-bit float. I
>> wouldn't be surprised to find that double came out somewhere near
>> 2^48+1 as the first positive integer value that couldn't be
>> represented.
>
> It's more likely to be 2^53-1, assuming IEEE floating-point; look at the
> values of FLT_MANT_DIG and DBL_MANT_DIG.

It's my turn to sigh now. For some reason I failed both to notice and to
remember DBL_MANT_DIG, which IIRC is on the same page of the standard as
LDBL_DIG.
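
To make the role of FLT_MANT_DIG and DBL_MANT_DIG concrete, here is a minimal sketch that probes the boundary discussed above. The helper names (radix_pow, successor_is_lost, successor_is_lost_f) are made up for illustration, and the reasoning assumes FLT_RADIX is 2.

```c
#include <float.h>

/* FLT_RADIX ** digits, built by repeated multiplication so each step is exact. */
double radix_pow(int digits)
{
    double r = 1.0;
    while (digits-- > 0)
        r *= FLT_RADIX;
    return r;
}

/* Nonzero if n + 1 is NOT exactly representable as a double.
   The volatile store forces the sum to be rounded to a real double,
   so extra precision in registers cannot hide the rounding. */
int successor_is_lost(double n)
{
    volatile double next = n + 1.0;
    return next - n != 1.0;
}

/* The same probe for float. */
int successor_is_lost_f(float n)
{
    volatile float next = n + 1.0f;
    return next - n != 1.0f;
}
```

With radix 2, successor_is_lost(radix_pow(DBL_MANT_DIG)) should be nonzero while successor_is_lost(radix_pow(DBL_MANT_DIG) - 1.0) is zero. For IEEE 754 (FLT_MANT_DIG 24, DBL_MANT_DIG 53) that puts the first unrepresentable positive integer at 2^24 + 1 for float and 2^53 + 1 for double.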

We could simply check if

2 ** (sizeof(utype) * CHAR_BIT) <= FLT_RADIX ** DBL_MANT_DIG

That is,

sizeof(utype) * CHAR_BIT <= log2(FLT_RADIX) * DBL_MANT_DIG

or perhaps even

log_b(2) * (sizeof(utype) * CHAR_BIT) <= DBL_MANT_DIG

where log_b is the logarithm to base b = FLT_RADIX (not the C99 logb()
function).

We could pre-check if FLT_RADIX is 2, and if so, simply omit
log2(FLT_RADIX) or log_b(2), and compare integers. If not, then perhaps we
should first ask the environment to round towards zero or -Inf for the
log2(FLT_RADIX) formula, or towards +Inf for the log_b(2) formula.
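
The integer-only comparison for the FLT_RADIX == 2 case might look like the sketch below. The macro name FITS_EXACTLY_IN_DOUBLE is hypothetical, and the type argument stands in for utype. Note that sizeof(T) * CHAR_BIT also counts any padding bits, so the check can only err on the cautious side.

```c
#include <float.h>
#include <limits.h>

/* Nonzero if every value of the unsigned integer type T converts to
   double without loss. Exact when FLT_RADIX == 2. For a larger radix,
   FLT_RADIX ** DBL_MANT_DIG is at least 2 ** DBL_MANT_DIG, so the test
   is still sufficient, just stricter than necessary. */
#define FITS_EXACTLY_IN_DOUBLE(T) \
    ((int)(sizeof(T) * CHAR_BIT) <= DBL_MANT_DIG)

/* Example use with a hypothetical utype. */
typedef unsigned long long utype;

int utype_fits_in_double(void)
{
    return FITS_EXACTLY_IN_DOUBLE(utype);
}
```

Because both sides are integer constant expressions, the result is known at compile time, so no rounding-mode games are needed in this case.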

Sorry,
lacos

Nobody

 06-04-2010
On Wed, 02 Jun 2010 16:58:02 -0700, kathir wrote:

>> Is there any chance that "i" will not equal "j" due to the double
>> being stored inexactly?
>
> Floating point numbers are stored differently internally, using a
> mantissa and an exponent. If you do any floating point calculation
> (multiplication and division) and convert back to integer, you will
> see a minor difference between the int and double values.

No, not "will", but "might".

There are plenty of cases where you *won't* see a difference.

So long as you aren't using Borland C, the compiler won't just introduce
random errors for the hell of it.

OTOH, if you *are* using Borland C:

2. 12.0/3.0 = 3.99999...

[i.e. it will calculate x/y as x*(1.0/y), and there is no solution except
to use a better compiler.]
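
To illustrate why computing x/y as x*(1.0/y) matters for an integer cast, here is a small sketch. It does not reproduce any particular compiler's code generation; the function names are made up, and it assumes IEEE 754 doubles with default rounding.

```c
/* Correctly rounded division. */
double div_direct(double x, double y)
{
    volatile double q = x / y;
    return q;
}

/* Division done as multiplication by a reciprocal. The reciprocal is
   rounded once, and the product is rounded again, so two rounding
   errors can accumulate. volatile keeps each step in double precision. */
double div_by_reciprocal(double x, double y)
{
    volatile double r = 1.0 / y;
    volatile double q = x * r;
    return q;
}

/* Truncating cast, as in the original question. */
int to_int(double d)
{
    return (int)d;
}
```

Under those assumptions, div_direct(49.0, 49.0) is exactly 1.0, whereas div_by_reciprocal(49.0, 49.0) lands just below 1.0, so to_int() truncates it to 0 instead of 1.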