# Exact integer-valued floats

Steven D'Aprano

 09-21-2012
Python floats can represent exact integer values (e.g. 42.0), but above a
certain value (see below), not all integers can be represented. For
example:

py> 1e16 == 1e16 + 1 # no such float as 10000000000000001.0
True
py> 1e16 + 3 == 1e16 + 4 # or 10000000000000003.0
True

So some integers are missing from the floats. For large enough values,
the gap between floats is rather large, and many numbers are missing:

py> 1e200 + 1e10 == 1e200
True

The same applies to large enough negative values.

The question is, what is the largest integer N such that every
whole number between -N and N inclusive can be represented as a float?

If my tests are correct, that value is 9007199254740992.0 = 2**53.

Have I got this right? Is there a way to work out the gap between one
float and the next?

(I haven't tried to exhaustively check every float because, even at one
nanosecond per number, it will take over 200 days.)
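
A spot-check at the suspected boundary (far cheaper than the exhaustive
scan) is at least consistent with 2**53:

py> float(2**53 - 1) == float(2**53 - 2)  # still distinct below 2**53
False
py> float(2**53) == float(2**53 + 1)  # the first collision
True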

--
Steven

Ian Kelly

 09-21-2012
On Fri, Sep 21, 2012 at 11:29 AM, Steven D'Aprano
<(E-Mail Removed)> wrote:
> The question is, what is the largest integer N such that every
> whole number between -N and N inclusive can be represented as a float?
>
> If my tests are correct, that value is 9007199254740992.0 = 2**53.
>
> Have I got this right? Is there a way to work out the gap between one
> float and the next?

That looks mathematically correct. The "gap" between adjacent floats is
equivalent to a difference of 1 in the last bit of the significand. For a
floating point number represented as (sign * c * 2 ** q), where c is
an integer, the gap between floats is equal to 2 ** q. There are 53
bits of precision in a double-precision float (technically an implicit
1 followed by 52 bits), so q becomes greater than 0 at 2 ** 53.
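
One way to compute that gap directly is to recover q with math.frexp
(a sketch; it assumes a finite, normal double):

py> import math
py> def ulp(x):  # gap between x and the next float of larger magnitude
...     m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= abs(m) < 1
...     return math.ldexp(1.0, e - 53)  # weight of the last significand bit
...
py> ulp(2.0**52), ulp(2.0**53), ulp(1e16)
(1.0, 2.0, 2.0)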

Cheers,
Ian

Jussi Piitulainen

 09-21-2012
Steven D'Aprano writes:

> Python floats can represent exact integer values (e.g. 42.0), but above a
> certain value (see below), not all integers can be represented. For
> example:
>
> py> 1e16 == 1e16 + 1 # no such float as 10000000000000001.0
> True
> py> 1e16 + 3 == 1e16 + 4 # or 10000000000000003.0
> True
>
> So some integers are missing from the floats. For large enough values,
> the gap between floats is rather large, and many numbers are missing:
>
> py> 1e200 + 1e10 == 1e200
> True
>
> The same applies to large enough negative values.
>
> The question is, what is the largest integer N such that every
> whole number between -N and N inclusive can be represented as a float?
>
> If my tests are correct, that value is 9007199254740992.0 = 2**53.
>
> Have I got this right? Is there a way to work out the gap between one
> float and the next?

There is a way to find the distance between two IEEE floats in "ulps",
or "units in the last place", computable from the bit pattern using
integer arithmetic. I think it's then also possible to find the next
float up or down from a given one.

I don't have a link at hand, I'm too tired to search at the moment,
and I'm no expert on floats, but you might find an answer by looking
for ulps.

> (I haven't tried to exhaustively check every float because, even at one
> nanosecond per number, it will take over 200 days.)

Come to think of it, the difference between adjacent floats is exactly
one ulp. Just use the right unit.
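
If memory serves, the bit-pattern trick looks something like this
(a sketch; finite doubles only, and NaNs are not handled):

py> import struct
py> def ordinal(x):  # consecutive floats map to consecutive integers
...     (n,) = struct.unpack('<q', struct.pack('<d', x))
...     return n if n >= 0 else -(n & 0x7fffffffffffffff)  # fold negatives
...
py> ordinal(2.0**53 + 2) - ordinal(2.0**53)  # adjacent floats: 1 ulp apart
1
py> ordinal(1e16 + 2) - ordinal(1e16)
1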

Nobody

 09-21-2012
On Fri, 21 Sep 2012 17:29:13 +0000, Steven D'Aprano wrote:

> The question is, what is the largest integer N such that every
> whole number between -N and N inclusive can be represented as a float?
>
> If my tests are correct, that value is 9007199254740992.0 = 2**53.
>
> Have I got this right? Is there a way to work out the gap between one
> float and the next?

CPython's "float" type uses C's "double". For a system where C's "double"
is IEEE-754 double precision, N=2**53 is the correct answer.

An IEEE-754 double precision value consists of a 53-bit integer whose
first bit is a "1", multiplied or divided by a power of two.

http://en.wikipedia.org/wiki/IEEE_754-1985

The largest 53-bit integer is 2**53-1. 2**53 can be represented as
2**52 * 2**1. 2**53+1 cannot be represented in this form. 2**53+2 can be
represented as (2**52+1) * 2**1.

For values x where 2**52 <= x < 2**53, the interval between
representable values (aka Unit in the Last Place or ULP) is 1.0.
For 2**51 <= x < 2**52, the ULP is 0.5.
For 2**53 <= x < 2**54, the ULP is 2.0.
And so on.
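
Those intervals are easy to check interactively (the halfway cases
round to even, which is why the equalities below hold):

py> 2.0**51 + 0.5 == 2.0**51  # ULP is 0.5 here, so the 0.5 survives
False
py> 2.0**52 + 0.5 == 2.0**52  # ULP is 1.0; the 0.5 is rounded away
True
py> 2.0**53 + 1.0 == 2.0**53  # ULP is 2.0; the 1.0 is rounded away
True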

Dennis Lee Bieber

 09-21-2012
On 21 Sep 2012 17:29:13 GMT, Steven D'Aprano
<(E-Mail Removed)> declaimed the following in
gmane.comp.python.general:

>
> The question is, what is the largest integer N such that every
> whole number between -N and N inclusive can be represented as a float?
>

Single precision commonly has 7 significant (decimal) digits. Double
precision runs somewhere between 15 and 17 (decimal) significant digits.

> If my tests are correct, that value is 9007199254740992.0 = 2**53.
>

For an encoding of a double precision using one sign bit and an
8-bit exponent, you have 53 bits available for the mantissa. This
ignores the possibility of an implied msb in the mantissa (encodings
which normalize to put the leading 1-bit at the msb can on some machines
remove that 1-bit and shift the mantissa one more place; effectively
giving a 54-bit mantissa). Something like an old XDS Sigma-6 used
non-binary exponents (the exponent was a power of 16, i.e. 2^4) and a
"non-normalized" mantissa (it could have up to three leading
0-bits); this affected the decimal significance...

--
Wulfraed Dennis Lee Bieber AF6VN
(E-Mail Removed) HTTP://wlfraed.home.netcom.com/

Hans Mulder

 09-21-2012
On 21/09/12 22:26:26, Dennis Lee Bieber wrote:
> On 21 Sep 2012 17:29:13 GMT, Steven D'Aprano
> <(E-Mail Removed)> declaimed the following in
> gmane.comp.python.general:
>
>>
>> The question is, what is the largest integer N such that every
>> whole number between -N and N inclusive can be represented as a float?
>>

> Single precision commonly has 7 significant (decimal) digits. Double
> precision runs somewhere between 15 and 17 (decimal) significant digits.
>
>> If my tests are correct, that value is 9007199254740992.0 = 2**53.

The expression 2 / sys.float_info.epsilon produces exactly that
number. That's probably not a coincidence.
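
Indeed it isn't: epsilon is the gap between 1.0 and the next larger
float, i.e. 2**-52, so 2 / epsilon is exactly 2**53:

py> import sys
py> sys.float_info.epsilon == 2.0**-52
True
py> 2 / sys.float_info.epsilon == 2.0**53
True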

> For an encoding of a double precision using one sign bit and an
> 8-bit exponent, you have 53 bits available for the mantissa.

If your floats have 64 bits, and you use 1 bit for the sign and 8 for
the exponent, you'll have 55 bits available for the mantissa.

> This
> ignores the possibility of an implied msb in the mantissa (encodings
> which normalize to put the leading 1-bit at the msb can on some machines
> remove that 1-bit and shift the mantissa one more place; effectively
> giving a 54-bit mantissa).

My machine has 64-bit floats, using 1 bit for the sign, 11 for the
exponent, leaving 52 for the mantissa. The mantissa has an implied
leading 1, so it's nominally 53 bits.

You can find this number in sys.float_info.mant_dig:
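
py> import sys
py> sys.float_info.mant_dig
53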

> Something like an old XDS Sigma-6 used
> non-binary exponents (exponent was in power of 16 <> 2^4) and used
> "non-normalized" mantissa -- the mantissa could have up to three leading
> 0-bits); this affected the decimal significance...

Hope this helps,

-- HansM

Paul Rubin

 09-21-2012
Steven D'Aprano <(E-Mail Removed)> writes:
> Have I got this right? Is there a way to work out the gap between one
> float and the next?

Yes, 53-bit mantissa as people have mentioned. That tells you what ints
can be exactly represented. But, arithmetic in some situations can have
a 1-ulp error. So I wonder if it's possible that if n is large enough,
you might have something like n+1==n even if the integers n and n+1 have
distinct floating point representations.

Dennis Lee Bieber

 09-22-2012
On Fri, 21 Sep 2012 23:04:14 +0200, Hans Mulder <(E-Mail Removed)>
declaimed the following in gmane.comp.python.general:

> On 21/09/12 22:26:26, Dennis Lee Bieber wrote:

>
> > For an encoding of a double precision using one sign bit and an
> > 8-bit exponent, you have 53 bits available for the mantissa.

>
> If your floats have 64 bits, and you use 1 bit for the sign and 8 for
> the exponent, you'll have 55 bits available for the mantissa.
>

Mea culpa -- doing mental arithmetic too fast.

> > This
> > ignores the possibility of an implied msb in the mantissa (encodings
> > which normalize to put the leading 1-bit at the msb can on some machines
> > remove that 1-bit and shift the mantissa one more place; effectively
> > giving a 54-bit mantissa).

>
> My machine has 64-bits floats, using 1 bit for the sign, 11 for the
> exponent, leaving 52 for the mantissa. The mantissa has an implied
> leading 1, so it's nominally 53 bits.
>
> You can find this number in sys.float_info.mant_dig
>
> > Something like an old XDS Sigma-6 used
> > non-binary exponents (exponent was in power of 16 <> 2^4) and used
> > "non-normalized" mantissa -- the mantissa could have up to three leading
> > 0-bits); this affected the decimal significance...

>
> Hope this helps,
>
> -- HansM

--
Wulfraed Dennis Lee Bieber AF6VN
(E-Mail Removed) HTTP://wlfraed.home.netcom.com/

Steven D'Aprano

 09-22-2012
On Fri, 21 Sep 2012 15:23:41 -0700, Paul Rubin wrote:

> Steven D'Aprano <(E-Mail Removed)> writes:
>> Have I got this right? Is there a way to work out the gap between one
>> float and the next?

>
> Yes, 53-bit mantissa as people have mentioned. That tells you what ints
> can be exactly represented. But, arithmetic in some situations can have
> a 1-ulp error. So I wonder if it's possible that if n is large enough,
> you might have something like n+1==n even if the integers n and n+1 have
> distinct floating point representations.

I don't think that is possible for IEEE 754 floats, where integer
arithmetic is exact. But I'm not entirely sure, which is why I asked.
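
My reasoning, for what it's worth: every integer n with abs(n) <= 2**53
is exactly representable, and IEEE addition is correctly rounded, so
float(n) + 1.0 should land exactly on float(n + 1). A cheap randomized
spot-check (not a proof):

py> import random
py> all(float(n) + 1.0 == float(n + 1)
...     for n in (random.randrange(2**53 - 1) for _ in range(10**6)))
True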

For non-IEEE-754 floating point systems, there is no telling how bad the
implementation could be.

--
Steven

Dennis Lee Bieber

 09-22-2012
On 22 Sep 2012 01:36:59 GMT, Steven D'Aprano
<(E-Mail Removed)> declaimed the following in
gmane.comp.python.general:

>
> For non-IEEE-754 floating point systems, there is no telling how bad the
> implementation could be.

Let's see what can be found...

http://www.bitsavers.org/pdf/sds/sig..._Man_Jun71.pdf
pages 50-54
A sign bit, 7-bit offset base-16 exponent/characteristic, and a 24-bit
mantissa/fraction (for short floats; the long float has a 56-bit mantissa).

IBM 360: Same as Sigma-6 (no surprise; hearsay is the Sigma was
designed by renegade IBM folk; even down to using EBCDIC internally --
but with a much different interrupt system [224 individual interrupt
vectors as I recall, vs. the IBM's 7 vectors and polling to find which
device]).

Motorola Fast Floating Point (software library for the Amiga, and
apparently also used on early Palm units)
Sign bit, 7-bit binary exponent, 24-bit mantissa

VAX http://nssdc.gsfc.nasa.gov/nssdc/for...atingPoint.htm
(really nasty looking, as bit 6 is most significant, running down to bit
0, THEN bits 32-16 with 16 the least significant; extend for longer
formats)
F-float: sign bit, 8-bit exponent, 23 bits mantissa
D-float: as above but 55 bits mantissa
G-float: sign bit, 11-bit exponent, 52 bits mantissa
H-float: sign bit, 15-bit exponent, 112 bits mantissa

--
Wulfraed Dennis Lee Bieber AF6VN
(E-Mail Removed) HTTP://wlfraed.home.netcom.com/