
# Casting integers to float.

Jonathan Fielder
Guest
Posts: n/a

 08-12-2003
Hi,

I have a 32 bit integer value and I wish to find the single precision
floating point value that is closest to but less than or equal to the
integer. I also have a similar case where I need to find the single
precision floating point value that is closest to but greater than or equal
to the integer. I believe that if I simply cast to a float, it may be
assigned the next higher or lower representable value, depending on
implementation.
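A minimal sketch of the effect described above, assuming IEEE 754 single precision and a double with at least 32 significand bits. The helper name cast_error is hypothetical and only illustrates the rounding:

```c
#include <stdint.h>

/* Hypothetical helper: signed error introduced by converting i to float.
   Both conversions to double are assumed exact (double holds every
   32 bit integer), so the subtraction is exact as well. */
double cast_error(int32_t i)
{
    float f = (float)i;               /* may round up or down */
    return (double)f - (double)i;     /* 0.0 when i is representable */
}
```

On an implementation that rounds to nearest with ties to even, cast_error(16777217) should come out negative and cast_error(16777219) positive, so a plain cast can land on either side of the integer.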

I am aware that if I use double precision floating point values then I
shouldn't have a problem because 32 bit integers can be represented exactly,
but I really need to use float.

Is there a simple method using standard C to achieve my goal?

Many thanks,

Jon.

Christian Bau

 08-12-2003
Jonathan Fielder wrote:

> Hi,
>
> I have a 32 bit integer value and I wish to find the single precision
> floating point value that is closest to but less than or equal to the
> integer. I also have a similar case where I need to find the single
> precision floating point value that is closest to but greater than or equal
> to the integer. I believe that if I simply cast to a float, it may be
> assigned the next higher or lower representable value, depending on
> implementation.
>
> I am aware that if I use double precision floating point values then I
> shouldn't have a problem because 32 bit integers can be represented exactly,
> but I really need to use float.
>
> Is there a simple method using standard C to achieve my goal?

Interesting problem. I think this should give the required result on
most or all correct C implementations.

float int32_to_float_rounddown (long i) {

    double d = (double) i;
    double e = d;
    float f;

    /* Lower e by 1 until its float conversion no longer lands above d. */
    while ((f = (float) e) > d)
        e -= 1.0;

    return f;
}

float int32_to_float_roundup (long i) {

    double d = (double) i;
    double e = d;
    float f;

    /* Raise e by 1 until its float conversion no longer lands below d. */
    while ((f = (float) e) < d)
        e += 1.0;

    return f;
}

d and e must be double so that the conversions from 32 bit integer and
from float are exact.
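For comparison, here is a sketch of a different technique that avoids the loop: round the integer itself to a value with no more significant bits than float can hold, then convert. This is not the method above. It assumes FLT_RADIX == 2 and FLT_MANT_DIG < 32 (true for IEEE 754 single precision), and the names bit_length, fit_magnitude, int32_to_float_floor and int32_to_float_ceil are hypothetical.

```c
#include <float.h>
#include <stdint.h>

/* Number of bits needed to write v (0 for v == 0). */
static int bit_length(uint64_t v)
{
    int n = 0;
    while (v) { n++; v >>= 1; }
    return n;
}

/* Round magnitude m down (up == 0) or up (up != 0) so that it has at
   most FLT_MANT_DIG significant bits. Assumes FLT_RADIX == 2. */
static uint64_t fit_magnitude(uint64_t m, int up)
{
    int extra = bit_length(m) - FLT_MANT_DIG;
    uint64_t mask;

    if (extra <= 0)
        return m;                     /* already fits exactly */
    mask = ((uint64_t)1 << extra) - 1;
    if (up)
        m += mask;                    /* a carry yields a power of two */
    return m & ~mask;                 /* clear the bits float cannot keep */
}

/* Largest float value <= i. */
float int32_to_float_floor(int32_t i)
{
    if (i >= 0)
        return (float)fit_magnitude((uint64_t)i, 0);
    /* floor of a negative value is minus the ceiling of its magnitude */
    return -(float)fit_magnitude((uint64_t)(-(int64_t)i), 1);
}

/* Smallest float value >= i. */
float int32_to_float_ceil(int32_t i)
{
    if (i >= 0)
        return (float)fit_magnitude((uint64_t)i, 1);
    return -(float)fit_magnitude((uint64_t)(-(int64_t)i), 0);
}
```

Because the value handed to the final cast is already representable, the conversion does not depend on the implementation's rounding direction. The magnitude is widened to 64 bits so that negating INT32_MIN and rounding INT32_MAX upward cannot overflow.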

Kevin Easton

 08-12-2003
Jonathan Fielder wrote:
> Hi,
>
> I have a 32 bit integer value and I wish to find the single precision
> floating point value that is closest to but less than or equal to the
> integer. I also have a similar case where I need to find the single
> precision floating point value that is closest to but greater than or equal
> to the integer. I believe that if I simply cast to a float, it may be
> assigned the next higher or lower representable value, depending on
> implementation.

Will this work?

float f = myint;

if (f > (double)myint) {
    f -= FLT_EPSILON * myint;
}

- Kevin.

Tim Prince

 08-13-2003
Jonathan Fielder wrote:

>
> I have a 32 bit integer value and I wish to find the single precision
> floating point value that is closest to but less than or equal to the
> integer.

float f = myint -.25
> I also have a similar case where I need to find the single
> precision floating point value that is closest to but greater than or
> equal
> to the integer.

float f = myint +.25
> I believe that if I simply cast to a float, it may be
> assigned the next higher or lower representable value, depending on
> implementation.

Only for some of the values satisfying myint > 1/FLT_EPSILON, assuming a
sane implementation, such as any IEEE compliant one.
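A small sketch for checking which integers are affected, assuming double represents every 32 bit integer exactly. The function name is_exact_in_float is hypothetical:

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when i survives a round trip through
   float unchanged, 0 when the conversion had to round. */
int is_exact_in_float(int32_t i)
{
    float f = (float)i;
    return (double)f == (double)i;
}
```

With IEEE 754 single precision (24 significand bits), values of small magnitude should all report 1, and above 2^24 only multiples of the local float spacing should.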

--
Tim Prince

Christian Bau

 08-13-2003
Tim Prince wrote:

> Jonathan Fielder wrote:
>
>
> >
> > I have a 32 bit integer value and I wish to find the single precision
> > floating point value that is closest to but less than or equal to the
> > integer.

> float f = myint -.25

Wrong result if myint = 1

> > I also have a similar case where I need to find the single
> > precision floating point value that is closest to but greater than or
> > equal
> > to the integer.

> float f = myint +.25

Wrong result if myint = 1

> > I believe that if I simply cast to a float, it may be
> > assigned the next higher or lower representable value, depending on
> > implementation.

> Only for some of the values satisfying myint > 1/FLT_EPSILON, assuming a
> sane implementation, such as any IEEE compliant one.

Kevin Easton

 08-13-2003
Tim Prince wrote:
> Jonathan Fielder wrote:
>
>
>>
>> I have a 32 bit integer value and I wish to find the single precision
>> floating point value that is closest to but less than or equal to the
>> integer.

> float f = myint -.25

If myint = 1, that gives 0.75 as f. There are many values representable
in float that are closer to 1.0 than 0.75, whilst still being less than
or equal to 1.0.
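The counterexample can be reproduced with a short sketch; the wrapper name quarter_below is hypothetical and simply packages the quoted expression:

```c
/* Hypothetical wrapper around the quoted "myint - .25" suggestion. */
float quarter_below(long myint)
{
    return (float)(myint - .25);   /* computed in double, then narrowed */
}
```

For myint = 1 this returns 0.75, although 1 is itself representable in float and is therefore the value that was wanted.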

- Kevin.

Kevin Easton

 08-13-2003
Christian Bau wrote:
> Jonathan Fielder wrote:
>
>> Hi,
>>
>> I have a 32 bit integer value and I wish to find the single precision
>> floating point value that is closest to but less than or equal to the
>> integer. I also have a similar case where I need to find the single
>> precision floating point value that is closest to but greater than or equal
>> to the integer. I believe that if I simply cast to a float, it may be
>> assigned the next higher or lower representable value, depending on
>> implementation.
>>
>> I am aware that if I use double precision floating point values then I
>> shouldn't have a problem because 32 bit integers can be represented exactly,
>> but I really need to use float.
>>
>> Is there a simple method using standard C to achieve my goal?

>
> Interesting problem. I think this should give the required result on
> most or all correct C implementations.
>
> float int32_to_float_rounddown (long i) {
>
> double d = (double) i;
> double e = d;
> float f;
>
> while ((f = (float) e) > d)
> e -= 1.0;

Why do you think that 1.0 is the smallest amount you will have to
subtract from e to make it less than i ?

- Kevin.

Jirka Klaue

 08-13-2003
Kevin Easton wrote:
> Christian Bau wrote:
>>Jonathan Fielder wrote:

....
>>>I have a 32 bit integer value and I wish to find the single precision
>>>floating point value that is closest to but less than or equal to the
>>>integer. I also have a similar case where I need to find the single
>>>precision floating point value that is closest to but greater than or equal
>>>to the integer. I believe that if I simply cast to a float, it may be
>>>assigned the next higher or lower representable value, depending on
>>>implementation.
>>>
>>>I am aware that if I use double precision floating point values then I
>>>shouldn't have a problem because 32 bit integers can be represented exactly,
>>>but I really need to use float.
>>>
>>>Is there a simple method using standard C to achieve my goal?

>>
>>Interesting problem. I think this should give the required result on
>>most or all correct C implementations.
>>
>>float int32_to_float_rounddown (long i) {
>>
>> double d = (double) i;
>> double e = d;
>> float f;
>>
>> while ((f = (float) e) > d)
>> e -= 1.0;

>
> Why do you think that 1.0 is the smallest amount you will have to
> subtract from e to make it less than i ?

float f = i;
double d = i;

while (f > (double)i) {
    d -= 1;
    f = d;
}

while (f + FLT_EPSILON != f && f + FLT_EPSILON < (double)i)
    f += FLT_EPSILON;

Jirka

Christian Bau

 08-13-2003
Kevin Easton wrote:

> Christian Bau wrote:
> > Jonathan Fielder wrote:
> >
> >> Hi,
> >>
> >> I have a 32 bit integer value and I wish to find the single precision
> >> floating point value that is closest to but less than or equal to the
> >> integer. I also have a similar case where I need to find the single
> >> precision floating point value that is closest to but greater than or
> >> equal
> >> to the integer. I believe that if I simply cast to a float, it may be
> >> assigned the next higher or lower representable value, depending on
> >> implementation.
> >>
> >> I am aware that if I use double precision floating point values then I
> >> shouldn't have a problem because 32 bit integers can be represented
> >> exactly,
> >> but I really need to use float.
> >>
> >> Is there a simple method using standard C to achieve my goal?

> >
> > Interesting problem. I think this should give the required result on
> > most or all correct C implementations.
> >
> > float int32_to_float_rounddown (long i) {
> >
> > double d = (double) i;
> > double e = d;
> > float f;
> >
> > while ((f = (float) e) > d)
> > e -= 1.0;

>
> Why do you think that 1.0 is the smallest amount you will have to
> subtract from e to make it less than i ?

This makes three assumptions:

1. Every integer that fits into about 32 bits can be stored exactly in
   a "double" variable.
2. Adding 1 to or subtracting 1 from such a variable produces the exact
   result.
3. The type float has the following property: there are two numbers
   fmin and fmax such that every integer x with fmin <= x <= fmax can be
   represented in a variable of type float, and no non-integer value
   less than fmin or greater than fmax can be represented.

Under these assumptions the float nearest to i on either side is itself
an integer, so stepping e by 1.0 must eventually reach it.

That would be the case for any simple floating point representation that
I have ever seen, and it wouldn't matter if it is binary, base 10, base
sixteen or whatever. (I know there are implementations of long double
that work differently).
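The assumptions can be probed on a given implementation with the <float.h> macros. The following is only a sketch; the helper names radix_pow, float_exact_int_limit and double_holds_int32 are hypothetical, and taking FLT_RADIX^FLT_MANT_DIG as fmax is an assumption about a conventional radix/significand format:

```c
#include <float.h>

/* FLT_RADIX raised to the given number of digits, as a double. */
static double radix_pow(int digits)
{
    double r = 1.0;
    while (digits-- > 0)
        r *= FLT_RADIX;
    return r;
}

/* Candidate fmax: every integer of magnitude up to FLT_RADIX^FLT_MANT_DIG
   fits in float's significand, and beyond it float holds only integers. */
double float_exact_int_limit(void)
{
    return radix_pow(FLT_MANT_DIG);
}

/* Nonzero when double's significand is wide enough for any 32 bit integer. */
int double_holds_int32(void)
{
    return radix_pow(DBL_MANT_DIG) >= 4294967296.0;   /* 2^32 */
}
```

On an IEEE 754 implementation float_exact_int_limit() should be 16777216 (2^24) and double_holds_int32() should be nonzero; other formats may give different limits.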