# conversion problem int <-> double ?

Markus Dehmann
Guest
Posts: n/a

 04-01-2008
I have two integers i1 and i2, the second of which is guaranteed to be
between 0 and 99, and I encode them into one double:

double encoded = (double)i1 + (double)i2 / (double)100;

So, for example, 324 and 2 become 324.02. Now I want to decode them
using the function given below but it decodes the example as 324 and
1, instead of 324 and 2.

Can anyone tell me what's wrong and how to do this right? (my code see
below)

Thanks!
Markus

#include <cstdlib>
#include <iostream>

void decode(double n, int& i1, int& i2){
i1 = int(n);
double rest = n - int(n);
i2 = int(rest * 100.0); // i2 is 1, should be 2
}

int main(int argc, char** argv){
double n = 324.02;
int p;
int i;
decode(n, p, i);
std::cerr << "n=" << n <<", p=" << p << ", i=" << i << std::endl;
return EXIT_SUCCESS;
}

Greg Herlihy
Guest
Posts: n/a

 04-01-2008
On Mar 31, 7:04 pm, Markus Dehmann <(E-Mail Removed)> wrote:
> I have two integers i1 and i2, the second of which is guaranteed to be
> between 0 and 99, and I encode them into one double:
>
> double encoded = (double)i1 + (double)i2 / (double)100;
>
> So, for example, 324 and 2 become 324.02. Now I want to decode them
> using the function given below but it decodes the example as 324 and
> 1, instead of 324 and 2.

The problem is that floating point values usually have a binary
representation. So precise decimal values (such as 324.02) often
cannot be exactly represented with a floating point type. Instead, the
floating point type stores the representable value nearest to the
value specified (for example, the nearest representable value to
324.02 is likely 324.01999).

One solution would be to use decimal floating point arithmetic.
Decimal floating point arithmetic would be able to represent 324.02
exactly. But although support for decimal floating arithmetic is
likely coming to the C++ library, actual implementations of this
feature are not that common.

A more likely (and practical) solution might be to use "fixed-point"
arithmetic. Fixed point arithmetic is completely accurate - up to the
specified resolution. For example, performing the above calculation
with fixed point arithmetic (with 1/100 resolution) might look
something like this:

typedef long Fixed; // in 1/100ths of a unit

Fixed n = 32402;
long p = n/100;
long i = n%100;
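
As a rough sketch of how the fixed-point idea could replace the original encode/decode pair (the helper names encodeFixed and decodeFixed are made up for illustration, and the whole part is assumed to be non-negative):

```cpp
#include <cassert>

typedef long Fixed;  // value in 1/100ths of a unit, as in the typedef above

// Hypothetical helper: pack a whole part and a hundredths part (0..99)
// into one integer. Assumes whole >= 0 and that whole * 100 fits in a long.
Fixed encodeFixed(long whole, long hundredths) {
    assert(whole >= 0 && hundredths >= 0 && hundredths < 100);
    return whole * 100 + hundredths;
}

// Hypothetical helper: recover both parts exactly with integer
// division and remainder; no rounding is involved anywhere.
void decodeFixed(Fixed n, long& whole, long& hundredths) {
    whole = n / 100;
    hundredths = n % 100;
}
```

With these, encodeFixed(324, 2) yields 32402 and decodeFixed gives back 324 and 2.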

Greg

Jack Klein
Guest
Posts: n/a

 04-01-2008
On Mon, 31 Mar 2008 19:04:01 -0700 (PDT), Markus Dehmann
<(E-Mail Removed)> wrote in comp.lang.c++:

> I have two integers i1 and i2, the second of which is guaranteed to be
> between 0 and 99, and I encode them into one double:
>
> double encoded = (double)i1 + (double)i2 / (double)100;

This is not a very good idea, as you have found. The floating point
data types in C, and just about every other computer language, use a
fixed number of bits, and that limits their precision.

In particular, the floating point representation in almost all
computer systems, and certainly in all common ones programmed in C++,
use binary fractions. That means a value like .125, or .25, or .5 is
exactly representable in the fractional part of floating point values,
but fractions that are not 1/(a power of 2) are not. They get rounded
to the nearest binary fraction.

> So, for example, 324 and 2 become 324.02. Now I want to decode them
> using the function given below but it decodes the example as 324 and
> 1, instead of 324 and 2.

Actually, it does not become 324.02, it becomes some value slightly
greater or smaller than 324.02, because .02 cannot be exactly
represented in a binary fraction.

> Can anyone tell me what's wrong and how to do this right? (my code see
> below)

> Thanks!
> Markus
>
> #include <iostream>
>
> void decode(double n, int& i1, int& i2){
> i1 = int(n);
> double rest = n - int(n);
> i2 = int(rest * 100.0); // i2 is 1, should be 2
> }
>
> int main(int argc, char** argv){
> double n = 324.02;
> int p;
> int i;
> decode(n, p, i);
> std::cerr << "n=" << n <<", p=" << p << ", i=" << i << std::endl;
> return EXIT_SUCCESS;
> }
>

Look at this short program:

#include <iostream>
#include <iomanip>

int main()
{
double d = 304.0;
d += (2 / 100.0);

std::cout << "The value is " << std::setprecision(20) << d <<
std::endl;
return 0;
}

Here is the output of that program on my computer:

The value is 304.01999999999998

If you can't think of any better idea than trying to stick two integer
values into a double, and there is almost certainly a better way, here
are a few possible approaches:

1. Since one number is always between 0 and 99, you could multiply
the other number by 100 and add the second one. This will work if the
first value is not too large to fit into a double when multiplied by
100. You can calculate this by using the value of the macro DBL_DIG
in the <cfloat> or <float.h> header.

In my implementation, this value is 15, which means that a double can
hold a whole number value up to 999,999,999,999,999 with no loss of
precision. So if the first number is guaranteed not to be greater
than 1/100 of this value, approach 1 will work.
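
A minimal sketch of approach 1, assuming non-negative int inputs and an IEEE 754 style double (the names encodeScaled and decodeScaled are invented for this example):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: store i1 * 100 + i2 as a whole number in a double.
// Assumes i1 >= 0 and 0 <= i2 <= 99. For 32-bit int inputs the result is
// far below the largest whole number a typical double holds exactly.
double encodeScaled(int i1, int i2) {
    assert(i1 >= 0 && i2 >= 0 && i2 < 100);
    return static_cast<double>(i1) * 100.0 + i2;
}

// Hypothetical helper: split the whole number back into its two parts.
void decodeScaled(double n, int& i1, int& i2) {
    double low = std::fmod(n, 100.0);          // remainder in 0..99
    i2 = static_cast<int>(low);
    i1 = static_cast<int>((n - low) / 100.0);  // n - low is a multiple of 100
}
```

Because every intermediate value is a whole number, no fractional rounding comes into play, which is the point of this approach.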

2. If you must stick integer values into floating point fractions, do
not simply multiply them back up and assign them to an int. Assignment
to an integer type causes truncation, any fractional portion is just
chopped off. So .01999999999998 * 100 equals 1.999999999998 which
gets truncated to 1.

Instead, if you know the fraction is positive, pass it to the
std::ceil() function before converting to int.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html

Andrey Tarasevich
Guest
Posts: n/a

 04-01-2008
Markus Dehmann wrote:
> I have two integers i1 and i2, the second of which is guaranteed to be
> between 0 and 99, and I encode them into one double:
>
> double encoded = (double)i1 + (double)i2 / (double)100;
> ...

As others already noted, floating-point numbers are normally represented
in binary internally. For this reason, in order to keep your 'i2'
encoded precisely in the fractional part of the floating-point number,
you should use a power of 2 as a divisor. Since your 'i2' is in 0..99
range, use 128 as a divisor in the encoder (and multiplier in the decoder)

double encoded = i1 + (double) i2 / 128;

This is still a pretty thin ice you'd be walking on, so you might be
better off following the other suggestions. Yet replacing 100 with 128
would fix the very basic error in your implementation of your original
approach.
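
For illustration, here is a sketch of the 128-based encoder together with a matching decoder. The function names are hypothetical, and the code assumes i1 is a non-negative int, so that the integer part plus seven fractional bits fits easily in the mantissa of a typical double:

```cpp
#include <cassert>

// Hypothetical helper: i2 / 128 is a sum of binary fractions, so it is
// stored exactly. Assumes i1 >= 0 and 0 <= i2 <= 99.
double encode128(int i1, int i2) {
    assert(i1 >= 0 && i2 >= 0 && i2 < 100);
    return i1 + static_cast<double>(i2) / 128;
}

// Hypothetical helper: truncation is safe here because
// (n - i1) * 128 is exactly i2, not a value just below it.
void decode128(double n, int& i1, int& i2) {
    i1 = static_cast<int>(n);
    i2 = static_cast<int>((n - i1) * 128);
}
```

Note that the encoded value no longer looks like the decimal original: 324 and 2 become 324.015625, not 324.02.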

--
Best regards,
Andrey Tarasevich

James Kanze
Guest
Posts: n/a

 04-01-2008
On Apr 1, 4:44 am, Jack Klein <(E-Mail Removed)> wrote:
> On Mon, 31 Mar 2008 19:04:01 -0700 (PDT), Markus Dehmann
> <(E-Mail Removed)> wrote in comp.lang.c++:

[Just a few odd comments, since the basic problem has already been addressed.]

> > I have two integers i1 and i2, the second of which is
> > guaranteed to be between 0 and 99, and I encode them into
> > one double:

> > double encoded = (double)i1 + (double)i2 / (double)100;

The obvious question is: why? If you really do receive a value
in this format (i.e. integer part and hundredths as two separate
values), and want to treat it as a single value, fine, but then
I don't understand why you want to go the other direction later.
And I can't think of any other reason why one would want to do
this.

> This is not a very good idea, as you have found. The floating
> point data types in C, and just about every other computer
> language, use a fixed number of bits, and that limits their
> precision.

> In particular, the floating point representation in almost all
> computer systems, and certainly in all common ones programmed
> in C++, use binary fractions.

At least one architecture that is relatively common (IBM
mainframes) used base 16, and there's at least one base 8 out
there still being sold, but that doesn't change anything, since
all of these bases are powers of 2.

So does anyone know of a machine for which there existed a C++
compiler (or even a C compiler) which doesn't use a base which
is a power of 2? I know that machines using base 10 existed in
the past, but the ones I know of were out of production long
before even C came along. Or maybe there is a compiler for IBM
mainframes which uses their decimal arithmetic, rather than
their floating point, for float and double (but I'd be very
surprised).

> That means a value like .125, or .25, or .5 is exactly
> representable in the fractional part of floating point values,
> but fractions that are not 1/(a power of 2) are not. They get
> rounded to the nearest binary fraction.

Just a nit, but that should be fractions that are not n/(a power
of 2), where n is an integer. Something like .75 is no problem
either. (Of course, if the power of 2 is greater than something
like 51, you might get problems with some of those as well.)

> > So, for example, 324 and 2 become 324.02. Now I want to
> > decode them using the function given below but it decodes
> > the example as 324 and 1, instead of 324 and 2.

> Actually, it does not become 324.02, it becomes some value
> slightly greater or smaller than 324.02, because .02 cannot be
> exactly represented in a binary fraction.

> > Can anyone tell me what's wrong and how to do this right?
> > (my code see below)

> Your basic idea is wrong.

Hard to say without really knowing what his basic idea is.
Why does he want to do this? Anyway, two "obvious" solutions
come to mind:

-- pass through a textual representation:

std::ostringstream s1 ;
s1.precision( 2 ) ;
s1.setf( std::ios::fixed, std::ios::floatfield ) ;
s1 << encoded ;
std::istringstream s2( s1.str() ) ;
char dummyForDecimal ;
s2 >> i1 >> dummyForDecimal >> i2 ;

-- use the correct functions from C:

double i1d ;
i2 = nearbyint( 100.0 * modf( encoded, &i1d ) ) ;
i1 = i1d ;

Modf is in C90, and thus in C++ (in <cmath>). Nearbyint is an
addition of C99, and thus will be in the next version of C++.
If it is not yet available, replace the line with
i2 = floor( 100.0 * modf( encoded, &i1d ) + 0.5 ) ;
Although less robust, it should work for positive values
constructed as above.
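
The two suggestions above, written out as complete functions. This is only a sketch: the function names are invented here, the modf version uses the floor( x + 0.5 ) fallback rather than nearbyint, and both assume a non-negative value built as i1 + i2 / 100.0 with i2 in 0..99:

```cpp
#include <cassert>
#include <cmath>
#include <iomanip>
#include <sstream>

// Hypothetical helper: format with exactly two decimals (which rounds to
// the nearest hundredth), then read the two integers back from the text.
void decodeViaText(double encoded, int& i1, int& i2) {
    assert(encoded >= 0.0);
    std::ostringstream out;
    out << std::fixed << std::setprecision(2) << encoded;  // e.g. "324.02"
    std::istringstream in(out.str());
    char dummyForDecimal;
    in >> i1 >> dummyForDecimal >> i2;
}

// Hypothetical helper: split with modf, then round the scaled fraction
// to the nearest integer instead of truncating it.
void decodeViaModf(double encoded, int& i1, int& i2) {
    assert(encoded >= 0.0);
    double whole;
    double frac = std::modf(encoded, &whole);
    i2 = static_cast<int>(std::floor(100.0 * frac + 0.5));
    i1 = static_cast<int>(whole);
}
```

Both recover 324 and 2 from 324.02, because the small representation error is absorbed by rounding to the nearest hundredth instead of being chopped off.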

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jack Klein
Guest
Posts: n/a

 04-02-2008
On Tue, 1 Apr 2008 01:10:24 -0700 (PDT), James Kanze
<(E-Mail Removed)> wrote in comp.lang.c++:

> On Apr 1, 4:44 am, Jack Klein <(E-Mail Removed)> wrote:
> > On Mon, 31 Mar 2008 19:04:01 -0700 (PDT), Markus Dehmann
> > <(E-Mail Removed)> wrote in comp.lang.c++:

>
> [Just a few odd comments, since the basic problem has already been addressed.]
>
> > > I have two integers i1 and i2, the second of which is
> > > guaranteed to be between 0 and 99, and I encode them into
> > > one double:

>
> > > double encoded = (double)i1 + (double)i2 / (double)100;

>
> The obvious question is: why? If you really do receive a value
> in this format (i.e. integer part and hundredths as two separate
> values), and want to treat it as a single value, fine, but then
> I don't understand why you want to go the other direction later.
> And I can't think of any other reason why one would want to do
> this.

I rather think I covered this in my next sentence:

> > This is not a very good idea, as you have found. The floating
> > point data types in C, and just about every other computer
> > language, use a fixed number of bits, and that limits their
> > precision.

>
> > In particular, the floating point representation in almost all
> > computer systems, and certainly in all common ones programmed
> > in C++, use binary fractions.

>
> At least one architecture that is relatively common (IBM
> mainframes) used base 16, and there's at least one base 8 out
> there still being sold, but that doesn't change anything, since
> all of these bases are powers of 2.

I've never actually programmed an IBM mainframe, but in the dim and
distant past (> .25 century), I did use a C compiler for an early
microprocessor (without hardware floating point) that used base 16 as
well.

But base 16 would add complications and not really change the problem.
You can't represent .02 exactly in a base 16 fraction, either.

> So does anyone know of a machine for which there existed a C++
> compiler (or even a C compiler) which doesn't use a base which
> is a power of 2? I know that machines using base 10 existed in
> the past, but the ones I know of were out of production long
> before even C came along. Or maybe there is a compiler for IBM
> mainframes which uses their decimal arithmetic, rather than
> their floating point, for float and double (but I'd be very
> surprised).

Can't help you there, never used (or seen) anything other than base 2
and base 16, and the base 16 was before C++ was even a bright idea in
Bjarne's mind, I think.

> > That means a value like .125, or .25, or .5 is exactly
> > representable in the fractional part of floating point values,
> > but fractions that are not 1/(a power of 2) are not. They get
> > rounded to the nearest binary fraction.

>
> Just a nit, but that should be fractions that are not n/(a power
> of 2), where n is an integer. Something like .75 is no problem
> either. (Of course, if the power of 2 is greater than something
> like 51, you might get problems with some of those as well.)

You're right actually, n/(a power of 2) is better than my wording.

> > > So, for example, 324 and 2 become 324.02. Now I want to
> > > decode them using the function given below but it decodes
> > > the example as 324 and 1, instead of 324 and 2.

>
> > Actually, it does not become 324.02, it becomes some value
> > slightly greater or smaller than 324.02, because .02 cannot be
> > exactly represented in a binary fraction.

>
> > > Can anyone tell me what's wrong and how to do this right?
> > > (my code see below)

>
> > Your basic idea is wrong.

>
> Hard to say without really knowing what his basic idea is.
> Why does he want to do this? Anyway, two "obvious" solutions
> come to mind:

Without knowing his reasoning, I gave him the benefit of the doubt,
and still decided that he was wrong. If he worked for my company, he
wouldn't write code that way a second time after the first code
review.

> -- pass through a textual representation:
>
> std::ostringstream s1 ;
> s1.precision( 2 ) ;
> s1.setf( std::ios::fixed, std::ios::floatfield ) ;
> s1 << encoded ;
> std::istringstream s2( s1.str() ) ;
> char dummyForDecimal ;
> s2 >> i1 >> dummyForDecimal >> i2 ;
>
> -- use the correct functions from C:
>
> double i1d ;
> i2 = nearbyint( 100.0 * modf( encoded, &i1d ) ) ;
> i1 = i1d ;
>
> Modf is in C90, and thus in C++ (in <cmath>). Nearbyint is an
> addition of C99, and thus will be in the next version of C++.
> If it is not yet available, replace the line with
> i2 = floor( 100.0 * modf( encoded, &i1d ) + 0.5 ) ;
> Although less robust, it should work for positive values
> constructed as above.

There are no "non-icky" ways to do this. If it's a space issue of some
type, I will bet there are very few platforms where sizeof(std::div_t)
is greater than sizeof(double).

Putting the two values into a std::div_t would retain all the integer
bits with no loss, and still allow easy conversion to a double if
actually needed for some arcane purpose.
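
A small sketch of what that could look like. The helper names are hypothetical; std::div_t itself is the standard struct from <cstdlib> with int members quot and rem:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: keep both integers side by side, losing no bits.
// Members are assigned by name because their order in std::div_t is
// not specified.
std::div_t makePair(int i1, int i2) {
    assert(i2 >= 0 && i2 < 100);
    std::div_t pair;
    pair.quot = i1;
    pair.rem = i2;
    return pair;
}

// Hypothetical helper: produce the (inexact) double only when a
// calculation actually needs one. Assumes pair.quot >= 0.
double toDouble(const std::div_t& pair) {
    return pair.quot + pair.rem / 100.0;
}
```

The pair itself round-trips trivially, since the two integers are never merged; only toDouble introduces the usual binary rounding.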


James Kanze
Guest
Posts: n/a

 04-02-2008
On Apr 2, 6:47 am, Jack Klein <(E-Mail Removed)> wrote:
> On Tue, 1 Apr 2008 01:10:24 -0700 (PDT), James Kanze
> <(E-Mail Removed)> wrote in comp.lang.c++:
> > On Apr 1, 4:44 am, Jack Klein <(E-Mail Removed)> wrote:
> > > On Mon, 31 Mar 2008 19:04:01 -0700 (PDT), Markus Dehmann
> > > <(E-Mail Removed)> wrote in comp.lang.c++:

> > > > double encoded = (double)i1 + (double)i2 / (double)100;

> > The obvious question is: why? If you really do receive a value
> > in this format (i.e. integer part and hundredths as two separate
> > values), and want to treat it as a single value, fine, but then
> > I don't understand why you want to go the other direction later.
> > And I can't think of any other reason why one would want to do
> > this.

> I rather think I covered this in my next sentence:

Which is:

> > > This is not a very good idea, as you have found. The
> > > floating point data types in C, and just about every other
> > > computer language, use a fixed number of bits, and that
> > > limits their precision.

I can see reasons why one might want to do this on input. Some
external source is providing an integral value, followed by an
integral number of 100ths, and you want to do various
calculations on those values. Since the input is with an
accuracy of at most a 100th, you'll normally only output with
this accuracy as well, and for most trivial computations, you can
pretty much ignore the rounding errors (which will be far
smaller).

I can't see a reason why one would want to go back, however.
(Maybe outputting to the same device?)

[...]
> > > > Can anyone tell me what's wrong and how to do this
> > > > right? (my code see below)

> > > Your basic idea is wrong.

> > Hard to say without really knowing what his basic idea
> > is. Why does he want to do this? Anyway, two "obvious"
> > solutions come to mind:

> Without knowing his reasoning, I gave him the benefit of the doubt,
> and still decided that he was wrong. If he worked for my company, he
> wouldn't write code that way a second time after the first code
> review.

Even if it was what the requirements specification demanded?

> There are no "non-icky" ways to do this. If it's a space issue
> of some type, I will bet there are very few platforms where
> sizeof(std::div_t) is greater than sizeof(double).

I can't really believe that it's a space issue, since a double
generally is the size of two int, and his input is two ints.

> Putting the two values into a std::div_t would retain all the
> integer bits with no loss, and still allow easy conversion to
> a double if actually needed for some arcane purpose.

I wouldn't call computing a new value an "arcane purpose". And
if some external device is providing input in this format, then
you have to deal with it. The question is why the round trip.
Why does he want to go back to the original format?
