Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > Perl numerics: EPSILON and equality between floats

Reply
Thread Tools

Perl numerics: EPSILON and equality between floats

 
 
bill
Guest
Posts: n/a
 
      08-11-2005




I just recently came across this snippet:

my $EPSILON = 1;
$EPSILON /= 2 while 0.5 + $EPSILON/2 > 0.5;

where $EPSILON was later used, e.g., to compare floats for equality:

sub float_equal {
my ($x, $y) = @_;
return !($x||$y) || ($x+$y) && abs(($x-$y)/($x+$y)) < $EPSILON;
}

(The above is the gist of what I recall, not verbatim production
code, but I think I got it right.)

I'm not very knowledgeable about numerics, so I was wondering
whether this was the best way to compare floats for equality.

I thought that this would be a FAQ, but I couldn't find it.

The first edition of The Perl Cookbook recommends converting both
numbers to the same string format, and then doing a string comparison:

sub equal {
my ($A, $B, $dp) = @_;
return sprintf("%.${dp}g", $A) eq sprintf("%.${dp}g", $B);
}

The main reason I don't like this approach is that, in my experience,
some numerical objects, such as Math::BigFloat and Math:ari,
don't work well with sprintf.

Also, I recall that in C there are pre-defined constants (e.g.
DBL_EPSILON in limits.h) that can be used for this kind of tests.
Is there something similar in Perl?

Many thanks in advance!

bill

P.S. If someone in Perl publishing reads this: a book on Perl
numerics is sorely needed (fundamentals, algorithms, recipes, best
practices) is sorely needed.

 
Reply With Quote
 
 
 
 
Paul Lalli
Guest
Posts: n/a
 
      08-11-2005
bill wrote:
> I'm not very knowledgeable about numerics, so I was wondering
> whether this was the best way to compare floats for equality.


Floats should not be compared for "equality". In any language.

> I thought that this would be a FAQ, but I couldn't find it.


perldoc -q 999

Paul Lalli

 
Reply With Quote
 
 
 
 
xhoster@gmail.com
Guest
Posts: n/a
 
      08-11-2005
bill <(E-Mail Removed)> wrote:
> I just recently came across this snippet:
>
> my $EPSILON = 1;
> $EPSILON /= 2 while 0.5 + $EPSILON/2 > 0.5;
>
> where $EPSILON was later used, e.g., to compare floats for equality:
>
> sub float_equal {
> my ($x, $y) = @_;
> return !($x||$y) || ($x+$y) && abs(($x-$y)/($x+$y)) < $EPSILON;
> }
>
> (The above is the gist of what I recall, not verbatim production
> code, but I think I got it right.)
>
> I'm not very knowledgeable about numerics, so I was wondering
> whether this was the best way to compare floats for equality.


The best way to compare floats for equality is highly dependent on
why you want to compare floats for equality.

There are no magic bullets.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
 
Reply With Quote
 
bill
Guest
Posts: n/a
 
      08-11-2005
In <20050811160034.410$(E-Mail Removed)> http://www.velocityreviews.com/forums/(E-Mail Removed) writes:

>bill <(E-Mail Removed)> wrote:
>> I just recently came across this snippet:
>>
>> my $EPSILON = 1;
>> $EPSILON /= 2 while 0.5 + $EPSILON/2 > 0.5;
>>
>> where $EPSILON was later used, e.g., to compare floats for equality:
>>
>> sub float_equal {
>> my ($x, $y) = @_;
>> return !($x||$y) || ($x+$y) && abs(($x-$y)/($x+$y)) < $EPSILON;
>> }
>>
>> (The above is the gist of what I recall, not verbatim production
>> code, but I think I got it right.)
>>
>> I'm not very knowledgeable about numerics, so I was wondering
>> whether this was the best way to compare floats for equality.


>The best way to compare floats for equality is highly dependent on
>why you want to compare floats for equality.


That's interesting. I can't think of any examples in which method
X would be better than method Y for some purposes but worse for
others. Can you give an example?

In recent times, I've needed to compare floats in unit tests
(expected vs. obtained). I typically use an approach similar to
the one I described in my OP (i.e. declare two floats as "equal"
if the difference between them is below some tolerance), except
that I just pull the tolerance out of thin air (I usually go for
2**-20, instead of having a halfway defensible procedure for
determining it.

bill


 
Reply With Quote
 
Anno Siegel
Guest
Posts: n/a
 
      08-12-2005
bill <(E-Mail Removed)> wrote in comp.lang.perl.misc:
>
>
>
>
> I just recently came across this snippet:
>
> my $EPSILON = 1;
> $EPSILON /= 2 while 0.5 + $EPSILON/2 > 0.5;
>
> where $EPSILON was later used, e.g., to compare floats for equality:
>
> sub float_equal {
> my ($x, $y) = @_;
> return !($x||$y) || ($x+$y) && abs(($x-$y)/($x+$y)) < $EPSILON;
> }
>
> (The above is the gist of what I recall, not verbatim production
> code, but I think I got it right.)
>
> I'm not very knowledgeable about numerics, so I was wondering
> whether this was the best way to compare floats for equality.
>
> I thought that this would be a FAQ, but I couldn't find it.
>
> The first edition of The Perl Cookbook recommends converting both
> numbers to the same string format, and then doing a string comparison:
>
> sub equal {
> my ($A, $B, $dp) = @_;
> return sprintf("%.${dp}g", $A) eq sprintf("%.${dp}g", $B);
> }


For an ad-hoc approach I simply use "eq" on the float-point arguments
(without sprintf). "If it prints the same, it is the same."

> The main reason I don't like this approach is that, in my experience,
> some numerical objects, such as Math::BigFloat and Math:ari,
> don't work well with sprintf.


For comparison of non-native number formats you'll have to resort to
special methods. Unless the package comes with its own sprintf(),
sprintf-based methods won't work. The epsilon-finding procedure
would probably loop forever.

> Also, I recall that in C there are pre-defined constants (e.g.
> DBL_EPSILON in limits.h) that can be used for this kind of tests.
> Is there something similar in Perl?


Those constants are in POSIX.

> P.S. If someone in Perl publishing reads this: a book on Perl
> numerics is sorely needed (fundamentals, algorithms, recipes, best
> practices) is sorely needed.


Perl doesn't have its own numerics, it uses the native operations
of the C compiler. A book about Perl numerics would have to be one
about computer numerics in general, which abound.

Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.
 
Reply With Quote
 
bill
Guest
Posts: n/a
 
      08-12-2005
In <ddhtkk$iuq$(E-Mail Removed)-Berlin.DE> http://www.velocityreviews.com/forums/(E-Mail Removed)-berlin.de (Anno Siegel) writes:

>For an ad-hoc approach I simply use "eq" on the float-point arguments
>(without sprintf). "If it prints the same, it is the same."


Another objection I have against string-based methods (other than
the fact that it doesn't work well with certain numeric classes)
is the loss of precision incurred by translating to base 10. If
there were a way to do a "binary eq", where the string representation
was in base 2 instead of base 10, your approach would be fine with
me.

>> Also, I recall that in C there are pre-defined constants (e.g.
>> DBL_EPSILON in limits.h) that can be used for this kind of tests.
>> Is there something similar in Perl?


>Those constants are in POSIX.


Thanks!

>If you want to post a followup via groups.google.com, don't use
>the broken "Reply" link at the bottom of the article. Click on
>"show options" at the top of the article, then click on the
>"Reply" at the bottom of the article headers.


Is it my imagination or did Google Groups go to the dogs after they
"overhauled it" about a year ago? It's got to be the buggiest of
all Google services.

bill

 
Reply With Quote
 
Anno Siegel
Guest
Posts: n/a
 
      08-12-2005
bill <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> In <ddhtkk$iuq$(E-Mail Removed)-Berlin.DE>
> (E-Mail Removed)-berlin.de (Anno Siegel) writes:
>
> >For an ad-hoc approach I simply use "eq" on the float-point arguments
> >(without sprintf). "If it prints the same, it is the same."

>
> Another objection I have against string-based methods (other than
> the fact that it doesn't work well with certain numeric classes)
> is the loss of precision incurred by translating to base 10. If
> there were a way to do a "binary eq", where the string representation
> was in base 2 instead of base 10, your approach would be fine with
> me.


But you use rounding, or epsilon fuzz, to get rid of spurious precision.
If you round off only a single decimal digit, you're giving up more
precision than the conversion to decimal costs in the first place.

Something like that "binary eq" could probably be written (in Perl
or otherwise), but what would it gain? In effect, it gives you a
choice of epsilons that are powers of two, instead of powers of ten
like rounding. The practical difference is small.

Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.
 
Reply With Quote
 
bill
Guest
Posts: n/a
 
      08-12-2005
In <ddi69e$o30$(E-Mail Removed)-Berlin.DE> (E-Mail Removed)-berlin.de (Anno Siegel) writes:

>bill <(E-Mail Removed)> wrote in comp.lang.perl.misc:
>> In <ddhtkk$iuq$(E-Mail Removed)-Berlin.DE>
>> (E-Mail Removed)-berlin.de (Anno Siegel) writes:
>>
>> >For an ad-hoc approach I simply use "eq" on the float-point arguments
>> >(without sprintf). "If it prints the same, it is the same."

>>
>> Another objection I have against string-based methods (other than
>> the fact that it doesn't work well with certain numeric classes)
>> is the loss of precision incurred by translating to base 10. If
>> there were a way to do a "binary eq", where the string representation
>> was in base 2 instead of base 10, your approach would be fine with
>> me.


>But you use rounding, or epsilon fuzz, to get rid of spurious precision.
>If you round off only a single decimal digit, you're giving up more
>precision than the conversion to decimal costs in the first place.


>Something like that "binary eq" could probably be written (in Perl
>or otherwise), but what would it gain? In effect, it gives you a
>choice of epsilons that are powers of two, instead of powers of ten
>like rounding. The practical difference is small.


The effect is to leave it up to the user to decide, instead of
forcing him/her to rationalize a suboptimal situation after the
fact. The phrase "practical difference" makes a huge blanket
assumption about the task at hand. What's "good enough"? 10%?
1%? 0.0001%? The answer obviously depends on the application, and
my philosophy is to give as much power to the user as possible.
In this case, the best that can be done is to allow a binary
specification of precision, and let the user avail him/herself of
this as desired.

I suppose "binary eq" could be written using pack/unpack, but even
though I've been programming Perl for 8+ years and consider myself
pretty proficient at it, I remain a pack/unpack illiterate. For
some reason, even after I study pack/unpack intensively, its
operations seem mysterious to me, and therefore my minds retains
whatever I learn about it for roughly one nanosecond. Maybe I
should just read the perl source to finally understand it.

bill

 
Reply With Quote
 
Anno Siegel
Guest
Posts: n/a
 
      08-12-2005
bill <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> In <ddi69e$o30$(E-Mail Removed)-Berlin.DE>
> (E-Mail Removed)-berlin.de (Anno Siegel) writes:
>
> >bill <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> >> In <ddhtkk$iuq$(E-Mail Removed)-Berlin.DE>
> >> (E-Mail Removed)-berlin.de (Anno Siegel) writes:


[snip comparison of floats]

> >Something like that "binary eq" could probably be written (in Perl
> >or otherwise), but what would it gain? In effect, it gives you a
> >choice of epsilons that are powers of two, instead of powers of ten
> >like rounding. The practical difference is small.

>
> The effect is to leave it up to the user to decide, instead of
> forcing him/her to rationalize a suboptimal situation after the
> fact. The phrase "practical difference" makes a huge blanket
> assumption about the task at hand. What's "good enough"? 10%?
> 1%? 0.0001%?


Hmmm? Even the rounding approach gives you the choice of exactly
that? I don't get the big diff. The right approach to float
comparison is an epsilon approach, where (in principle) epsilon
must be specified for each comparison. The same goes for rounding,
whether done in decimal or binary. In principle, the number of
places to round must be specified on each comparison. There's your
chance to decide what's good enough. Nothing is being forced on you.

The rounding approach is basically equivalent to epsilon values
that are powers of ten (or two), just more convenient than a
specific comparison procedure. You get fewer choices of epsilon,
but you don't need that many.

> The answer obviously depends on the application, and
> my philosophy is to give as much power to the user as possible.
> In this case, the best that can be done is to allow a binary
> specification of precision, and let the user avail him/herself of
> this as desired.


Nah... that would depend on the odds and ends of the local representation
of floating point numbers. That's not a good thing for a program to
depend on. Specify the "range of equality" numerically, explicitly
or implicitly through rounding.

> I suppose "binary eq" could be written using pack/unpack, but even
> though I've been programming Perl for 8+ years and consider myself
> pretty proficient at it, I remain a pack/unpack illiterate. For
> some reason, even after I study pack/unpack intensively, its
> operations seem mysterious to me, and therefore my minds retains
> whatever I learn about it for roughly one nanosecond. Maybe I
> should just read the perl source to finally understand it.


That's not the way to understand pack/unpack. To learn about them,
play with them. Few people learn all the pack/unpack specifications
by heart, that's something to look up in each case. But it's useful
to acquaint oneself with the basic possibilities, so write a few test
cases and see what you get. It isn't used often, but for some things
it is just the ticket. One of them is reading of fixed-column data,
another (somewhat surprisingly) is counting the one-bits in a bit
string. Pack/unpack might be used in a "binary compare", but not
before we know what exactly it does.

Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.
 
Reply With Quote
 
xhoster@gmail.com
Guest
Posts: n/a
 
      08-13-2005
bill <(E-Mail Removed)> wrote:
> In <20050811160034.410$(E-Mail Removed)> (E-Mail Removed) writes:
>
> >bill <(E-Mail Removed)> wrote:
> >> I just recently came across this snippet:
> >>
> >> my $EPSILON = 1;
> >> $EPSILON /= 2 while 0.5 + $EPSILON/2 > 0.5;
> >>
> >> where $EPSILON was later used, e.g., to compare floats for equality:
> >>
> >> sub float_equal {
> >> my ($x, $y) = @_;
> >> return !($x||$y) || ($x+$y) && abs(($x-$y)/($x+$y)) < $EPSILON;
> >> }
> >>
> >> (The above is the gist of what I recall, not verbatim production
> >> code, but I think I got it right.)
> >>
> >> I'm not very knowledgeable about numerics, so I was wondering
> >> whether this was the best way to compare floats for equality.

>
> >The best way to compare floats for equality is highly dependent on
> >why you want to compare floats for equality.

>
> That's interesting. I can't think of any examples in which method
> X would be better than method Y for some purposes but worse for
> others.


You mean in general, or just in this context?

> Can you give an example?


If you are running an optimization, or an interation-until-convergence,
where the score function loses all physical significance after 3 digits but
you converge out to 28 digits, merely because your machine supports that
precision, you are likely to waste hours or decades of computer time for no
purpose.

>
> In recent times, I've needed to compare floats in unit tests
> (expected vs. obtained). I typically use an approach similar to
> the one I described in my OP (i.e. declare two floats as "equal"
> if the difference between them is below some tolerance), except
> that I just pull the tolerance out of thin air (I usually go for
> 2**-20, instead of having a halfway defensible procedure for
> determining it.


The defensible tolerance level is the one that makes sense given the
physical interpretation of the floats. Otherwise your algorithm
which gives a perfectly good answer will be deemed wrong simply because
your machine precision is high enough to detect a meaningless difference.
Or an algorithm that gives a fatally wrong answer (on a particular machine)
will be deemed right merely because your machine precision is too low to
realize it is wrong.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Problem in generating uniform distribution between 0 - Epsilon tahir.rauf C Programming 0 04-24-2011 11:31 AM
syntax Float equality within epsilon Une Bévue Ruby 3 11-23-2007 10:25 AM
Precision for equality of two floats? Anton81 Python 7 11-30-2005 03:18 AM
HEXUS.beans :: FSP Group Epsilon FSP700-80GL Spy Shots Silverstrand Front Page News 0 10-24-2005 01:43 PM
Floats to chars and chars to floats Kosio C Programming 44 09-23-2005 09:49 AM



Advertisments