Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Java > Problems to calculate sin

Reply
Thread Tools

Problems to calculate sin

 
 
Kenneth P. Turvey
Guest
Posts: n/a
 
      03-05-2008
On Wed, 05 Mar 2008 01:09:56 +0000, Roedy Green wrote:

[Snip]
> the Intel FP instruction set has a sine-computing instruction. It
> works inside with polynomial approximations. Any error is Intel's
> doing.

[Snip]

I read somewhere that Java doesn't use this sine-computing instruction
since it doesn't meet the accuracy requirements guaranteed by the class.
This was quoted as the reason that Java performs slower on transcendental
functions than other languages on the platform.

Is this information out of date?



--
Kenneth P. Turvey <(E-Mail Removed)>
 
Reply With Quote
 
 
 
 
Thomas.a.mcglynn@nasa.gov
Guest
Posts: n/a
 
      03-05-2008
On Mar 4, 9:45 pm, Patricia Shanahan <(E-Mail Removed)> wrote:
> (E-Mail Removed) wrote:
>
> ...
>
> > With cubic interpolation, I'd anticipate errors of order the fourth
> > power of the step size, which suggests about 10,000 interpolation
> > intervals would be needed to get errors of order 10^-16 (for the range
> > 0 - pi/4). We need smaller errors for smaller values, but the
> > increasing linearity of sin(x) for small x probably takes care of
> > that. 10,000 is big enough that I'd want strong evidence that the
> > table approach was desirable, but not enough to preclude a table
> > driven approach based only upon the size of the table -- seems like it
> > should fit into typical cache.

>
> How many bytes per interval?
>
> Patricia


For a cubic fit presumably one needs 4 values per interval or 32 bytes
for double coefficients. The expansion isn't around 0 so the even and
odd terms both show. So about 300kB altogether. Too big to want to
use it without a good reason, but if, contrary to our belief, the
computation of the sine were faster using the table, easily
accommodated within typical programs, and a bit smaller than the
typical cache size. Presumably one would need to compute a lot of
sines to accommodate the setup overhead though.

Tom
 
Reply With Quote
 
 
 
 
Patricia Shanahan
Guest
Posts: n/a
 
      03-05-2008
http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
> On Mar 4, 9:45 pm, Patricia Shanahan <(E-Mail Removed)> wrote:
>> (E-Mail Removed) wrote:
>>
>> ...
>>
>>> With cubic interpolation, I'd anticipate errors of order the fourth
>>> power of the step size, which suggests about 10,000 interpolation
>>> intervals would be needed to get errors of order 10^-16 (for the range
>>> 0 - pi/4). We need smaller errors for smaller values, but the
>>> increasing linearity of sin(x) for small x probably takes care of
>>> that. 10,000 is big enough that I'd want strong evidence that the
>>> table approach was desirable, but not enough to preclude a table
>>> driven approach based only upon the size of the table -- seems like it
>>> should fit into typical cache.

>> How many bytes per interval?
>>
>> Patricia

>
> For a cubic fit presumably one needs 4 values per interval or 32 bytes
> for double coefficients. The expansion isn't around 0 so the even and
> odd terms both show. So about 300kB altogether. Too big to want to
> use it without a good reason, but if, contrary to our belief, the
> computation of the sine were faster using the table, easily
> accommodated within typical programs, and a bit smaller than the
> typical cache size. Presumably one would need to compute a lot of
> sines to accommodate the setup overhead though.
>
> Tom


I think at the moment the polynomial approximation approach wins,
because the processor has the greatest freedom to reorder and schedule
arithmetic involving only a few constants and one parameter, with no
memory loads.

As caches get bigger, the economics may shift again.

It's the sort of thing where if I were doing it professionally I might
maintain implementations both ways, and compare them on each processor
generation to see shifts in the relative performance.

Patricia
 
Reply With Quote
 
Thomas.a.mcglynn@nasa.gov
Guest
Posts: n/a
 
      03-05-2008
On Mar 5, 12:41 pm, Patricia Shanahan <(E-Mail Removed)> wrote:
> (E-Mail Removed) wrote:
> > For a cubic fit presumably one needs 4 values per interval or 32 bytes
> > for double coefficients. The expansion isn't around 0 so the even and
> > odd terms both show. So about 300kB altogether. Too big to want to

....
>
> I think at the moment the polynomial approximation approach wins,
> because the processor has the greatest freedom to reorder and schedule
> arithmetic involving only a few constants and one parameter, with no
> memory loads.
>
> As caches get bigger, the economics may shift again.
>
> It's the sort of thing where if I were doing it professionally I might
> maintain implementations both ways, and compare them on each processor
> generation to see shifts in the relative performance.
>
> Patricia


I'm sure your right about the current situation. However I suspect I
overestimated the needed size for a cache somewhat. I was naively
thinking of 10,000 independent cubic fits, but it's more reasonable
and efficient to use the values from several points near where the
interpolation is to be done. In that case we only need a single
coefficient for each point (presumably the function value) so we can
get by with a mere ~100kB of table. I don't think this affects your
conclusion though.

Regards,
Tom
 
Reply With Quote
 
Mark Thornton
Guest
Posts: n/a
 
      03-05-2008
Kenneth P. Turvey wrote:
> On Wed, 05 Mar 2008 01:09:56 +0000, Roedy Green wrote:
>
> [Snip]
>> the Intel FP instruction set has a sine-computing instruction. It
>> works inside with polynomial approximations. Any error is Intel's
>> doing.

> [Snip]
>
> I read somewhere that Java doesn't use this sine-computing instruction
> since it doesn't meet the accuracy requirements guaranteed by the class.
> This was quoted as the reason that Java performs slower on transcendental
> functions than other languages on the platform.


It does use the FSIN function, but only after reducing the argument to
+-PI/4 (I believe). The accuracy problem with the FSIN function is the
way Intel do argument reduction using a 66bit approximation of PI. The
Java specification requires many more bits of PI in some cases.

Mark Thornton
 
Reply With Quote
 
Mark Thornton
Guest
Posts: n/a
 
      03-05-2008
(E-Mail Removed) wrote:
> On Mar 5, 12:41 pm, Patricia Shanahan <(E-Mail Removed)> wrote:
>> (E-Mail Removed) wrote:
>>> For a cubic fit presumably one needs 4 values per interval or 32 bytes
>>> for double coefficients. The expansion isn't around 0 so the even and
>>> odd terms both show. So about 300kB altogether. Too big to want to

> ...
>> I think at the moment the polynomial approximation approach wins,
>> because the processor has the greatest freedom to reorder and schedule
>> arithmetic involving only a few constants and one parameter, with no
>> memory loads.
>>
>> As caches get bigger, the economics may shift again.
>>
>> It's the sort of thing where if I were doing it professionally I might
>> maintain implementations both ways, and compare them on each processor
>> generation to see shifts in the relative performance.
>>
>> Patricia

>
> I'm sure your right about the current situation. However I suspect I
> overestimated the needed size for a cache somewhat. I was naively
> thinking of 10,000 independent cubic fits, but it's more reasonable
> and efficient to use the values from several points near where the
> interpolation is to be done. In that case we only need a single
> coefficient for each point (presumably the function value) so we can
> get by with a mere ~100kB of table. I don't think this affects your
> conclusion though.
>
> Regards,
> Tom


It may be worth noting that the Level one data caches are often much
smaller than this (32KB/core on my machine).

Mark Thornton
 
Reply With Quote
 
Thomas.a.mcglynn@nasa.gov
Guest
Posts: n/a
 
      03-05-2008
On Mar 4, 8:09 pm, Roedy Green <(E-Mail Removed)>
wrote:
>
> I suppose in some future chip it will have special in-parallel checks
> for 45 degrees, 90 degrees, 0 degrees, 30 degrees to get as perfect as
> possible results.


Hmmm.... Do you mean that you would like something like

Math.sin(Math.PI)

to return 0? That seems very dangerous.

I could see having special checking on new degree based functions
though: E.g., [untested]

double sind(double x) {
double specialValue[] {
0,0.5,Math.sqrt(3)/2, 1, ...};

x = x % 360;
if (x < 0) {
x += 360;
}

if ( (x mod 30) == 0} {
return specialValue[x/30];
} else {
return Math.sin(x*Math.PI/180);
}
}

Regards,
Tom McGlynn
 
Reply With Quote
 
Roedy Green
Guest
Posts: n/a
 
      03-05-2008
On 05 Mar 2008 09:41:15 GMT, "Kenneth P. Turvey"
<(E-Mail Removed)> wrote, quoted or indirectly quoted
someone who said :

>I read somewhere that Java doesn't use this sine-computing instruction
>since it doesn't meet the accuracy requirements guaranteed by the class.
>This was quoted as the reason that Java performs slower on transcendental
>functions than other languages on the platform.
>
>Is this information out of date?


I don't know. The code would in a native class. The chip will not have
changed, so likely that information would be stable.
--

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
 
Reply With Quote
 
Andreas Leitgeb
Guest
Posts: n/a
 
      03-06-2008
Roedy Green <(E-Mail Removed)> wrote:
> On 05 Mar 2008 09:41:15 GMT, "Kenneth P. Turvey"
>>I read somewhere that Java doesn't use this sine-computing instruction
>>since it doesn't meet the accuracy requirements guaranteed by the class.
>>This was quoted as the reason that Java performs slower on transcendental
>>functions than other languages on the platform.
>>Is this information out of date?


> I don't know. The code would in a native class. The chip will not have
> changed, so likely that information would be stable.


My guess was, that if one needed math more exact than native
processor's arithmetics, he would use "strictfp", so while I
don't know for sure, I wouldn't bet on abovementioned information.

 
Reply With Quote
 
Patricia Shanahan
Guest
Posts: n/a
 
      03-06-2008
Andreas Leitgeb wrote:
> Roedy Green <(E-Mail Removed)> wrote:
>> On 05 Mar 2008 09:41:15 GMT, "Kenneth P. Turvey"
>>> I read somewhere that Java doesn't use this sine-computing instruction
>>> since it doesn't meet the accuracy requirements guaranteed by the class.
>>> This was quoted as the reason that Java performs slower on transcendental
>>> functions than other languages on the platform.
>>> Is this information out of date?

>
>> I don't know. The code would in a native class. The chip will not have
>> changed, so likely that information would be stable.

>
> My guess was, that if one needed math more exact than native
> processor's arithmetics, he would use "strictfp", so while I
> don't know for sure, I wouldn't bet on abovementioned information.
>


The minimum precision requirements in the Math.sin API documentation are
not dependent on strictfp.

Patricia
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
very simple question on Cos and Sin aj VHDL 10 12-31-2005 03:22 AM
any python module to calculate sin, cos, arctan? Shi Mu Python 8 11-08-2005 03:30 PM
Re: any python module to calculate sin, cos, arctan? Simon Brunning Python 0 11-08-2005 10:58 AM
why do I need ::sin, not std::sin? Alexander Stippler C++ 14 06-04-2004 11:09 PM
x.sin() versus sin(x) franky.backeljauw@ua.ac.be C++ 7 09-09-2003 11:24 AM



Advertisments