Velocity Reviews > Optimizing pow() function

# Optimizing pow() function

James Kuyper
Guest
Posts: n/a

 04-23-2013
On 04/23/2013 01:27 AM, pozz wrote:
> Il 22/04/2013 20:49, James Kuyper ha scritto:
>> On 04/22/2013 02:34 PM, Edward A. Falk wrote:
>>> In article <(E-Mail Removed)>,
>>> <(E-Mail Removed)> wrote:
>>>> I have a simple (16-bit) embedded platform and I have to make the following calculations.
>>>>
>>>> unsigned int
>>>> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
>>>> return (unsigned int)(pow((double)adc, exp) *
>>>> (double)pwr / pow((double)pt, exp));
>>>> }
>>>>
>>>> I noticed this calculation takes too much time, most probably for pow() function call
>>>> and double conversion.
>>>>
>>>> I'm finding a way to simplify/optimize this function. Consider that adc and pt
>>>> parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be
>>>> 50-10000.
>>>
>>> Are you not worried about overflow?
>>>
>>> Your embedded processor almost certainly does not have a floating-point
>>> unit. This explains why your code is so slow.
>>>
>>> I see all values are guaranteed to be integer values.

>>
>> The variable exp is double, and he didn't say that only integral values
>> of exp are allowed. If that is the case, he can speed up the code a lot
>> by using multiplications, rather than calls to pow().

>
> exp can be in the range 1.00 to 3.00 (it derives from an integer value
> in the range 100-300 divided by 100).

OK, then you can't use that speed-up. Your options for speeding up the
calculation inside func() are limited to pretty much the advice you've

1. You need to handle the case of pt==0 separately.
2. Cut the time nearly in half, by using pwr*pow(pd/(double)adc, exp)
3. Produce a small and unreliable speed up by changing to use
exclusively float rather than double. That means that 'exp' should be
float, too. For best results, make sure that the calculation of 'exp'
outside the function is done in single precision: (float)integer/100.0F
(or equivalently, and faster on some systems: (float)integer*0.01F).
pt, and calculate pwr*expf(exp*precomputed_table[adc][pt]). This takes
up a fair amount of memory; it's only worthwhile if the function is
going to be evaluated frequently. Note that you must handle values of
zero for adc and pt separately.
5. More extreme: pre-compute pow(adc/pt, exp) for all values of adc, pt,
and the integer you divide by 100 to calculate exp. This uses a lot more
memory, but can save a lot of time if func() is called sufficiently
frequently.

If you need more of a speed-up than that, you'll have to consider the
reorganizing your code that calls func(), and not just func() itself.
It's got four arguments: is it frequently called for several different
values of some of the arguments, while the remaining arguments are
constant for many different consecutive calls to func()? Even if that's
not true of the way func() is called now, could you reorder the
calculations so that it becomes true?
If so, you can achieve substantial speed-ups by breaking the evaluation
of the expression into two parts, one part that depends only upon the
values that are frequently the same, and the other part that depends
upon the values that vary frequently. Whether or not this would help
significantly depends upon which arguments vary infrequently, and which
ones vary frequently. If 'exp' is one of the ones that varies
and pt varies infrequently, it could help a great deal.
--
James Kuyper

James Kuyper
Guest
Posts: n/a

 04-23-2013
On 04/23/2013 01:35 AM, pozz wrote:
....
> The exp parameter is a calibration value that is defined during first
> stages. After that, it will remain always the same value.

That's unfortunate, because the exp is the argument that provides the
least opportunities for speeding things up by reason of being constant.

> I can pre-calculate the look-up table for that particular value. I need
> a table of adc/pt combinations length (1023 integers, considering a
> fixed pt value), so 2046 bytes... I don't have this space in RAM

If pt can be considered fixed, and exp can be considered fixed, you
could pre-compute denominator = pow(pt,exp), and then calculate
pwr*pow(adc,exp)/denominator, but that's still a multiplication, a
pow(), and a division, exactly the same operation count as
--
James Kuyper

Eric Sosman
Guest
Posts: n/a

 04-23-2013
On 4/23/2013 1:25 AM, pozz wrote:
> Il 22/04/2013 17:42, Eric Sosman ha scritto:
>> On 4/22/2013 11:23 AM, http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
>>> I have a simple (16-bit) embedded platform and I have to make the
>>> following calculations.
>>>
>>> unsigned int
>>> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
>>> return (unsigned int)(pow((double)adc, exp) *
>>> (double)pwr / pow((double)pt, exp));
>>> }
>>>
>>> I noticed this calculation takes too much time, most probably for
>>> pow() function call and double conversion.
>>>
>>> I'm finding a way to simplify/optimize this function. Consider that
>>> adc and pt parameters are in the range 0..1023, exp is in the range
>>> 1.00-3.00, and pwr can be 50-10000.
>>>
>>> Any suggestion to avoid pow() function call?

>>
>> Simple enough to avoid half of them:
>>
>> return pwr * pow( (double)adc / (double)pt, exp);
>>
>> This is "algebraically equivalent" to the original, although
>> it probably won't give exactly the same result every time.
>>

>
> Sure, I have halfed the time in this way, but I'd like to reduce the
> process time more, if possible.

always an integer or half-integer" or "I only need six bits'
accuracy in the result"), I see no approach that's likely to
make a further dramatic difference. (Giant precomputed tables
seem inappropriate for "a simple embedded platform.")

Sometimes it's more fruitful to take a step back and look
at the overall problem: Not "How can I do *this* faster," but
"If I did things differently, could I replace *this* with
something else?" What problem are you trying to solve?

--
Eric Sosman
(E-Mail Removed)d

James Kuyper
Guest
Posts: n/a

 04-23-2013
On 04/23/2013 04:27 AM, (E-Mail Removed) wrote:
....
> Ok, it's better to explain what I'm trying to do.
>
> I have a 10-bit ADC that acquires a voltage signal and convert it to a decimal number 0..1023.
> This signal measures a power (Watt). The relation between voltage and power is theorically quadratic:
>
> P = k * x ^ 2
>
> where k is a constant that depends on the system and x is the ADC result.
>
> In order to calibrate k, I measure with an instrument the power at a nominal/reference value, for example I have Pr power measured ad Vr voltage that xr ADC points. So I can write:
>
> P = Pr * (x / xr) ^ 2
>
> Usually I make the reference measurement at xr=800pt.
>
> The quadratic law above is simple to implement, but in practice I have to fine-tune this law, because at low power levels the voltage/power relation seems to deviate from the quadratic law (I don't know exactly why, maybe some non-linear behaviour of some components). So I changed the formula with:
>
> P = Pr * (x / xr) ^ a

Without seeing your data, I can't be sure, but I think that it's
unlikely that the correct formula is of that form. If you don't have a
theory that tells you what the shape of a curve should be, and you're
trying to fit it empirically, choose the form of the curve to make your
job easier. A power law with a non-integer power makes things very
difficult.

....
> The best would be to have a simple formula (polynomial function?) that approximate well enough the above equation at run-time, without too big tables.

I agree.
--
James Kuyper

88888 Dihedral
Guest
Posts: n/a

 04-23-2013
glen herrmannsfeldt於 2013年4月23日星期二UTC+8上午5時31分24秒 寫道：
> James Kuyper <(E-Mail Removed)> wrote:
>
> > On 04/22/2013 02:34 PM, Edward A. Falk wrote:

>
> >> <(E-Mail Removed)> wrote:

>
> >>> I have a simple (16-bit) embedded platform and I have to

>
> >>> make the following calculations.

>
>
>
> >>> unsigned int

>
> >>> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp){

>
> >>> return (unsigned int)(pow((double)adc, exp) *

>
> >>> (double)pwr / pow((double)pt, exp));

>
>
>
> >>> I noticed this calculation takes too much time, most probably

>
> >>> for pow() function call and double conversion.

>
>
>
> >>> I'm finding a way to simplify/optimize this function. Consider that adc and pt

>
> >>> parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be

>
> >>> 50-10000.

>
>
>
> >> Are you not worried about overflow?

>
>
>
> >> Your embedded processor almost certainly does not have a floating-point

>
> >> unit. This explains why your code is so slow.

>
>
>
> >> I see all values are guaranteed to be integer values.

>
>
>
> > The variable exp is double, and he didn't say that only integral values

>
> > of exp are allowed. If that is the case, he can speed up the code a lot

>
> > by using multiplications, rather than calls to pow().

>
>
>
> If only a small number of different exp values occur, a look-up
>
> table would be faster than pow. If I want an int result, I try to do
>
> the whole calculation in integer arithmetic, even if it needs some
>
> shifts to get the binary point in the right place.
>
>
>
> Routines for fixed point scaled exp, log, and I believe pow are
>
> in Knuth's "Metafont: The Program."
>
>
>
> With only 10 bit input, I presume you don't need 53 bits of double.
>
> With a little work doing scaling, it can all be done in fixed point
>
> arithmetic. (Except that pow comes in as double, but either generate
>
> the appropriate scaled fixed point value or send it in that way.)
>

Well, the benefits of integer additions and multiplications
for very long integers in distributed systems by
the number theory transforms and residue number systems
were long long time ago.

But these were applied in the high priced military custom-built
systems only.

There are no ready and cheap to use MCUs with more than
16 cores now with the complete tool-chain .

pozz
Guest
Posts: n/a

 04-23-2013
Il 23/04/2013 14:35, Richard Damon ha scritto:
>
> Hearing your problem, my guess would be that the issue isn't that the
> exponent is a little bit off, but that there is an offset to the voltage
> measured. Thus rather than changing to
>
> P = Pr * (x / xr ) ^ a
>
> you might want to look at
>
> P = k * (x + x0) ^ 2

It is nonsense when x=0, because P must be 0 in that case.

pozz
Guest
Posts: n/a

 04-23-2013
Il 22/04/2013 22:09, Malcolm McLean ha scritto:
> On Monday, April 22, 2013 8:09:31 PM UTC+1, Bart wrote:
>> <(E-Mail Removed)> wrote in message
>>
>>
>> Can exp be anything from 1.00 to 3.00? How often will it be 1.0, 2.0, and
>> 3.0? (These can be optimised.) Will it also only be specified to two
>> decimals? That suggests only 201 different values, although making use of a
>> 200K lookup table indexed by adc/pt and 100*exp is probably not that
>> practical, and the results will be approximate.
>>

>
> pow(x, y) = exp(y * log(x)) for positive values.
>
> the expensive part is working out log x. But if you can look it up, you need
> far fewer than 200k entries.

I'm trying some series that approximate log() and exp() and it seems is
simpler to approximate log() and not exp().

Any suggestions?

pozz
Guest
Posts: n/a

 04-23-2013
Il 23/04/2013 15:19, James Kuyper ha scritto:
> On 04/23/2013 04:27 AM, (E-Mail Removed) wrote:
> ...
>> Ok, it's better to explain what I'm trying to do.
>>
>> I have a 10-bit ADC that acquires a voltage signal and convert it to a decimal number 0..1023.
>> This signal measures a power (Watt). The relation between voltage and power is theorically quadratic:
>>
>> P = k * x ^ 2
>>
>> where k is a constant that depends on the system and x is the ADC result.
>>
>> In order to calibrate k, I measure with an instrument the power at a nominal/reference value, for example I have Pr power measured ad Vr voltage that xr ADC points. So I can write:
>>
>> P = Pr * (x / xr) ^ 2
>>
>> Usually I make the reference measurement at xr=800pt.
>>
>> The quadratic law above is simple to implement, but in practice I have to fine-tune this law, because at low power levels the voltage/power relation seems to deviate from the quadratic law (I don't know exactly why, maybe some non-linear behaviour of some components). So I changed the formula with:
>>
>> P = Pr * (x / xr) ^ a

>
> Without seeing your data, I can't be sure, but I think that it's
> unlikely that the correct formula is of that form. If you don't have a
> theory that tells you what the shape of a curve should be, and you're
> trying to fit it empirically, choose the form of the curve to make your
> job easier. A power law with a non-integer power makes things very
> difficult.

Ok, I could try adding a linear term to the original equation:

P = k * x ^ 2 + h * x

But now it's very difficult to calibrate and find k and h constant in
such a way the curve passes through the reference point (xr, Pr) and
another reference low-power point (xs, Ps).

With my exponential equation, the calibration is very simple. I move to
the reference point (xr=800) and measure Pr. The curve passes always
throug the reference point (xr, Pr) even changing the second constant a
(the exponent).
So I can move to a second reference point corresponding to a low power
level and calibrate the constant a, without worrying about the first
reference point.

>
> ...
>> The best would be to have a simple formula (polynomial function?) that approximate well enough the above equation at run-time, without too big tables.

>
> I agree.

I'll try to find two good series to calculate exp() and log().

pozz
Guest
Posts: n/a

 04-23-2013
Il 23/04/2013 15:08, Eric Sosman ha scritto:
> On 4/23/2013 1:25 AM, pozz wrote:
>> Il 22/04/2013 17:42, Eric Sosman ha scritto:
>>> On 4/22/2013 11:23 AM, (E-Mail Removed) wrote:
>>>> I have a simple (16-bit) embedded platform and I have to make the
>>>> following calculations.
>>>>
>>>> unsigned int
>>>> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
>>>> return (unsigned int)(pow((double)adc, exp) *
>>>> (double)pwr / pow((double)pt, exp));
>>>> }
>>>>
>>>> I noticed this calculation takes too much time, most probably for
>>>> pow() function call and double conversion.
>>>>
>>>> I'm finding a way to simplify/optimize this function. Consider that
>>>> adc and pt parameters are in the range 0..1023, exp is in the range
>>>> 1.00-3.00, and pwr can be 50-10000.
>>>>
>>>> Any suggestion to avoid pow() function call?
>>>
>>> Simple enough to avoid half of them:
>>>
>>> return pwr * pow( (double)adc / (double)pt, exp);
>>>
>>> This is "algebraically equivalent" to the original, although
>>> it probably won't give exactly the same result every time.
>>>

>>
>> Sure, I have halfed the time in this way, but I'd like to reduce the
>> process time more, if possible.

>
> always an integer or half-integer" or "I only need six bits'
> accuracy in the result"), I see no approach that's likely to
> make a further dramatic difference. (Giant precomputed tables
> seem inappropriate for "a simple embedded platform.")

about what I'm trying to do.

> Sometimes it's more fruitful to take a step back and look
> at the overall problem: Not "How can I do *this* faster," but
> "If I did things differently, could I replace *this* with
> something else?" What problem are you trying to solve?
>

James Kuyper
Guest
Posts: n/a

 04-23-2013
On 04/23/2013 05:10 PM, pozz wrote:
> Il 23/04/2013 14:35, Richard Damon ha scritto:
> >
>> Hearing your problem, my guess would be that the issue isn't that the
>> exponent is a little bit off, but that there is an offset to the voltage
>> measured. Thus rather than changing to
>>
>> P = Pr * (x / xr ) ^ a
>>
>> you might want to look at
>>
>> P = k * (x + x0) ^ 2

>
> It is nonsense when x=0, because P must be 0 in that case.

You're essentially claiming certainty that there is no voltage offset,
that x0 must be exactly 0. You've already indicated that the data
doesn't exactly match theory. How certain are you (and on what basis?),
that the failure of reality to match theory doesn't involve a voltage
offset?