# Which operation is more expensive?(a <= b and a != 0)

Ping Cheu

 02-27-2012
From a web-site that teaches optimization for c,

I saw the optimization rule below.

int fact1_func(int n)
{
    int i, fact = 1;
    for (i = 1; i <= n; i++)
        fact *= i;
    return fact;
}

int fact2_func(int n)
{
    int i, fact = 1;
    for (i = n; i != 0; i--)
        fact *= i;
    return fact;
}

The author says 'fact2_func' is faster than 'fact1_func'.

That means the 'i <= n' test is more expensive than 'i != 0'.

Until now, I thought all loop conditions had the same cost.

So I want to know the cost of the different condition tests, ordered
from fastest to slowest.

Can anyone tell me?

James Kuyper

 02-27-2012
On 02/27/2012 04:24 PM, Ping Cheu wrote:
> From a web-site that teaches optimization for c,
>
[snip]
>
> Can anyone tell me?

Yes, someone can tell you. And the answers will be different on
different machines, and for different compilers on the same machine, and
for different optimization levels for the same compiler on the same
machine. The timings will be different depending upon the nature of the
surrounding code, which will affect how effective the compiler's
optimization will be. Finally, and most importantly, if there are two
different but logically equivalent ways of writing your code, the best
modern compilers will, at high optimization levels, rewrite the slower
one so that it is implemented the same way as the faster one. As a
result, if you perform a test, you may find that the two loops execute
equally fast.

In a language like C, which is meant to be widely portable and which
gives you very poor control over the generated machine code, it's a
waste of time to acquire that kind of knowledge. It makes more sense to
worry about such things when you're writing assembly code, where you
have more precise control over what the generated machine code will look
like, and a corresponding responsibility to acquire precisely that kind
of information for the particular machine you're writing the assembly
code for. You don't expect such code to be portable, so you don't have
to worry about the fact that the answers will be different on different
machines.

In C, it makes more sense to worry about higher-level issues - make sure
your algorithm doesn't waste time doing the same thing twice (or worse,
the same thing N times, for N very large). Make sure it only does what
needs to be done, and doesn't waste time and source code doing things
that don't need to be done. Make sure your algorithm is correct, and
most important of all, write your code so that it is clearly a correct
implementation of the algorithm. Let the compiler worry about making it
fast.

If and only if you find your program has an actual speed problem, run a
profiler to find out where it is wasting the most time - it's almost
never the place you thought it was, and it almost never involves issues
like the loop condition you're worrying about here. Keep
in mind that the changes you make to speed it up for the current
platform could make it run slower on some other platform - so you should
always be very reluctant to make such changes.

Keith Thompson

 02-27-2012
Ping Cheu <(E-Mail Removed)> writes:
> From a web-site that teaches optimization for c,
>
> I saw the optimization rule below.
>

[snip]
>
> The author say 'fact2_func' is faster than 'fact1_func'.
>
> That means 'i <= n' operation is more expensive than 'i != 0'.
>
> So far, I thought all condition statements have same cost.
>
> So I want to know the cost of condition states from the fast one to slow
> one.
>
> Can anyone tell me?

No.

The C language standard says absolutely nothing about the relative
performance of different operations.

It may be the case that "i <= n" will be more expensive than "i != 0"
on some particular system. On another system, they might take
the same time, or "i != 0" might even be more expensive.

If I had to guess, I'd say that "i != 0" is likely to be slightly
faster, or at least no slower, since it only has to load one variable
rather than two before doing the comparison.

On the other hand, a decent optimizing compiler might generate equally
good code for both.

But *I don't have to guess*. I can measure it -- which is what you
should do if the performance matters.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Malcolm McLean

 02-27-2012
On Feb 27, 9:24 pm, Ping Cheu <(E-Mail Removed)> wrote:
>
> Can anyone tell me?
>

If we say if(x != 0), then the processor can load x into a register
and do a comparison against zero. If we say if(x < n), then the
processor needs to load x and n into registers, then do a subtraction,
then look at the carry flag. So the first test is simpler. Whether it
is actually faster or not depends on whether the circuitry that would
have done the more complex calculation is otherwise employed, or is
sitting idle. On modern processors, x and n will usually both be in
registers anyway, and an inequality compare completes in exactly as
many clock cycles as a test against zero.
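You can see this for yourself by isolating the two tests (function names
here are made up for illustration) and reading the assembly your own
compiler emits, e.g. with "gcc -O2 -S"; on x86 each typically becomes
one test/cmp instruction plus one branch, though that is typical
behavior, not a guarantee:

```c
/* Minimal functions isolating the two loop tests, so the generated
 * code for each comparison can be inspected in isolation. */
int test_nonzero(int x)
{
    return x != 0;   /* one operand: flags can be set from x alone */
}

int test_le(int x, int n)
{
    return x <= n;   /* two operands: needs a compare of x against n */
}
```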

--
Basic Algorithms - a second book of C
http://www.malcolmmclean.site11.com/www

Shao Miller

 02-27-2012
On 2/27/2012 16:24, Ping Cheu wrote:
> From a web-site that teaches optimization for c,
>

The probability of this happening seems low.

> I saw the optimization rule below.
>
[snip]
>
> Can anyone tell me?

Please don't take a claim out of context and suppose it applies
everywhere.

C doesn't define "the speed" of evaluation of anything that I'm aware
of, so my answer would be "no, 'fact2_func' is not defined to always be
faster than 'fact1_func'."

However, if you consider that 'i <= n' involves _two_ reads and 'i != 0'
involves _one_ read, then unless you can read both values at the same
time (or both are already in registers), the single read could well be
the cheaper of the two.
Enjoy.

Richard Damon

 02-27-2012
On 2/27/12 4:24 PM, Ping Cheu wrote:
> From a web-site that teaches optimization for c,
>
[snip]
>
> Can anyone tell me?

The difference is that the second compares to a constant 0, which MAY be
faster than comparing to a variable n.

Many processors can quickly test the value of a register to see if it is
negative or positive (sign bit test) or zero or non-zero without running
an explicit comparison operation.

Note that the second version will behave differently for n < 0,
likely taking significantly longer than the first and finally
returning a value that is probably 0 (after numerous instances of
undefined behavior from signed overflow), while the first will rapidly
return 1.

If you change the second to use i > 0, both will return rapidly, but
you have possibly lost some of the advantage, as that relationship is
less often a built-in automatic test; it may still be slightly faster
due to the constant 0 instead of the variable n being used.
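A sketch of that i > 0 variant (the name fact2_guarded is mine), which
terminates immediately for negative n, just like fact1_func:

```c
int fact2_guarded(int n)
{
    int i, fact = 1;
    /* i > 0 rather than i != 0: the body never runs for n <= 0,
     * so negative arguments cannot walk i down toward INT_MIN. */
    for (i = n; i > 0; i--)
        fact *= i;
    return fact;   /* 1 for n <= 0, matching fact1_func */
}
```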

Keith Thompson

 02-27-2012
Ping Cheu <(E-Mail Removed)> writes:
> From a web-site that teaches optimization for c,
>
[snip]
>
> The author says 'fact2_func' is faster than 'fact1_func'.

[...]

One other thing: if int is 32 bits, both functions will overflow
for any argument value greater than 12. If int is 64 bits (or if
you modify the functions to return a 64-bit result), it overflows
for arguments greater than 20.

If you're not calling these functions many many times (thousands or
millions), their performance isn't going to matter. And if you are,
the fix is to cache the results so you don't have to recompute the
same values; a table lookup would be a good approach.

Don't waste your time optimizing code whose performance doesn't matter.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>

Barry Schwarz

 02-27-2012
On Feb 27, 3:24 pm, Ping Cheu <(E-Mail Removed)> wrote:
> From a web-site that teaches optimization for c,
>
[snip]
>
> Can anyone tell me?

On at least one system, there are machine instructions that will set
processor flags for an arithmetic operation to indicate whether the
result is positive, negative, or zero. On that machine, the i--
method will not require a separate comparison. The i++ method will
always require a comparison against n.

The same system has instructions that can test a value against zero
faster than a test against an arbitrary constant.

In the general case, the author is merely displaying his lack of
experience and imagination.

Edward A. Falk

 02-27-2012
In article <(E-Mail Removed)>,
Keith Thompson <(E-Mail Removed)> wrote:
>
>On the other hand, a decent optimizing compiler might generate equally
>good code for both.
>
>But *I don't have to guess*. I can measure it -- which is what you
>should do if the performance matters.

Best answer. But bear in mind that the result will depend on the cpu
architecture and compiler used.

I can say that for gcc 4.4.3 on x86, with optimization level 4, the
"i <= n" variant generated one more instruction in the inner loop than the
"i != 0" variant.

Now with caching and other hardware optimizations, it might not make a
difference, but you'd have to run timing tests to be sure.

--
-Ed Falk, (E-Mail Removed)
http://thespamdiaries.blogspot.com/

Kaz Kylheku

 02-28-2012
On 2012-02-27, Ping Cheu <(E-Mail Removed)> wrote:
> From a web-site that teaches optimization for c,
>
> I saw the optimization rule below.
>
> int fact1_func (int n)
> {
> int i, fact = 1;
> for (i = 1; i <= n; i++)
> fact *= i;
> return(fact);
> }

Your function will invoke multiplication overflow (undefined behavior) for
values of n which are not that large.

> int fact2_func(int n)
> {
> int i, fact = 1;
> for (i = n; i != 0; i--)
> fact *= i;
> return(fact);
> }
>
> The author say 'fact2_func' is faster than 'fact1_func'.

Any speed difference is insignificant relative to the constrained
range of int, given that you cannot have a significantly large domain
value of n anyway.

This also means it is practical to do it with a table lookup:

int fact3_func(int n)
{
    /* This table is going to be quite small: */
    static int fact_table[] = { 1, 1, 2, 6, 24, 120, ... };

    if (n >= 0 && n < sizeof fact_table / sizeof *fact_table)
        return fact_table[n];

    /* trouble: what to do? */
    return -1;
}

Before you worry about how the direction of the loop affects machine
cycles, consider the limiting circumstances and the algorithm.