Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Faster for() loops?

Reply
Thread Tools

Faster for() loops?

 
 
Flash Gordon
Guest
Posts: n/a
 
      09-26-2005
Joe Butler wrote:
>>That is not an optimization, but a total waste of time. Read the first
>>example in "Elements of programming style" and learn...

>
> What if the difference is between fitting into memory and not?


If you are hunting for that small an amount to get the program to fit
then you are in trouble anyway. Something will need changing making it
no longer fit!
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
 
Reply With Quote
 
 
 
 
Christian Bau
Guest
Posts: n/a
 
      09-26-2005
In article <dh8bgd$g5o$(E-Mail Removed)>,
Ian Bell <(E-Mail Removed)> wrote:

> Many micros have a decrement jmp if zero (or non zero) machine instruction
> so a decent optimising compiler should know this and use it in count down
> to zero loops. Counting up often needs a compare followed by a jmp zero (or
> non zero) which will be a tad slower.


The Pentium processors have a loop instruction. Every decent compiler
knows it and avoids it like hell because it runs slower than a subtract
+ compare + conditional branch
 
Reply With Quote
 
 
 
 
Christian Bau
Guest
Posts: n/a
 
      09-26-2005
In article <(E-Mail Removed)>,
"Peter Bushell" <(E-Mail Removed)> wrote:

> These days, most compilers can
> optimise almost as well as you can, for most "normal" operations.


Question: How can I optimise code better than the compiler?
Answer: If you ask, then you can't.
 
Reply With Quote
 
Joe Butler
Guest
Posts: n/a
 
      09-26-2005
I think I disagree.

If you can fit something into a cheaper processor model because you save a
couple of bytes by changing 1 or two loops, then you are not in trouble
anymore.


"Flash Gordon" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)-gordon.me.uk...
> Joe Butler wrote:
> >>That is not an optimization, but a total waste of time. Read the first
> >>example in "Elements of programming style" and learn...

> >
> > What if the difference is between fitting into memory and not?

>
> If you are hunting for that small an amount to get the program to fit
> then you are in trouble anyway. Something will need changing making it
> no longer fit!
> --
> Flash Gordon
> Living in interesting times.
> Although my email address says spam, it is real and I read it.



 
Reply With Quote
 
Flash Gordon
Guest
Posts: n/a
 
      09-26-2005
Joe Butler wrote:

Don't top post. Replies belong after the text you are replying to.

> "Flash Gordon" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)-gordon.me.uk...
>
>>Joe Butler wrote:
>>
>>>>That is not an optimization, but a total waste of time. Read the first
>>>>example in "Elements of programming style" and learn...
>>>
>>>What if the difference is between fitting into memory and not?

>>
>>If you are hunting for that small an amount to get the program to fit
>>then you are in trouble anyway. Something will need changing making it
>>no longer fit!
>>--
>>Flash Gordon
>>Living in interesting times.
>>Although my email address says spam, it is real and I read it.


Don't include peoples signatures unless you are commenting on them.

> I think I disagree.
>
> If you can fit something into a cheaper processor model because you
> save a couple of bytes by changing 1 or two loops, then you are not in
> trouble anymore.


I'll be more explicit then. EVERY SINGLE TIME I have come across a
system where people have tried to squeeze the code in believing it will
just about fit (either size or speed) one of the following has happened:

1) Customer required a subsequent change which proved to be impossible
unless the card was redesigned because there was no space for the new
code.
2) A bug fix requires some additional code and oops, there is no more
space.
3) By the time all the required stuff was added that the person who
thought it would only just fit had forgotten it did NOT fit by a mile
so it did not even come close to meeting the customers requirements
4) It turned out there were massive savings to be had else where because
of higher level problems allowing me to save far more space/time than
you could possibly save by such micro-optimisations.

Only with the third of those possibilities was it possible to meet the
requirements using the existing hardware, and meeting the requirements
involved fixing the algorithms or doing large scale changes where the
coding was just plain atrocious.

So my experience is that it is never worth bothering with such
micro-optimisations.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
 
Reply With Quote
 
Joe Butler
Guest
Posts: n/a
 
      09-27-2005
OK, point taken. Although, when working with very small memories, it can
make all the difference if a byte can be saved here and there. Afterall, 50
such 'optimisations' could amount to 10% of the total memory available. I'm
not necessarily suggesting this should be done from day 1, but have found it
useful just to get a feel for what the compiler works best with.

"Flash Gordon" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)-gordon.me.uk...
> Joe Butler wrote:
>
> Don't top post. Replies belong after the text you are replying to.
>
> > "Flash Gordon" <(E-Mail Removed)> wrote in message
> > news:(E-Mail Removed)-gordon.me.uk...
> >
> >>Joe Butler wrote:
> >>
> >>>>That is not an optimization, but a total waste of time. Read the first
> >>>>example in "Elements of programming style" and learn...
> >>>
> >>>What if the difference is between fitting into memory and not?
> >>
> >>If you are hunting for that small an amount to get the program to fit
> >>then you are in trouble anyway. Something will need changing making it
> >>no longer fit!
> >>--
> >>Flash Gordon
> >>Living in interesting times.
> >>Although my email address says spam, it is real and I read it.

>
> Don't include peoples signatures unless you are commenting on them.
>
> > I think I disagree.
> >
> > If you can fit something into a cheaper processor model because you
> > save a couple of bytes by changing 1 or two loops, then you are not in
> > trouble anymore.

>
> I'll be more explicit then. EVERY SINGLE TIME I have come across a
> system where people have tried to squeeze the code in believing it will
> just about fit (either size or speed) one of the following has happened:
>
> 1) Customer required a subsequent change which proved to be impossible
> unless the card was redesigned because there was no space for the new
> code.
> 2) A bug fix requires some additional code and oops, there is no more
> space.
> 3) By the time all the required stuff was added that the person who
> thought it would only just fit had forgotten it did NOT fit by a mile
> so it did not even come close to meeting the customers requirements
> 4) It turned out there were massive savings to be had else where because
> of higher level problems allowing me to save far more space/time than
> you could possibly save by such micro-optimisations.
>
> Only with the third of those possibilities was it possible to meet the
> requirements using the existing hardware, and meeting the requirements
> involved fixing the algorithms or doing large scale changes where the
> coding was just plain atrocious.
>
> So my experience is that it is never worth bothering with such
> micro-optimisations.
> --
> Flash Gordon
> Living in interesting times.
> Although my email address says spam, it is real and I read it.



 
Reply With Quote
 
Mark VandeWettering
Guest
Posts: n/a
 
      09-27-2005
["Followup-To:" header set to comp.arch.embedded.]
On 2005-09-26, Neo <(E-Mail Removed)> wrote:
> Hi Folks,http://www.abarnett.demon.co.uk/tutorial.html#FASTFOR Page
> states:for( i=0; i<10; i++){ ... }i loops through the values
> 0,1,2,3,4,5,6,7,8,9 If you don't care about the order of the loop counter,
> you can do this instead: for( i=10; i--; ) { ... }Using this code, i loops
> through the values 9,8,7,6,5,4,3,2,1,0, and the loop should be faster. This
> works because it is quicker to process "i--" as the test condition, which
> says "is i non-zero? If so, decrement it and continue.". For the original
> code, the processor has to calculate "subtract i from 10. Is the result
> non-zero? if so, increment i and continue.". In tight loops, this make a
> considerable difference.
> How far it holds true.. in the light of modern optimizing compilers? and
> will it make a significant difference in case of embedded systems???


You could just test it.

I think it's a mistake to obfuscate your loops just for the sake of
(what is probably) executing one more instruction which in all likelihood
isn't on the critical path of your application _anyway_. If, as you say,
you don't use the loop index, you could indeed do without the one
extra compare instruction, but you'd probably benefit from loop unrolling
too.

Premature optimization is a hindrance to software development.

Mark

> Thanks,
> -Neo
> "Do U really think, what U think real is really real?"

 
Reply With Quote
 
David Brown
Guest
Posts: n/a
 
      09-27-2005
Mark McIntyre wrote:
> On Mon, 26 Sep 2005 12:11:23 +0530, in comp.lang.c , "Neo"
> <(E-Mail Removed)> wrote:
>
>
>>Hi Folks,http://www.abarnett.demon.co.uk/tutorial.html#FASTFOR Page
>>states

>
>
> (that reversing loop order is faster)
>
> The page is talking rot. It *may* be faster. It *may* be slower. The
> only way to know is to benchmark your particular implementation in the
> specific case you're examining.
>


Actually, the page is talking rubbish about a great deal more than just
this case. It's full of generalisations that depend highly on the
compiler and target in question (the post is cross-posted to
comp.arch.embedded, so we are looking at a wide range of targets). "Use
switch instead of if...else..." (varies widely according to
target/compiler and the size of the switch), "Avoid ++, -- in while ()
expressions" (good compilers work well with such expressions), "Use
word-size variables instead of chars" (great for PPC, indifferent for
msp430, terrible for AVR), "Addition is faster than multiplication - use
'val + val + val' instead of 'val * 3' " (wrong for most compiler/target
combinations).

It's a nice idea to try to list such tips, but the page is badly out of
date, and makes all sorts of unwarranted assumptions.

So, as Mark says, benchmark your implementation. Also examine the
generated assembly code (you do understand the generated assembly? If
not, forget about such minor "optimisations".) And remember Knuth's
rules regarding such code-level optimisations:

1. Don't do it.
2. (For experts only) Don't do it yet.
 
Reply With Quote
 
Anonymous 7843
Guest
Posts: n/a
 
      09-27-2005
In article <(E-Mail Removed)>,
Neo <(E-Mail Removed)> wrote:
> for( i=10; i--; )
> { ... }


I tend to avoid this kind of loop because it's a bit less
intuitive to use with unsigned loop counters. After the
loop is done, an unsigned i would be set to some very high
implementation-defined number.

There is not much to be gained on loops that only count to
10... that extra instruction 10 times through the loop
would only add an extra 10 nanoseconds. This is likely to
pale in significance to any useful work done in the body of
the loop.

Loops that range over memory should never count backwards,
at least not when speed is important. For better or worse,
operating systems and memory caches only prefetch when
reading ascending addresses.
 
Reply With Quote
 
Dave Hansen
Guest
Posts: n/a
 
      09-27-2005
On Tue, 27 Sep 2005 18:21:59 GMT, http://www.velocityreviews.com/forums/(E-Mail Removed) (Anonymous
7843) wrote:

>In article <(E-Mail Removed)>,
>Neo <(E-Mail Removed)> wrote:
>> for( i=10; i--; )
>> { ... }

>
>I tend to avoid this kind of loop because it's a bit less
>intuitive to use with unsigned loop counters. After the
>loop is done, an unsigned i would be set to some very high
>implementation-defined number.


FWIW, my bit-bang SPI output function looks something like

bit_ctr = 8;
do
{
Set_IO(SPI_DATA, (data&0x80) != 0);
Set_IO(SPI_CLOCK, 1);
data <<= 1;
Set_IO(SPI_CLOCK, 0);

} while (--bit_ctr);

which seems intuitive for the function at hand, and generates nearly
optimal assembly on all the platforms it's used on.

Regards,

-=Dave
--
Change is inevitable, progress is not.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Which is faster in ASIC: 2-input AND gate or a 2-input multiplexer Weng Tianxiang VHDL 12 08-11-2005 10:50 AM
NEW FIREFOX 1.0.6...It seems faster and use less memory!! Ron Firefox 3 07-23-2005 02:23 AM
Is Firefox really faster and IE Zimran Douglas Firefox 21 01-14-2005 12:38 PM
I'm considering buying a new motherboard/processor combo for faster synthesis Randy Thelen VHDL 9 04-17-2004 05:01 PM
Anything faster than stat() ? Ken Tucker Perl 1 07-08-2003 06:29 AM



Advertisments