Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > code optimization

Reply
Thread Tools

code optimization

 
 
syed
Guest
Posts: n/a
 
      09-24-2005
I want clues to optimize a function that copies elements of a
2-dimensional array to another. Is there a way to optimize this piece
of code so that it runs faster?



#define RIDX(i,j,n) ((i)*(n)+(j))

void myrotate(int dim, pixel *src, pixel *dst)
{

for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
dst[RIDX(dim-1-j, i, dim)] = src[RIDX(i, j, dim)];


}

 
Reply With Quote
 
 
 
 
Richard Tobin
Guest
Posts: n/a
 
      09-24-2005
In article <(E-Mail Removed). com>,
syed <(E-Mail Removed)> wrote:

>I want clues to optimize a function that copies elements of a
>2-dimensional array to another. Is there a way to optimize this piece
>of code so that it runs faster?


Probably not much. You are changing the order of the elements, so
you can't use memcpy().

>#define RIDX(i,j,n) ((i)*(n)+(j))


Depending on what you are doing with the array, you might be able to
avoid copying it altogether, and instead use a different version of
RIDX to access the elements in the new order.

-- Richard
 
Reply With Quote
 
 
 
 
Ulrich Eckhardt
Guest
Posts: n/a
 
      09-24-2005
One thing up front: C itself doesn't define anything like performance. I'll
somewhat assume a typical desktop CPU.

syed wrote:
> I want clues to optimize a function that copies elements of a
> 2-dimensional array to another. Is there a way to optimize this piece
> of code so that it runs faster?
>
>
>
> #define RIDX(i,j,n) ((i)*(n)+(j))


Why? Why not use an inline function?

> void myrotate(int dim, pixel *src, pixel *dst)


Okay, a few points here:
- If 'dim' doesn't change, you could make it a constant giving the compiler
possibility to optimise code better.
- You're not going to modify 'src', so make that a pointer to const pixel.
Note that this won't necessarily influence performance.
- There's the 'restrict' property in C99 which (AFAIK) can be used to tell
the compiler that the two ranges don't overlap. This might make things
faster as the compiler can prefetch the read-values earlier.

> {
>
> for (i = 0; i < dim; i++)


Hmm, where is 'i'? Making it a global doesn't speed up things. Also:
assert(src && dst && (dim>0));


> for (i = 0; i < dim; i++)
> for (j = 0; j < dim; j++)
> dst[RIDX(dim-1-j, i, dim)] = src[RIDX(i, j, dim)];


Hmm, not much that can be done here. What I'd try is to compute a pointer
to the row in the inner loop in order not to redo this computation for
every pixel. Maybe a "duff's device" would help, but I'd rather expect
current compilers to unroll loops themselves.

Lastly, the probably fastest way is when this is simply not done - instead
you could compute the index differently when accessing the matrix.

Uli

 
Reply With Quote
 
Syed
Guest
Posts: n/a
 
      09-24-2005
Hi

Thanks for your reply, it was very helpful. I would like to iterate through
the 2-dimensional array using a single loop (instead of two). Is it
possible?

Also, is there any room for loop-unrolling in the code snippet?

Thanks again.


Syed Ijaz

-----
#define RIDX(i,j,n) ((i)*(n)+(j))

void myrotate(int dim, pixel *src, pixel *dst)
{

for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
dst[RIDX(dim-1-j, i, dim)] = src[RIDX(i, j, dim)];


}


 
Reply With Quote
 
Malcolm
Guest
Posts: n/a
 
      09-24-2005
"syed" <(E-Mail Removed)> wrote
>
>I want clues to optimize a function that copies elements of a
> 2-dimensional array to another. Is there a way to optimize this piece
> of code so that it runs faster?
>
>
>
> #define RIDX(i,j,n) ((i)*(n)+(j))
>
> void myrotate(int dim, pixel *src, pixel *dst)
> {
>
> for (i = 0; i < dim; i++)
> for (j = 0; j < dim; j++)
> dst[RIDX(dim-1-j, i, dim)] = src[RIDX(i, j, dim)];
>
>
> }
>


You are calculating i and j * dim on each iteration.
If you maintain an offset index and increment or decrement by dim on each
pass, you might save a few instructions.


 
Reply With Quote
 
Richard Tobin
Guest
Posts: n/a
 
      09-24-2005
In article <dh3skm$h0s$(E-Mail Removed)-infra.bt.com>,
Malcolm <(E-Mail Removed)> wrote:

>> #define RIDX(i,j,n) ((i)*(n)+(j))


>> dst[RIDX(dim-1-j, i, dim)] = src[RIDX(i, j, dim)];


>You are calculating i and j * dim on each iteration.


A reasonable compiler will avoid this.

In general modern compilers are likely to be better at this kind of
micro-optimisation than programmers, because they know the
characteristics of the target processor. Hand-optimising for
one architecture may make things worse on another.

-- Richard
 
Reply With Quote
 
Walter Bright
Guest
Posts: n/a
 
      09-25-2005

"Richard Tobin" <(E-Mail Removed)> wrote in message
news:dh3ure$22go$(E-Mail Removed)...
> In article <dh3skm$h0s$(E-Mail Removed)-infra.bt.com>,
> Malcolm <(E-Mail Removed)> wrote:
>
> >> #define RIDX(i,j,n) ((i)*(n)+(j))

>
> >> dst[RIDX(dim-1-j, i, dim)] = src[RIDX(i, j, dim)];

>
> >You are calculating i and j * dim on each iteration.

>
> A reasonable compiler will avoid this.
>
> In general modern compilers are likely to be better at this kind of
> micro-optimisation than programmers, because they know the
> characteristics of the target processor. Hand-optimising for
> one architecture may make things worse on another.


Right. I'd add that it makes sense to look at the optimized assembler
produced by one's particular compiler, as it may already be doing quite a
few loop optimizations on it. Hand doing the optimizations may actually wind
up defeating the compiler ones and make things worse. Many compilers do
very, very well at optimizing loops such as this.

-Walter Bright
www.digitalmars.com free C, C++, D programming language compilers


 
Reply With Quote
 
Old Wolf
Guest
Posts: n/a
 
      09-26-2005
Ulrich Eckhardt wrote:
> One thing up front: C itself doesn't define anything like performance.
> I'll somewhat assume a typical desktop CPU.
>
> syed wrote:
>> I want clues to optimize a function that copies elements of a
>> 2-dimensional array to another. Is there a way to optimize this piece
>> of code so that it runs faster?
>>
>> #define RIDX(i,j,n) ((i)*(n)+(j))

>
> Why? Why not use an inline function?


Inline functions are a C99 feature, although many (but not all)
C89 compilers support them. One compiler I use will recognize the
keyword, but never actually inline any function (which is
conforming behaviour, in fact).

Macros may be 'evil', but can also be useful IMHO.

 
Reply With Quote
 
Sascha Springer
Guest
Posts: n/a
 
      09-26-2005
syed schrieb:
> I want clues to optimize a function that copies elements of a
> 2-dimensional array to another. Is there a way to optimize this piece
> of code so that it runs faster?


This depends on many things like processor type (RISC, CISC, VLIW,
etc.), number of registers, number of pipelines, cache sizes
(instruction, data), instruction set (SIMD, DSP) and available resources
(RAM) of the architecture involved in the given scenario.

So in general we cannot give you hints about how to optimize your code
when we don't know the details. Anyway, I'm afraid discussing such
details would be even more OT in this group.

However, the compiler for your specific platform should give you good
optimization and profiling options to find a way to improve the
performance of your code. Probably there's an optimized library for such
problems provided by a third party vendor.

Regards
Sascha
 
Reply With Quote
 
Kenneth Brody
Guest
Posts: n/a
 
      09-26-2005
syed wrote:
>
> I want clues to optimize a function that copies elements of a
> 2-dimensional array to another. Is there a way to optimize this piece
> of code so that it runs faster?
>
> #define RIDX(i,j,n) ((i)*(n)+(j))
>
> void myrotate(int dim, pixel *src, pixel *dst)
> {
>
> for (i = 0; i < dim; i++)
> for (j = 0; j < dim; j++)
> dst[RIDX(dim-1-j, i, dim)] = src[RIDX(i, j, dim)];
> }


In the "src[RIDX(i,j,dim)]" expression, "i" and "dim" don't change for the
duration of the inner for-loop. Since your RIDX macro will cause these two
same values to be multiplied each time through the loop, you can optimize
this away by calculating this only once:

for (i = 0; i < dim; i++)
{
int temp = i*dim;
for (j = 0; j < dim; j++)
dst[RIDX(dim-1-j, i, dim)] = src[temp+j];
}

You can do this one better by not calcluating the addition each time,
either:

for (i = 0; i < dim; i++)
{
pixel *src2 = &src[RIDX(i,0,dim)];
for (j = 0; j < dim; j++)
dst[RIDX(dim-1-j, i, dim)] = src2[j];
}

A similar optimization can be done on the dst[] array by realizing that the
subscript will be decrementing by dim each time through the loop. Calculate
the initial pointer as "&dst[RIDX(dim-1,i,dim)]" and then decrement that
pointer by "dim" each time through the loop.

for (i = 0; i < dim; i++)
{
pixel *src2 = &src[RIDX(i,0,dim)];
pixel *dst2 = &dst[RIDX(dim-1,i,dim)];
for (j = 0; j < dim; j++)
{
*dst2 = *src2++;
dst2 -= dim;
}
}

You have now eliminated the need for almost all of the multiplcations,
going from dim^2 to dim*2.

--
+-------------------------+--------------------+-----------------------------+
| Kenneth J. Brody | www.hvcomputer.com | |
| kenbrody/at\spamcop.net | www.fptech.com | #include <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------------+
Don't e-mail me at: <(E-Mail Removed)>

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Zero Optimization and Sign Optimization??? Ravikiran C Programming 22 11-24-2008 03:19 AM
optimization of vhdl code ashu VHDL 5 05-09-2006 07:12 PM
Compiler code optimization: see code below joshc C Programming 14 01-14-2005 01:02 AM
Optimization - any idea of actual machine code? Russell Wallace Java 7 06-11-2004 08:53 AM
Re: Code optimization... Pete Wright ASP .Net 2 07-06-2003 12:24 AM



Advertisments