question about macro?

MBALOVER

 03-09-2010
Hi all,

Actually I want my code to be

for (i=0;i<N;i++)
{
array2[i*30]=array1[i];
array2[i*30+1]=array1[i];
array2[i*30+2]=array1[i];
array2[i*30+3]=array1[i];
..................
..................
..................
array2[i*30+28]=array1[i];
array2[i*30+29]=array1[i];
}

Is there any way to use macros to do this instead of listing the
assignments out manually as in the code above?
(Listing them out like this becomes very bad when I need, say,
array2[i*1000+0]=array1[i]; to ...
array2[i*1000+999]=array1[i];)

Please also note that, for reasons specific to my device, I do not want to
use another loop such as

for (i=0;i<N;i++)
for (j=0;j<30;j++)
array2[i*30+j]=array1[i];

If I write a macro as follows:

#define myMacro(array1, array2) \
for (j=0;j<30;j++) \
array2[i*30+j]=array1[i]

Does it help? I doubt it because it will be the same as the case with
two loops that I want to avoid.

Thanks a lot.

Ian Collins

 03-09-2010
MBALOVER wrote:
> Hi all,
>
> Actually I want my code to be
>
>
> for (i=0;i<N;i++)
> {
> array2[i*30]=array1[i];
> array2[i*30+1]=array1[i];
> array2[i*30+2]=array1[i];
> array2[i*30+3]=array1[i];
> ..................
> ..................
> ..................
> array2[i*30+28]=array1[i];
> array2[i*30+29]=array1[i];
> }
>
> If there anyway to use macros to do it instead for listing out
> manually in the code like above?

Forget macros, they won't help here.

(untested)

memset( array2+i*30, array1[i], 30*sizeof(array2[0]) );

--
Ian Collins

John Bode

 03-09-2010
On Mar 9, 2:32 pm, MBALOVER <(E-Mail Removed)> wrote:
> Hi all,
>
> Actually I want my code to be
>
> for (i=0;i<N;i++)
> {
> array2[i*30]=array1[i];
> array2[i*30+1]=array1[i];
> array2[i*30+2]=array1[i];
> array2[i*30+3]=array1[i];
> .................
> .................
> .................
> array2[i*30+28]=array1[i];
> array2[i*30+29]=array1[i];
>
> }
>
> If there anyway to use macros to do it instead for listing out
> manually in the code like above?
> (the listing out such as the above example will be very bad in the
> case I need, say, array2[i*1000+0]=array1[i]; to ...
> array2[i*1000+999]=array1[i]
>
> And please note that for some reasons of my device, I do not want to
> use another loop such as
>
> for (i=0;i<N;i++)
> for (j=0;j<30;j++)
> array2[i*30+j]=array1[i];
>
> If I write a macro as follows:
>
> #define myMacro (array1, array2) \
> for ( j=0;j<30;j++) \
> array2[i*30+j]=array1[i]; \
>
> Does it help? I doubt it because it will be the same as the case with
> two loops that I want to avoid.
>
> Thanks a lot.

I'm curious as to why an inner loop would be that bad.

Frankly, you're stuck; if you absolutely cannot use an inner loop,
then you'll have to write out each initialization manually. You could
write another program to automate the process for you, such as:

printf("for (i = 0; i < N; i++)\n{\n");
for (j = 0; j < K; j++) /* K is an int: the number of copies per element (30 above) */
{
    printf("    array2[i*%d+%d] = array1[i];\n", K, j);
}
printf("}\n");

and then paste that output into your code.

Another alternative would be to look into a variation of Duff's device
(see http://en.wikipedia.org/wiki/Duff's_device), where you use an
inner loop but partially unroll it.

for (i = 0; i < N; i++)
{
    size_t n = (K + 7) / 8;   /* passes through the do-while; requires K > 0 */
    size_t j = 0;             /* next offset within this block of K items */
    switch (K % 8)
    {
    case 0 : do { array2[i*K + j++] = array1[i];
    case 7 :      array2[i*K + j++] = array1[i];
    case 6 :      array2[i*K + j++] = array1[i];
    case 5 :      array2[i*K + j++] = array1[i];
    case 4 :      array2[i*K + j++] = array1[i];
    case 3 :      array2[i*K + j++] = array1[i];
    case 2 :      array2[i*K + j++] = array1[i];
    case 1 :      array2[i*K + j++] = array1[i];
             } while (--n > 0);
    }
}

Keith Thompson

 03-09-2010
MBALOVER <(E-Mail Removed)> writes:
> Actually I want my code to be
>
>
> for (i=0;i<N;i++)
> {
> array2[i*30]=array1[i];
> array2[i*30+1]=array1[i];
> array2[i*30+2]=array1[i];
> array2[i*30+3]=array1[i];
> .................
> .................
> .................
> array2[i*30+28]=array1[i];
> array2[i*30+29]=array1[i];
> }

It certainly looks like a job for a nested for loop (but see below).

> If there anyway to use macros to do it instead for listing out
> manually in the code like above?
> (the listing out such as the above example will be very bad in the
> case I need, say, array2[i*1000+0]=array1[i]; to ...
> array2[i*1000+999]=array1[i]

There's not really any way for the preprocessor, given a number N, to
generate N statements.
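
For a repeat count that is fixed when the macros are written, though, the repetition can be composed by hand from smaller macros. A minimal sketch, assuming int elements and a count of 30; the names REP2, REP5, REP10, REP30 and expand30 are made up for illustration and do not come from this thread:

```c
#include <stddef.h>

/* Each macro pastes its argument a fixed number of times. */
#define REP2(x)  x x
#define REP5(x)  x x x x x
#define REP10(x) REP5(REP2(x))
#define REP30(x) REP10(x) REP10(x) REP10(x)

/* Copy each src[i] into 30 consecutive slots of dst, with no inner loop. */
void expand30(const int *src, int *dst, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        int v = src[i];
        REP30(*dst++ = v;)   /* expands to 30 assignment statements */
    }
}
```

This only moves the typing into the macro definitions: the count is still hard-wired, so a different count needs a different composition of macros.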

> And please note that for some reasons of my device, I do not want to
> use another loop such as
>
> for (i=0;i<N;i++)
> for (j=0;j<30;j++)
> array2[i*30+j]=array1[i];

So you insist on having, say, 30 distinct assignment statements; an
equivalent loop that iterates 30 times and performs exactly the same
assignments isn't good enough.

It might help if we know *how* your device imposes this requirement.

If you just write the nested for loop, how exactly does that not work?

> If I write a macro as follows:
>
> #define myMacro (array1, array2) \
> for ( j=0;j<30;j++) \
> array2[i*30+j]=array1[i]; \
>
>
>
> Does it help? I doubt it because it will be the same as the case with
> two loops that I want to avoid.

Right, the generated code will almost certainly be the same whether
the loop is written out or generated by a macro.

What you're doing is loop unrolling. It's an optimization technique
that can result in faster but larger code. (In some cases, the
code can be slower just because it's larger, due to cache effects.)
Doing loop unrolling at the source level would usually be considered
premature optimization; a good optimizing compiler should be able
to do this for you.

But if you really need to do this, you might consider writing another
tool that generates C code from an input specification. (m4 is a
popular preprocessing tool; I don't know whether it's powerful enough
for this, but it's worth looking into.)

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson

 03-09-2010
Ian Collins <(E-Mail Removed)> writes:
> MBALOVER wrote:
>> Actually I want my code to be
>>
>>
>> for (i=0;i<N;i++)
>> {
>> array2[i*30]=array1[i];
>> array2[i*30+1]=array1[i];
>> array2[i*30+2]=array1[i];
>> array2[i*30+3]=array1[i];
>> ..................
>> ..................
>> ..................
>> array2[i*30+28]=array1[i];
>> array2[i*30+29]=array1[i];
>> }
>>
>> If there anyway to use macros to do it instead for listing out
>> manually in the code like above?

>
> Forget macros, they won't help here.
>
>
> (untested)
>
> memset( array2+i*30, array1[i], 30*sizeof(array2[0]) );

No, memset sets each byte of the target to the same value. Unless
array1 and array2 are byte arrays, it won't help here.

Something similar to memset that works on elements bigger than bytes
would do the trick. But the obvious way to implement such a function
is to use a for loop, which the OP wants to avoid for some unknown
reason.
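
One way to sketch such an element-wise fill, assuming int elements: write the first element, then let memcpy repeatedly double the filled region. The helper name fill_int is hypothetical. memcpy typically loops internally as well, so this moves the loop into the library rather than removing it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: set dst[0..count-1] to value. */
void fill_int(int *dst, size_t count, int value)
{
    size_t filled;

    if (count == 0)
        return;

    dst[0] = value;
    filled = 1;
    while (filled < count) {
        /* Copy at most as many elements as are already filled, so the
           source and destination regions never overlap. */
        size_t chunk = (filled < count - filled) ? filled : count - filled;
        memcpy(dst + filled, dst, chunk * sizeof *dst);
        filled += chunk;
    }
}
```

In the original problem this would be called once per element, for example fill_int(array2 + i*30, 30, array1[i]).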


Ian Collins

 03-09-2010
Keith Thompson wrote:
> Ian Collins <(E-Mail Removed)> writes:
>> MBALOVER wrote:
>>> Actually I want my code to be
>>>
>>>
>>> for (i=0;i<N;i++)
>>> {
>>> array2[i*30]=array1[i];
>>> array2[i*30+1]=array1[i];
>>> array2[i*30+2]=array1[i];
>>> array2[i*30+3]=array1[i];
>>> ..................
>>> ..................
>>> ..................
>>> array2[i*30+28]=array1[i];
>>> array2[i*30+29]=array1[i];
>>> }
>>>
>>> If there anyway to use macros to do it instead for listing out
>>> manually in the code like above?

>> Forget macros, they won't help here.
>>
>>
>> (untested)
>>
>> memset( array2+i*30, array1[i], 30*sizeof(array2[0]) );

>
> No, memset sets each byte of the target to the same value. Unless
> array1 and array2 are byte arrays, it won't help here.

--
Ian Collins

MBALOVER

 03-09-2010
Thanks a lot for responses.

Let me explain why I do not want to use an inner loop.

for (i=0;i<N;i++)
for (j=0;j<M;j++)
array2[i*M+j]=array1[i];

For each iteration of a for loop, we need one comparison and one
addition. On my system, each comparison takes 3 cycles and one
addition takes 1 cycle, so each iteration takes 4 cycles.

If I use an inner loop, then for each i I need 4*M additional cycles. In
my case M and N are both large numbers, and the system's time budget is
limited.

That is why I do not want to do it this way.


Ian Collins

 03-09-2010
MBALOVER wrote:

> Thanks a lot for responses.
>
> Let me explain why I do not want to use a inner loop.
>
> for (i=0;i<N;i++)
> for (j=0;j<M;j++)
> array2[i*M+j]=array1[i];
>
> For each iteration of a for loop, we need one comparison and one
> addition. On my system, each comparison takes 3 cycles and one
> addition takes 1 cycle, so each iteration takes 4 cycles.
>
> If I use a inner loop, for each i, I need 4*M additional cycles. In
> my case M , N both are large numbers and the system's time budget is
> limited.
>
> That is why I do not want to use this way.

Have you looked at the generated code with full optimisation on your
compiler? All those I've used will perform some degree of loop
unrolling, so you will only suffer an extra

(4*M)/L

cycles for each i, where L is the number of iterations inlined by the
compiler.

--
Ian Collins

bartc

 03-09-2010
MBALOVER wrote:
> Thanks a lot for responses.
>
> Let me explain why I do not want to use a inner loop.
>
> for (i=0;i<N;i++)
> for (j=0;j<M;j++)
> array2[i*M+j]=array1[i];
>
> For each iteration of a for loop, we need one comparison and one
> addition. On my system, each comparison takes 3 cycles and one
> addition takes 1 cycle, so each iteration takes 4 cycles.
>
> If I use a inner loop, for each i, I need 4*M additional cycles. In
> my case M , N both are large numbers and the system's time budget is
> limited.
>
> That is why I do not want to use this way.

How about a partially unrolled inner loop?

In the inner loop, do 10 assignments at a time, so that the loop
executes M/10 times (with any odd assignments added at the end). Then the
overhead might be only 4*M/10 cycles, and you only ever have to write out,
in full, 10 assignments (or however many you want to do in one go).
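
A minimal sketch of that idea, assuming int elements; the function name expand_unrolled and its parameter names are invented for illustration:

```c
#include <stddef.h>

/* Write m copies of each array1[i] into array2, 10 assignments per inner pass. */
void expand_unrolled(const int *array1, int *array2, size_t n, size_t m)
{
    size_t i, j;

    for (i = 0; i < n; i++) {
        int v = array1[i];
        int *p = array2 + i * m;

        for (j = m / 10; j > 0; j--) {   /* full groups of 10 */
            p[0] = v; p[1] = v; p[2] = v; p[3] = v; p[4] = v;
            p[5] = v; p[6] = v; p[7] = v; p[8] = v; p[9] = v;
            p += 10;
        }
        for (j = m % 10; j > 0; j--)     /* the odd assignments left over */
            *p++ = v;
    }
}
```

The loop test and counter update now run about m/10 times per element instead of m times, at the cost of ten written-out assignments.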

--
Bartc

Eric Sosman

 03-09-2010
On 3/9/2010 3:32 PM, MBALOVER wrote:
> Hi all,
>
> Actually I want my code to be
>
>
> for (i=0;i<N;i++)
> {
> array2[i*30]=array1[i];
> array2[i*30+1]=array1[i];
> array2[i*30+2]=array1[i];
> array2[i*30+3]=array1[i];
> .................
> .................
> .................
> array2[i*30+28]=array1[i];
> array2[i*30+29]=array1[i];
> }
>
> If there anyway to use macros to do it instead for listing out
> manually in the code like above?

Use a second loop nested inside the first.

> (the listing out such as the above example will be very bad in the
> case I need, say, array2[i*1000+0]=array1[i]; to ...
> array2[i*1000+999]=array1[i]
>
> And please note that for some reasons of my device, I do not want to
> use another loop such as
>
> for (i=0;i<N;i++)
> for (j=0;j<30;j++)
> array2[i*30+j]=array1[i];

Use a second loop nested inside-- oh, wait, sorry. It's
the right solution, but if you don't want to use it ... What
are these mysterious "reasons" of yours? Tell us about them,
and maybe somebody will have a better idea than writing line
after line after repetitive line of code.

> If I write a macro as follows:
>
> #define myMacro (array1, array2) \
> for ( j=0;j<30;j++) \
> array2[i*30+j]=array1[i]; \
>
> Does it help? I doubt it because it will be the same as the case with
> two loops that I want to avoid.

It does the right thing, which is to use a second loop-- oh,
wait, we've been through that already.

The preprocessor has no iterative constructs.

--
Eric Sosman
(E-Mail Removed)