On 3/14/2011 12:10 PM, David Mathog wrote:
> Assume there are N (around 10) functions like:
> func1(void *, void*, int )
> ...
> funcN(void*,void*,int)
>
> A subset of these are to be called in the inner loop of a program,
> conditionally, but always in numerical order. That is, it might be
> 1,3,5, or 6, or 9,10, but never 2,1 or 2,3,2. I can think of three
> ways to code this (here "args" is filled in with the appropriate
> values using simple code that isn't
> relevant to the point under discussion here):
>
> 1),
> if(test1)func1(args);
> if(test2)func2(args);
> ...
> if(testN)funcN(args);
>
> 2)
> store function pointers in a list and do:
> for(i=0;i<listlen;i++){
> list[i](args); /* list[I] holds function pointer to funcI */
> }
>
> 3.
> use an enum selector with entries {USEFUNC1,USEFUNC2,... USEFUNCN}
> and store those selectors in a list:
>
> for(i=0;i<listlen;i++){
> switch (list[i]){
> case USEFUNC1: func1(args); break;
> case USEFUNC2: func2(args); break;
> ...
> case USEFUNCN: funcN(args); break;
> }
> }
>
> The functions all perform relatively small operations and could be
> inlined. Except, I don't think they can be for (2), because of the
> function pointers. Since this is in the inner loop, the resulting
> function call overhead is likely to be significant compared to the
> inlined access. If case (3) always reduced to the equivalent of a
> Fortran computed goto it would be faster than (1), since it could skip
> the unnecessary if tests. However, I don't think the standard
> guarantees that, and it might well reduce to the same code. In any
> case, it looks like (3) is likely to be the fastest, and in any case,
> should not be slower than (1). (2) would be fastest (never any extra
> logic tests) if the function pointers could be inlined, but is that
> possible in this case?
>
> Are there any other options for this situation?
Sure. For example,
switch (bitmask_of_funcs_to_call) {
case 1:
func1(args);
break;
case 2:
func2(args);
break;
case 3:
func1(args);
func2(args);
break;
case 4:
func3(args);
break;
...
Of course, with "around 10" functions the number of cases in the
switch might stress the compiler a little ... You could hammer
it back down to semi-reasonable size with a sequence of switch'es
on subsets of the bits:
switch (bitmask_of_funcs_to_call & 07) ...
switch (bitmask_of_funcs_to_call & 070) ...
switch (bitmask_of_funcs_to_call & 0700) ...
...
The technical term for this kind of transformation is "Dumb Idea."
Stick with your original #1. If the tests are expensive and their
results don't change during the loop, you might precompute them:
int flag1 = test1();
int flag2 = test2();
...
while (looping) {
if (flag1) func1(args);
if (flag2) func2(args);
...
If the test results *do* change during the loop, then please don't
overlook the cost of maintaining list[] in proposals #2 and #3.
Also, note that #2 and #3 eliminate the specific tests you're so
worried about, but introduce new tests of their own like "Should
I iterate again?" and "Is the switch value in range of the cases?"
--
Eric Sosman
d