Velocity Reviews > C++ > Fastest way to get minimum of four integer values?

# Fastest way to get minimum of four integer values?

anders.johansen@gmail.com
Guest
Posts: n/a

 04-01-2008
Hi,

I have a tight loop where I basically do this over and over (millions
of times):

int cell = <calculation>
const int above = <calculation>
if (cell > above) cell = above;
const int below = <calculation>
if (cell > below) cell = below;
const int trans = <calculation>
if (cell > trans) cell = trans;

Is there a faster way to do this?

Sincerely,
Anders

kasthurirangan.balaji@gmail.com
Guest
Posts: n/a

 04-01-2008
On Apr 1, 1:42*pm, (E-Mail Removed) wrote:
> Hi,
>
> I have a tight loop where I basically do this over and over (millions
> of times):
>
> int cell = <calculation>
> const int above = <calculation>
> if (cell > above) cell = above;
> const int below = <calculation>
> if (cell > below) cell = below;
> const int trans = <calculation>
> if (cell > trans) cell = trans;
>
> Is there a faster way to do this?
>
> Sincerely,
> * Anders

the if could be replaced like
cell=(cell > trans ? trans : cell);

By replacing i cannot say what would be the improvement. There are
other factors to be considered.
moreover, you haven't said anything about the <calculations>. you may
also want to consider running in parallel. all depends on what you are
trying to achieve, about which you haven't said much.

Thanks,
Balaji.

Martin
Guest
Posts: n/a

 04-01-2008
On Apr 1, 1:42*pm, (E-Mail Removed) wrote:
> Hi,
>
> I have a tight loop where I basically do this over and over (millions
> of times):
>
> int cell = <calculation>
> const int above = <calculation>
> if (cell > above) cell = above;
> const int below = <calculation>
> if (cell > below) cell = below;
> const int trans = <calculation>
> if (cell > trans) cell = trans;
>
> Is there a faster way to do this?
>
> Sincerely,
> * Anders

One obvious improvement is moving all <calculations> (or at least some
of them) out of the loop, if it's possible. If no - IMO there is no
considerable improvement.

Kai-Uwe Bux
Guest
Posts: n/a

 04-01-2008
http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:

> Hi,
>
> I have a tight loop where I basically do this over and over (millions
> of times):
>
> int cell = <calculation>
> const int above = <calculation>
> if (cell > above) cell = above;
> const int below = <calculation>
> if (cell > below) cell = below;
> const int trans = <calculation>
> if (cell > trans) cell = trans;
>
> Is there a faster way to do this?

Not from an algorithmic point of view. To find the minimum of four
values will require computing all of them and three additional comparisons.

On the other hand, there could be bit-fiddling techniques to find the
minimum without using branch operations. This can lead to better pipelining
on some processors. See, for instance:

http://www.cellperformance.com/artic...imination.html

YMMV

Best

Kai-Uwe Bux

Lionel B
Guest
Posts: n/a

 04-01-2008
On Tue, 01 Apr 2008 01:49:40 -0700, kasthurirangan.balaji wrote:

> On Apr 1, 1:42Â*pm, (E-Mail Removed) wrote:
>> Hi,
>>
>> I have a tight loop where I basically do this over and over (millions
>> of times):
>>
>> int cell = <calculation>
>> const int above = <calculation>
>> if (cell > above) cell = above;
>> const int below = <calculation>
>> if (cell > below) cell = below;
>> const int trans = <calculation>
>> if (cell > trans) cell = trans;
>>
>> Is there a faster way to do this?
>>
>> Sincerely,
>> Â* Anders

>
> the if could be replaced like
> cell=(cell > trans ? trans : cell);

Pointless. That would be no faster, is not (to my mind) any clearer and
potentially involves a redundant self-assignment of cell to itself
(although an optimising compiler would probably not generate code for
this).

[...]

--
Lionel B

Yannick Tremblay
Guest
Posts: n/a

 04-01-2008
In article <fssubs\$o9u\$(E-Mail Removed)>, Lionel B <(E-Mail Removed)> wrote:
>On Tue, 01 Apr 2008 01:49:40 -0700, kasthurirangan.balaji wrote:
>
>>
>> the if could be replaced like
>> cell=(cell > trans ? trans : cell);

>
>Pointless. That would be no faster, is not (to my mind) any clearer and
>potentially involves a redundant self-assignment of cell to itself
>(although an optimising compiler would probably not generate code for
>this).
>

Agree, any decent compiler should generate the exact same machine code for:

cell = ( cell > trans ? trans : cell);

and

if( cell > trans )
cell = trans;
else
cell = cell;

The fact that the first one can be written is C on one line is of no
relevance to the compiler.

Yan

Puppet_Sock
Guest
Posts: n/a

 04-01-2008
On Apr 1, 4:42*am, (E-Mail Removed) wrote:
> I have a tight loop where I basically do this over and over (millions
> of times):
>
> int cell = <calculation>
> const int above = <calculation>
> if (cell > above) cell = above;
> const int below = <calculation>
> if (cell > below) cell = below;
> const int trans = <calculation>
> if (cell > trans) cell = trans;
>
> Is there a faster way to do this?

Question: In the above psuedo code, the various <calculation>
entries use a lot of time, yes? That is, the various if-greater
tests are a tiny portion of the overall run-time, yes?
So, if the calcs happen every time through this loop, you
probably will find more opportunity for optimizing in those
calcs rather than the comparatively small potential in
the if-greater tests.

I have a standard response I trot out. Where's your stop-watch?
Where's your performance spec? First measure how much time
various parts of the code take to run. Then look to your spec
to see if that is acceptable. If you've made the spec, relax!
And don't sweat the stuff that takes a small part of the time.
Socks

anders.johansen@gmail.com
Guest
Posts: n/a

 04-01-2008
First of all, thanks to all who answered.

I was kinda hoping somebody had a magic bit-twiddeling trick up their
sleeve for this. Never hurts to ask

On Apr 1, 4:21 pm, Puppet_Sock <(E-Mail Removed)> wrote:
> On Apr 1, 4:42 am, (E-Mail Removed) wrote:
>
> > I have a tight loop where I basically do this over and over (millions
> > of times):

>
> > int cell = <calculation>
> > const int above = <calculation>
> > if (cell > above) cell = above;
> > const int below = <calculation>
> > if (cell > below) cell = below;
> > const int trans = <calculation>
> > if (cell > trans) cell = trans;

>
> > Is there a faster way to do this?

>
> Question: In the above psuedo code, the various <calculation>
> entries use a lot of time, yes? That is, the various if-greater
> tests are a tiny portion of the overall run-time, yes?
> So, if the calcs happen every time through this loop, you
> probably will find more opportunity for optimizing in those
> calcs rather than the comparatively small potential in
> the if-greater tests.

Actually the calculations are mainly table look-ups. It looks like
this:

int thismin = 100

<...>

<loop starts>

int cell = (acceptMatrix[false][source[i-1]][target[1]]) ?
distmatrix[i-1][1] : distmatrix[i-1][1]+1 ;
const int above (distmatrix[i-1][2]+1);
if (cell>above) cell=above;
const int left (distmatrix[i][1]+1);
if (cell>left) cell=left;
const int trans (
distmatrix[i-2][0] + 1 +
(!acceptMatrix[false][source[i-2]][target[1]]) +
(!acceptMatrix[true][source[i-1]][target[0]])
);
if (cell>trans) cell=trans;
distmatrix[i][2]=cell;
if (cell<thismin) thismin=cell;

<loop ends>

distmatrix is a 2-D array of int.

acceptmatrix is a 3D array of bool.

i and j are unsigned int loop indexes

source and target are const unsigned char * const

An additional question is whether the use of named const temporaries
is detrimental. Should I perhaps instead hand-inline the table
lookups, and let the compiler find the common expressions and
optimize, or...

I don't know how smart the compiler is, but I suspect "not too
bright". It's Borland C++ Builder version 6, so a 6 years old compiler
at best.

> I have a standard response I trot out. Where's your stop-watch?
> Where's your performance spec? First measure how much time
> various parts of the code take to run. Then look to your spec
> to see if that is acceptable. If you've made the spec, relax!
> And don't sweat the stuff that takes a small part of the time.
> Socks

The "stop-watch" is outside this algorithm, but currently this code
fragment dominates execution time. Changing the calculation of "trans"
from an if/then based structure to the current gobbledygook decreased
run time by approximately 15%, probably by eliminating a few branches
and a temporary or two. I compare the output to the original
implementation for a large dataset, so I am sure that behaviour was
not changed in any obvious way.I am also pursuing other ways to
decrease the data set that the algorithm iterates over, but that's an
entirely different discussion.

Of course I could just get some hotshot to hand-assy it for me, but
then I would be locked-in to that particular "magic" implementation.
Further, I doubt that solution would be as easy to hand-unroll as the
C/C++ version is.

Anders

Michael DOUBEZ
Guest
Posts: n/a

 04-02-2008
(E-Mail Removed) a écrit :
> First of all, thanks to all who answered.
>
> I was kinda hoping somebody had a magic bit-twiddeling trick up their
> sleeve for this. Never hurts to ask
>
> [snip]
>
> int thismin = 100
>
> <...>
>
> <loop starts>
>
> int cell = (acceptMatrix[false][source[i-1]][target[1]]) ?
> distmatrix[i-1][1] : distmatrix[i-1][1]+1 ;
> const int above (distmatrix[i-1][2]+1);
> if (cell>above) cell=above;
> const int left (distmatrix[i][1]+1);
> if (cell>left) cell=left;
> const int trans (
> distmatrix[i-2][0] + 1 +
> (!acceptMatrix[false][source[i-2]][target[1]]) +
> (!acceptMatrix[true][source[i-1]][target[0]])
> );

Funny that you add integers and boolean.
What value do you expect from a conversion from bool to int ?
Or is the code incomplete ?

> if (cell>trans) cell=trans;
> distmatrix[i][2]=cell;
> if (cell<thismin) thismin=cell;
>
> <loop ends>
>
> distmatrix is a 2-D array of int.
>
> acceptmatrix is a 3D array of bool.
>
> i and j are unsigned int loop indexes
>

I see no j variable in your code.

> source and target are const unsigned char * const
>
> An additional question is whether the use of named const temporaries
> is detrimental. Should I perhaps instead hand-inline the table
> lookups, and let the compiler find the common expressions and
> optimize, or...
>
> I don't know how smart the compiler is, but I suspect "not too
> bright". It's Borland C++ Builder version 6, so a 6 years old compiler
> at best.

In this case I suggest that you get a look at the assembly generated by
the compiler. And depending on the operations that seems consuming, you
might want to tinker a bit with compiler's directives.

Here is how I would rewrite the code you gave in order to minimize the
number of operation but I wouldn't expect to outsmart any modern compiler.

<loop starts>

int cell=distmatrix[i-1][1];
const char source_im1(source[i-1]); //reused later
const char target_1 (target[1]); //reused later
if(!acceptMatrix[false][src_im1][target_1])
{ //increment cell only in given case
++cell;
}

int cell2 (distmatrix[i-1][2]);
const int left (distmatrix[i][1]);
if (cell2>left) cell2=left;

int trans (distmatrix[i-2][0]);
if(!acceptMatrix[false][source[i-2]][target[1]])
{
++trans;
}
if(!acceptMatrix[true][source_im1][target_1])
{
++trans;
}
if (cell2>trans) cell2=trans;
++cell2
if (cell>cell2) cell=cell2;

distmatrix[i][2]=cell;
if (cell<thismin) thismin=cell;

<loop ends>

Michael

anders.johansen@gmail.com
Guest
Posts: n/a

 04-02-2008
Hmmm,

int cell = (acceptMatrix[false][source[i-1]][target[1]]) ?
distmatrix[i-1][1] : distmatrix[i-1][1]+1 ;
// Newval replaces above. All add one instrs. moved down
int newval (distmatrix[i-1][2]);
const int left (distmatrix[i][1]);
const int trans (
distmatrix[i-2][0] +
(!acceptMatrix[false][source[i-2]][target[1]]) +
(!acceptMatrix[true][source[i-1]][target[0]])
);
if (left < newval) newval = left;
if (trans < newval) newval = trans;
if (++newval < cell) cell=newval;
if (distmatrix[i][2]=cell < thismin) thismin=cell;

This saves two "+ 1" instructions, and replaces the remaining one with
a preincrement. The ++newval could (should?) be moved out of the if
statement for clearer code.