Velocity Reviews > What the fastest algorithm for find max/min in an array of integers?

# What the fastest algorithm for find max/min in an array of integers?

Ben Bacarisse
Guest
Posts: n/a

 04-25-2008
James Harris <(E-Mail Removed)> writes:

> On 24 Apr, 21:28, Eugeny Myunster <(E-Mail Removed)> wrote:
>> I know, only simple one:

> ...
>> for(i=0;i<12;i++)
>> {
>> if(arr[max]<arr[i])
>> max=i;
>> }
>> for(i=0;i<12;i++)
>> {
>> if(arr[min]>arr[i])
>> min=i;
>> }

>
> In addition to the comments of others try counting down as it can
> translate to more efficient machine code. If the array has at least
> one item initialise max and min to the first item such as the
> following to replace your code above.
>
> max = min = arr[0];
> for (i = 12; --i; {
> if (arr[i] > max) max = arr[i];
> if (arr[i] < min) min = arr[i];
> }

It is common, when the first element is used like this, to make the loop:

if (arr[i] > max)
max = arr[i];
else if (arr[i] < min)
min = arr[i];

since no element that is > max can also be < min.

--
Ben.

James Harris
Guest
Posts: n/a

 04-26-2008
On 26 Apr, 00:44, Ben Bacarisse <(E-Mail Removed)> wrote:

....

> > max = min = arr[0];
> > for (i = 12; --i; {
> > if (arr[i] > max) max = arr[i];
> > if (arr[i] < min) min = arr[i];
> > }

>
> It is common, when the first element is used like this, to make the loop:
>
> if (arr[i] > max)
> max = arr[i];
> else if (arr[i] < min)
> min = arr[i];
>
> since no element that is > max can also be < min.

Interesting idea and it makes sense. I'm not sure it will be faster,
though. In particular there is no dependable pattern to the first
branch. On a modern CPU the branch prediction may have trouble
guessing which way to go. Where it guesses wrong there can be a costly
delay while the pipeline catches up again.

Without going in to esoteric stuff like cache prefetching here is a
resonably fast version of code which could come from my C, above, on a
modern x86 CPU. The cmov instruction avoids the need for any branch at
all. As you can see the conditional statements

if (arr[i] > max) max = arr[i];
if (arr[i] < min) min = arr[i];

can be rendered into just four instructions which should take about
half a cycle each.

;Find min and max of an array
;Uses
; ebx = array index
; ebp = copy of the element
; ecx = min
; edx = max

mov ecx, [arr] ;min = arr.0
mov edx, ecx ;max = arr.0
mov ebx, 11 ;i = 11

next_iter:
mov ebp, [arr + ebx * 4] ;Fetch this element

;Carry out the min and max updates
; if (arr[i] < min) min = arr[i];
; if (arr[i] > max) max = arr[i];
cmp ecx, ebp ;if min > element
cmovg ecx, ebp ;min = element
cmp edx, ebp ;if max < element
cmovl edx, ebp ;max = element

dec ebx
jnz next_iter

--

user923005
Guest
Posts: n/a

 04-26-2008
On Apr 25, 4:44*pm, Ben Bacarisse <(E-Mail Removed)> wrote:
> James Harris <(E-Mail Removed)> writes:
> > On 24 Apr, 21:28, Eugeny Myunster <(E-Mail Removed)> wrote:
> >> I know, only simple one:

> > ...
> >> * * * * for(i=0;i<12;i++)
> >> * * * * {
> >> * * * * * * * * if(arr[max]<arr[i])
> >> * * * * * * * * * * * * max=i;
> >> * * * * }
> >> * * * * for(i=0;i<12;i++)
> >> * * * * {
> >> * * * * * * * * if(arr[min]>arr[i])
> >> * * * * * * * * * * * * min=i;
> >> * * * * }

>
> > In addition to the comments of others try counting down as it can
> > translate to more efficient machine code. If the array has at least
> > one item initialise max and min to the first item such as the
> > following to replace your code above.

>
> > max = min = arr[0];
> > for (i = 12; --i; {
> > * if (arr[i] > max) max = arr[i];
> > * if (arr[i] < min) min = arr[i];
> > }

>
> It is common, when the first element is used like this, to make the loop:
>
> * *if (arr[i] > max)
> * * * *max = arr[i];
> * *else if (arr[i] < min)
> * * * *min = arr[i];
>
> since no element that is > max can also be < min.

Branches can be expensive, so here is a truly awful trick to {possibly
- YMMV} speed things up a hair:

#include <stdlib.h>
/*
"Introduction to Algorithms"
Cormen, Leiserson, Rivest
pp. 186,187
ISBN: 0-07-013143-0

Simultaneous min & max
using only 3*N/2 comparisons

Written by Dann Corbit
9/25/2007
Donated to the public domain
*/
#ifdef e_type_LONG_DOUBLE
typedef long double e_type;
#elif defined(e_type_DOUBLE)
typedef double e_type;
#elif defined(e_type_FLOAT)
typedef float e_type;
#elif defined(e_type_UNSIGNED_LONG_LONG)
typedef unsigned long long e_type;
#elif defined(e_type_LONG_LONG)
typedef long long e_type;
#elif defined(e_type_UNSIGNED_LONG)
typedef unsigned long e_type;
#elif defined(e_type_LONG)
typedef long e_type;
#elif defined(e_type_UNSIGNED)
typedef unsigned e_type;
#elif defined(e_type_INT)
typedef int e_type;
#elif defined(e_type_SHORT)
typedef short e_type;
#elif defined(e_type_UNSIGNED_SHORT)
typedef unsigned e_type;
#elif defined(e_type_CHAR)
typedef char e_type;
#elif defined(e_type_UNSIGNED_CHAR)
typedef unsigned char e_type;
#elif defined (__cplusplus)
template < class e_type > // works with stl string class etc...
#endif

/*
Computation of min and max using pairwise comparison.
*/
void minmax(
e_type * a, // input array
size_t arr_size, // array length
e_type * min_e, // smallest thing found
e_type * max_e // biggest thing found
)
{
e_type min_et;
e_type max_et;
size_t i,
n;
if (arr_size % 2) {
min_et = a[0];
max_et = a[0];
n = 1;
} else {
if (a[0] > a[1]) {
max_et = a[0];
min_et = a[1];
} else {
min_et = a[0];
max_et = a[1];
}
n = 2;
}
for (i = n; i < arr_size; i += 2) {

if (a[i] > a[i + 1]) {
max_et = max_et > a[i] ? max_et : a[i];
min_et = min_et < a[i + 1] ? min_et : a[i + 1];
} else {
max_et = max_et > a[i + 1] ? max_et : a[i + 1];
min_et = min_et < a[i] ? min_et : a[i];
}
}
*min_e = min_et;
*max_e = max_et;
}

void minmaxb(
e_type * a, // input array
size_t arr_size, // array length
e_type * min_e, // smallest thing found
e_type * max_e // biggest thing found
)
{
e_type Max[2], Min[2];
size_t i;
Max[1] = Min[1] = a[0];
for (i = 1; i < arr_size; i++) {
Max[(a[i] > Max[1])] = a[i];
Min[(a[i] < Min[1])] = a[i];
}
*min_e = Min[1];
*max_e = Max[1];
}

#if defined( UNIT_TEST ) && (defined (e_type_DOUBLE) ||
defined( __cplusplus))
#include <stdio.h>
#include <time.h>

char string[32767];
double foo[32767];
int main(void)
{
size_t i = 0;
size_t index;
double dmin=0,
dmax=0;
clock_t start,
end;
start = clock();
while (fgets(string, sizeof string, stdin)) {
foo[i++] = atof(string);
if (i > 32766)
break;
}
end = clock();
printf("Reading the data took %f seconds\n", (end - start) * 1.0 /
CLOCKS_PER_SEC);
start = clock();
for (index = 0; index < 10000; index++)
minmax(foo, i, &dmin, &dmax);
end = clock();
printf("10000 finds of min and max: min=%f, max=%f, time = %f\n",
dmin, dmax, (end - start) * 1.0 / CLOCKS_PER_SEC);
start = clock();
for (index = 0; index < 10000; index++)
minmaxb(foo, i, &dmin, &dmax);
end = clock();
printf("10000 finds of min and max: min=%f, max=%f, time = %f\n",
dmin, dmax, (end - start) * 1.0 / CLOCKS_PER_SEC);
return 0;
}
#endif
/*

Seems like a lot of fuss for: (1.906 - 1.797)/10000 = 0.0000109 second

C:\tmp>dir n*.txt
Volume in drive C has no label.
Volume Serial Number is 0890-87CA

Directory of C:\tmp

04/08/2008 04:13 PM 47,206,979 n.txt
1 File(s) 47,206,979 bytes
0 Dir(s) 4,932,272,128 bytes free
C:\tmp>cl /W4 /Ox /DUNIT_TEST /De_type_DOUBLE minmax.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762
for 80x86

minmax.c
Microsoft (R) Incremental Linker Version 8.00.50727.762

/out:minmax.exe
minmax.obj

C:\tmp>minmax < n.txt
Reading the data took 0.031000 seconds
10000 finds of min and max: min=0.000043, max=20920.000000, time =
1.906000
10000 finds of min and max: min=0.000043, max=20920.000000, time =
1.797000
*/

Harald van Dĳk
Guest
Posts: n/a

 04-26-2008
On Fri, 25 Apr 2008 18:19:51 -0700, Dann Corbit wrote:
> There is a cheesy thing which can be done if branch misprediction is
> expensive.
>
> int Max[2]={0}, Min[2] = {0};
> for (i = 0; i < arr_size; i++) {
> Max[(arr[i] > Max[1])] = arr[i];
> Min[(arr[i] < Max[1])] = arr[i];

I think you meant to write

Min[(arr[i] < Min[1])] = arr[i];

> }
>
> Max[0] and Min[0] contain "who cares what" and Max[1] and Min[1] contain
> maximum and minimum.

Ben Bacarisse
Guest
Posts: n/a

 04-26-2008
"Dann Corbit" <(E-Mail Removed)> writes:

> "Ben Bacarisse" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...

>> if (arr[i] > max)
>> max = arr[i];
>> else if (arr[i] < min)
>> min = arr[i];

<snip>
> There is a cheesy thing which can be done if branch misprediction is
> expensive.

Of course there will be data sets for which the branch prediction will
always be correct (unless it is a very unusual system).

> int Max[2]={0}, Min[2] = {0};
> for (i = 0; i < arr_size; i++) {
> Max[(arr[i] > Max[1])] = arr[i];
> Min[(arr[i] < Max[1])] = arr[i];
> }

Cheesy indeed!

<snip>
> So much for writing code that is as clear as possible and communicates the
> algorithm best.

It seems strange to me that, with computer far faster than I ever
imagined, we should still be contemplating altering source code to
squeeze performance from programs. I suppose it will never stop,
since so many people want code that is as fast as possible, rather
than fast enough.

--
Ben.

James Harris
Guest
Posts: n/a

 04-26-2008
On 26 Apr, 18:28, Ben Bacarisse <(E-Mail Removed)> wrote:

....

> It seems strange to me that, with computer far faster than I ever
> imagined, we should still be contemplating altering source code to
> squeeze performance from programs. I suppose it will never stop,
> since so many people want code that is as fast as possible, rather
> than fast enough.

Agreed. And yet despite the amazing power of modern machines there's a
detach from user experience in many cases. I suspect a small part of
this is user interface and the rest is plain old inefficiency. Maybe
the seeking for speed is happening in the wrong places. Not sure,
though. It is a /big/ disconnect.

--

user923005
Guest
Posts: n/a

 04-28-2008
On Apr 26, 3:22*pm, James Harris <(E-Mail Removed)>
wrote:
> On 26 Apr, 18:28, Ben Bacarisse <(E-Mail Removed)> wrote:
>
> ...
>
> > It seems strange to me that, with computer far faster than I ever
> > imagined, we should still be contemplating altering source code to
> > squeeze performance from programs. *I suppose it will never stop,
> > since so many people want code that is as fast as possible, rather
> > than fast enough.

>
> Agreed. And yet despite the amazing power of modern machines there's a
> detach from user experience in many cases. I suspect a small part of
> this is user interface and the rest is plain old inefficiency. Maybe
> the seeking for speed is happening in the wrong places. Not sure,
> though. It is a /big/ disconnect.

I agree with both sentiments. The awful gyrations in the code samples
I provided produce such a small improvement it would be virtually
impossible to measure on a single pass over the data (reading the data
is *by far* the slowest step anyway). Performance optimizations
should occur if and only if the code does not meet specification. If
the code does not meet spec, before anything else, it should be
profiled. After profiling, the thing that should be attacked {if
possible} is the fundamental underlying algorithm.

On the other, other hand, it is a good idea to write code that uses
appropriate and fast algorithms. If we write code intially that uses
the best choice of algorithm we are far less likely to have
performance problems that require all the other performance
enhancement steps.