Velocity Reviews > C++ > About parallel sorting...

aminer
Guest
Posts: n/a

 09-04-2012
Hello,

http://www.redgenes.com/Lecture-Sorting.pdf

I have tried to simulate this parallel partition method,
but I don't think it will scale, because it requires a merging step
that is essentially an array-copy operation. This array copy is
expensive compared to the integer comparison inside the partition
function, and it is still expensive compared to a string comparison
inside the partition function. So, since it does not scale, I have
abandoned the idea of implementing this parallel partition method
in my parallel quicksort.

http://www.economyinformatics.ase.ro.../EN4/alecu.pdf

And I have implemented this algorithm just to see its performance,
and I noticed that the serial version is 8 times slower than the
merge function found in the serial mergesort algorithm. Eight times
slower is too slow.

So the only way to get a somewhat better parallel sorting algorithm
is to use the following one:

http://www.drdobbs.com/parallel/para...arallel%2Bsort

The idea is simple:

Let's assume we want to merge sorted arrays X and Y. Select the
median element X[m] of X. Elements in X[ .. m-1] are less than or
equal to X[m]. Using binary search, find the index k of the first
element in Y greater than X[m]. Thus Y[ .. k-1] are less than or
equal to X[m] as well, elements in X[m+1 .. ] are greater than or
equal to X[m], and Y[k .. ] are greater. So merge(X, Y) can be
defined as
concat(merge(X[ .. m-1], Y[ .. k-1]), X[m], merge(X[m+1 .. ], Y[k .. ])).
Now we can recursively, in parallel, compute merge(X[ .. m-1], Y[ .. k-1])
and merge(X[m+1 .. ], Y[k .. ]) and then concatenate the results.
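The idea above can be sketched in C++ like this. This is my own minimal sketch, not the article's implementation: the function name pmerge, the cutoff constant, and the use of std::async are all my choices, made just to show the median split, the binary search for k, and the two independent recursive merges.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

static const std::size_t kCutoff = 256; // below this, merge serially

// Merge sorted ranges X[x0..x1) and Y[y0..y1) into out starting at o.
void pmerge(const std::vector<int>& X, std::size_t x0, std::size_t x1,
            const std::vector<int>& Y, std::size_t y0, std::size_t y1,
            std::vector<int>& out, std::size_t o)
{
    // Always take the median from the larger range, so the split is useful.
    if (x1 - x0 < y1 - y0) {
        pmerge(Y, y0, y1, X, x0, x1, out, o);
        return;
    }
    if (x1 - x0 == 0) return;                  // both ranges are empty
    if ((x1 - x0) + (y1 - y0) <= kCutoff) {    // small input: serial merge
        std::merge(X.begin() + x0, X.begin() + x1,
                   Y.begin() + y0, Y.begin() + y1,
                   out.begin() + o);
        return;
    }
    std::size_t m = x0 + (x1 - x0) / 2;        // median index in X
    // k = index of the first element of Y greater than X[m] (binary search)
    std::size_t k = std::upper_bound(Y.begin() + y0, Y.begin() + y1, X[m])
                    - Y.begin();
    // X[m] is preceded by exactly (m-x0) + (k-y0) smaller-or-equal elements.
    out[o + (m - x0) + (k - y0)] = X[m];
    // The two sub-merges touch disjoint parts of out, so run one in parallel.
    auto left = std::async(std::launch::async, [&] {
        pmerge(X, x0, m, Y, y0, k, out, o);
    });
    pmerge(X, m + 1, x1, Y, k, y1, out, o + (m - x0) + (k - y0) + 1);
    left.get();
}
```

Note that the recursion depth, and therefore the number of tasks, depends on where k lands in Y, which is exactly the distribution sensitivity discussed below in this thread.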

Thank you,
Amine Moulay Ramdane.

aminer
Guest
Posts: n/a

 09-04-2012
Hello,

There is another method for parallel partition; here it is:

http://www.cs.sunysb.edu/~rezaul/Spr...-lecture-9.pdf

but as you will notice it is still too expensive, because you have to
create three arrays and copy into them:

3. array B[0 : n-1], lt[0 : n-1], gt[0 : n-1]

You can use SIMD instructions in the parallel-prefix-sum function:

8. lt[0 : n-1] <- Par-Prefix-Sum( lt[0 : n-1], + )
But I think the algorithm is still expensive on a quad-core,
eight-core, or even a 16-core machine; you would need far more than
16 cores to benefit from this method.
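To make the cost concrete, here is a serial C++ sketch of that prefix-sum partition. This is my own illustration, not the lecture's code: the function name prefix_partition is invented, and each step below (the flag fill, the two prefix sums, the scatter) is written as a plain loop, though in the lecture each is a data-parallel step. It shows the three temporary arrays B, lt, and gt and the copying discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Partition A around the value `pivot` using prefix sums.
// Returns the index of the first pivot-valued slot.
std::size_t prefix_partition(std::vector<int>& A, int pivot)
{
    const std::size_t n = A.size();
    std::vector<int> B(A);                        // 1. working copy
    std::vector<std::size_t> lt(n), gt(n);
    for (std::size_t i = 0; i < n; ++i) {         // 2. flags (parallel loop)
        lt[i] = B[i] < pivot ? 1 : 0;             //    goes left of pivot
        gt[i] = B[i] > pivot ? 1 : 0;             //    goes right of pivot
    }
    // 3. lt <- Par-Prefix-Sum(lt, +), likewise gt; inclusive scans turn
    //    the flags into 1-based destination offsets.
    std::partial_sum(lt.begin(), lt.end(), lt.begin());
    std::partial_sum(gt.begin(), gt.end(), gt.begin());
    const std::size_t nless    = n ? lt[n - 1] : 0;
    const std::size_t ngreater = n ? gt[n - 1] : 0;
    for (std::size_t i = 0; i < n; ++i) {         // 4. scatter (parallel loop)
        if (B[i] < pivot)      A[lt[i] - 1] = B[i];
        else if (B[i] > pivot) A[n - ngreater + gt[i] - 1] = B[i];
    }
    for (std::size_t i = nless; i < n - ngreater; ++i)
        A[i] = pivot;                             // middle run of equal keys
    return nless;
}
```

Every element is touched at least three times (copy into B, flag, scatter) on top of the two scans, which is why a single serial comparison-and-swap partition wins unless many cores share the work.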

Thank you,
Amine Moulay Ramdane.


aminer
Guest
Posts: n/a

 09-04-2012
Hello,

"Parallel hybrid merge algorithm was developed that outperformed
sequential simple merge as well as STL merge by 0.9-5.8 times overall
and by over 5 times for larger arrays"

http://www.drdobbs.com/parallel/para...9204454?pgno=3

As you may have noticed, this paper forgot to mention that the
method's performance depends on the distribution of the data:

"Select X[m] median element in X. Elements in X[ .. m-1] are less than
or equal to X[m]. Using binary search find index k of the first element in
Y greater than X[m]."

So if the "median element" of X is not near or equal to the "median
element" of Y, the two recursive merges will be unbalanced, this
method will have poor parallel performance, and it will not scale.

Thank you,
Amine Moulay Ramdane.
