![]() |
|
|
|||||||
![]() |
VHDL - How much time does it need to sort 1 million random 64-bit/32-bit integers? |
|
|
Thread Tools | Search this Thread |
|
|
#1 |
|
Hi,
Can I find a website that lists the time for sorting 1 million 64-bit/32-bit random data using the fastest software sorting algorithm in a PC? Thank you. Weng Weng Tianxiang |
|
|
|
|
#2 |
|
Posts: n/a
|
Weng Tianxiang said:
> Hi, > Can I find a website that lists the time for sorting 1 million > 64-bit/32-bit random data using the fastest software sorting algorithm > in a PC? Apparently not. Note that, even if you could, such results would be highly hardware-dependent (although the /relative/ ordering of various sorting algorithms should be fairly consistent on any given platform). PCs vary wildly in performance characteristics. The first PC I used was an Atari ST, running at - what was it, 8 MHz? The first IBM-compatible I used was an 8088 running at - er, I forget, but it was in the 8-12MHz range, I think. The first PC I /owned/ was an 80386-SX running at 20MHz. Whoa, that baby was *fast*. Currently, I'm using a PIII which is probably a brazillion times faster than the 386. The term "a PC" doesn't mean a lot when it comes to benchmarking. I suggest you just write up the algorithms yourself, or grab them off the Net, and try them on your own hardware. Beware of the results being corrupted by caching. Do not measure I/O times. Do each test a few hundred times. -- Richard Heathfield "Usenet is a strange place" - dmr 29/7/1999 http://www.cpax.org.uk email: rjh at above domain (but drop the www, obviously) |
|
|
|
#3 |
|
Posts: n/a
|
Weng Tianxiang <> wrote:
>Can I find a website that lists the time for sorting 1 million >64-bit/32-bit random data using the fastest software sorting algorithm Weng, Just about every "good" sorting algorithm has an execution time that scales roughly as N.log(N) with the number of items to be sorted. There are, of course, many tradeoffs about speed vs. space vs. hardware cost, but for any given algorithm that's not too far from right. A million data items will fit into main memory easily on a modern PC, so there's no need to worry about file I/O time. Donald Knuth's book "Sorting and Searching" is about as good as it gets on this topic. -- Jonathan Bromley, Consultant DOULOS - Developing Design Know-how VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK http://www.MYCOMPANY.com The contents of this message may contain personal views which are not the views of Doulos Ltd., unless specifically stated. |
|
|
|
#4 |
|
Posts: n/a
|
Weng Tianxiang schrieb:
> Hi, > Can I find a website that lists the time for sorting 1 million > 64-bit/32-bit random data using the fastest software sorting algorithm > in a PC? That depends on a lot of parameters, but I can give you one datapoint as an example: - Sorting 16777216 pair wise different 24-bit numbers takes no time at all. Kolja Sulimma |
|
|
|
#5 |
|
Posts: n/a
|
Hi,
Actually I read the related article in a website that lists the timing run by the author, but I couldn't find it again. For example, he lists times in parallel with number of data, the frequency of his computer and all related parameters that may affect the performance, he also lists the performances using different algorithms to do a comparison and made some conclusions. Thank you. Weng |
|
|
|
#6 |
|
Posts: n/a
|
"Weng Tianxiang" <> wrote in message
news: ups.com... > Hi, > Can I find a website that lists the time for sorting 1 million > 64-bit/32-bit random data using the fastest software sorting algorithm > in a PC? > > Thank you. > > Weng Since you posted on fpga and vhdl newsgroups in addition to programming, are you interested in an FPGA solution? Most of the information that one *can* find on the web deals with serial implementations. Parallel implementations for an FPGA such as the Bitonic sort can speed up the implementation significantly. |
|
|
|
#7 |
|
Posts: n/a
|
Weng Tianxiang wrote:
> Hi, > Can I find a website that lists the time for sorting 1 million > 64-bit/32-bit random data using the fastest software sorting algorithm > in a PC? > > Thank you. > > Weng A post to c.p, c.a.f, c.l.v for 3 different sets of answers. wikipedia is probably a good place to start for software only solution. Largely depends on overall memory performance since very little work is done for the cost of memory access and also what you really want to sort for, whether it is already mostly sorted or always random ordered and if you are only counting frequencies rather than true sorting. Research for quicksort, mergesort plus radix sort and counter sorts for the larger cases and bubble or insertion sort for the tiny cases. If you have Knuths 3 vol set that will help, the material has been repeated many times in every CS text. Most give you the answer in terms of ideal machines using big O notation. Knuths reference books were written in the 60s when machines ran below 1mips with memory in same ballpark as cpu, so no caches. For the PC case For very small sorts say upto many thousands of items, the entire sort can be done from L1, L2 caches and is likely to be a good test of the cpu & cache system. Bigger sorts can use these smaller sorts in succession with some sort of merging. 1 random million isn't even remotely going to take longer than a blink since L2 caches are big enough to hold a good fraction of the 1M dataset. In any nix system you already have a basic quicksort on the cmd line but IIRC its not known to be stable for data that is already nearly sorted. If you code up quicksort, you can randomize the order to guarantee quicksort sticks to O N log N time otherwise it can degrade badly. Mergesort always takes bigger O N log N time since it does more work but it doesn't degrade with data patterns. A quicksort with pre randomization probably comes out even. Now if computers still had a form of memory that had the performance of SRAM but the cost and size of DRAM, the answer would be much easier to predict and radix sort would come out handily as that sort time follows the no of partitions performed on the word * no of items. Most radix sorts will use byte sorting so that would mean 4 passes of byte testing and writing items to 1 of 256 respective buckets which can get very large, esp if the data is mostly the same. If you have enough of this flat memory to hold the data on input and output, then you can do it in 2 passes with 65536 buckets, this will break modern PC caches though. Since most PCs don't have SRAM like memory they simulate that with cache hierarchy the answer gets more complicated to predict. Fully random uncached accesses into PC main memory are surprisingly slow, towards several 100ns. Thats called a Memory wall. If you sort for N random ordered items and sweep N from very small to large to enormous, the quicksort and others follow O N log N time but with abrupt steps in O that increase as the hardware works ever harder to simulate flat memory access times. IIRC O can vary over over a factor of 10. O isn't anywhere near constant except on ideal machines with something like RLDRAM or infinitely big caches. I believe that mergesort will have a much more constant O. For the FPGA case. Each compare and swap could be done in an atomic operation needing 2 reads and possibly 2 conditional writes in merge sort. WIth a dual port memory that might take 1 or 2 clocks in a pipeline fashion for sorts that fit into Blockram. Now if the data is spread over many sort engines, the parallelism may make up for the much lower clock. I might favor radix 3 sort since you could effectively use BlockRams as temporary buckets. Basically stream all the DRAM values through to 2048 distinct buckets depending in turn on upper, middle, lower 11b fields. As each bucket fills, save it back to DRAM and reset to empty. Same can be done slower with radix 4 with 256 buckets or even faster with radix 2 with 65536 buckets on a really big FPGA. The sort time will approach 6M (or resp 8M, 4M) memory cycles at the fastest possible rate for just the keys. Some complexity comes in managing the buckets in the output stream into seemingly contiguous buckets. If you are just doing counter sort of values limited to x values, things get so much simpler, just funnel a memory read stream through x counters. Does x have to be 2^32? John Jakson transputer guy |
|
|
|
#8 |
|
Posts: n/a
|
Weng Tianxiang wrote:
> Hi, > Can I find a website that lists the time for sorting 1 million > 64-bit/32-bit random data using the fastest software sorting algorithm > in a PC? > > Thank you. > > Weng > About a half second on my machine. Is that what you really wanted to know? -- Joe Wright "Everything should be made as simple as possible, but not simpler." --- Albert Einstein --- |
|
|
|
#9 |
|
Posts: n/a
|
Hi,
What I really want to know is that the following formula for best sorting timing is correct (it is copied from Knuth's "Sorting and Searching") and it is not too far away from the PC computer reality: 14.5* N * (lg N). There are about 20 algorithms, and I reall don't know what formula should be selected as a representative for best software algorithm running for 1 million random data. 14.5* N * (lg N) is likely the best one. Thank you. Weng |
|
|
|
#10 |
|
Posts: n/a
|
"Weng Tianxiang" <> wrote in message news: ups.com... > Hi, > Can I find a website that lists the time for sorting 1 million > 64-bit/32-bit random data using the fastest software sorting algorithm > in a PC? Here's a ballpark figure: Let n = 1000000 n*log2(n) / 3Ghz = 6.64385619 milliseconds - Oliver |
|