
# Re: How to arrange some numbers

Ian Collins

 03-16-2013
David Brown wrote:
>
> Insert sort is not "as simple /and/ better" than bubblesort. To do an
> efficient insertion sort (and it has to be efficient, or it is not
> "better"), you need to use a binary search to find the insertion point
> for each element, and you need to hold your elements in a linked list
> rather than a static array (otherwise you need to move half your data
> for every insertion, which is not "better").

That received wisdom may not be true on modern hardware.

The cost of searching and moving elements will probably be swamped by
the cost of filling caches. So in practice, searching and shuffling
data in an array will be significantly faster than doing the same with a
"more efficient" disjoint structure like a linked list. Locality of
reference trumps elegant theory every time!

--
Ian Collins

Nobody

 03-16-2013
On Sat, 16 Mar 2013 13:43:17 +0000, glen herrmannsfeldt wrote:

> There is some discussion about Radix sort, and whether it is really O(N).
> It is for fixed word size, but O() is supposed to describe the limit as N
> goes to infinity.

Radix sort is O(n.log(N)), where n is the number of elements in the list
and N is the number of elements in the set from which the list elements
are drawn (i.e. log(N) is a measure of the size of an element).

Whether radix sort is O(n) boils down to whether you treat N as a constant.

E.g. you could write a function to radix sort strings, with the (worst
case) number of passes proportional to the length of the longest string.
Such a function would be O(m.n) where m is the length of the longest
string, i.e. O(n.log(N)) where N = (2^m) = the cardinality of the set of
strings of length m.

Öö Tiib

 03-16-2013
On Saturday, 16 March 2013 16:40:23 UTC+2, David Brown wrote:
> Insert sort is not "as simple /and/ better" than bubblesort.

Nonsense; insertion sort *is* bubble sort with the pointless sequential
swaps replaced by an insert that is both cheaper and shorter.

> To do an efficient insertion sort (and it has to be efficient, or it is not
> "better"), you need to use a binary search to find the insertion point
> for each element, and you need to hold your elements in a linked list
> rather than a static array (otherwise you need to move half your data
> for every insertion, which is not "better").

You add complexity that does not win you much here. Always think and
measure. A linked list is the least efficient choice. If you go for a
dynamic structure, take a red-black tree.

glen herrmannsfeldt

 03-16-2013
David Brown <(E-Mail Removed)> wrote:

(snip, someone wrote)
>> but insert sort is as simple *and* better

> Insert sort is not "as simple /and/ better" than bubblesort. To do an
> efficient insertion sort (and it has to be efficient, or it is not
> "better"), you need to use a binary search to find the insertion point
> for each element, and you need to hold your elements in a linked list
> rather than a static array (otherwise you need to move half your data
> for every insertion, which is not "better").

I haven't looked at the details recently, but the faster implementations
of quicksort don't sort subarrays smaller than a given (small) size,
then do an insertion sort on the whole array at the end. Seems to me
that if you do a linear search starting at the end, it is fast for a
nearly sorted array.

> Of course, this is for a general sort - for some kinds of applications,
> insertion sort is very well suited.

For an almost sorted array, both bubble sort and (some implementations
of) insertion sort are fast.

> Unless you meant that insert sort is as simple and better than bogosort...

-- glen

glen herrmannsfeldt

 03-16-2013
Eric Sosman <(E-Mail Removed)> wrote:

(snip)

>> for(i=1; i<N; i++)
>>     for(j=i; j>0; j--)
>>     {
>>         if (a[j-1] > a[j])
>>             swap(a[j-1], a[j]);
>>         else
>>             break;
>>     }

> A simple change will speed this up significantly on most
> machines. The problem is with the swap() operation: It will
> exchange (for example) items [10] and [9], then [9] and [8],
> then [8] and [7] -- six memory writes to slide the original
> [10] element three places to the left. By holding the "moving"
> element in a temporary throughout, you can avoid almost half
> the writes:

Yes, but the cache might also figure that out. Or it might not.

It is usual to study sort algorithms based on comparisons
and stores, but that might not accurately indicate the time
on real hardware. Much sort research was done before caching
was common in processors, and even now caching isn't easy to
take into account.

-- glen

glen herrmannsfeldt

 03-16-2013
pete <(E-Mail Removed)> wrote:
> David Brown wrote:

(snip)
>> Insert sort is not "as simple /and/ better" than bubblesort. To do an
>> efficient insertion sort (and it has to be efficient, or it is not
>> "better"), you need to use a binary search to find the insertion point
>> for each element, and you need to hold your elements in a linked list
>> rather than a static array (otherwise you need to move half your data
>> for every insertion, which is not "better").

> No.
> For insertion sort to be better than bubble sort,
> it merely needs to be better than bubble sort in one way,
> and not worse than bubble sort in any way.

But you have to carefully define "way".

> Insertion sort does fewer data comparisons than bubble sort in the
> average case. If insertion sort is implemented with a temporary data
> variable, it can do less data movement than bubble sort. If it is
> implemented as a swapping algorithm, it does the same amount of data
> movement as bubble sort.

You are assuming that comparisons and stores are the only
things that count.

Consider a highly parallel system with N (very simple) processors
that can, on each clock cycle, compare two elements, store one, and
move one. I believe you will find that the compare-and-swap pattern
of bubble sort lends itself to highly parallel sorting better than
insertion sort does.

> For data distributions when the running time
> of insertion sort varies with (N*N),
> the running time of binary insertion sort
> also varies with (N*N),
> because both algorithms
> do about the same amount of data movement.

But bubble sort has run time O(N) with
O(N) processing units and neighbor-to-neighbor
communication. I am not so sure about insertion
sort.

> There are certain data distributions for which the running time of
> insertion sort varies directly with N, where N is the number of
> elements in the array, while the running time of binary insertion
> sort can never do better than to vary with N*log(N).

But again, it depends on the initial data ordering and
also the hardware available.

-- glen

Stephen Sprunk

 03-16-2013
On 16-Mar-13 14:41, Eric Sosman wrote:
> On 3/16/2013 2:39 PM, Robert Wessel wrote:
>> No, ordinary insertion sort requires neither of those, and can sort
>> an array with no additional storage, just like bubble sort. The
>> basic idea is that you take each element and then move it back one
>> spot at a time in the array (by swapping with the prior element)
>> until it's in the correct spot. It's still O(n**2), but usually
>> runs about twice as fast as bubble sort, and the code is usually
>> shorter too.

>
> A simple change will speed this up significantly on most machines.
> The problem is with the swap() operation: It will exchange (for
> example) items [10] and [9], then [9] and [8], then [8] and [7] --
> six memory writes to slide the original [10] element three places to
> the left. By holding the "moving" element in a temporary throughout,
> you can avoid almost half the writes:

Writes are effectively free, and have been for a long time. Reads are
what kill performance, but an insertion sort (like a bubble sort)
exhibits sufficient spatial locality to be exploited by even the dumbest
cache predictor.

OTOH, if the data set doesn't fit entirely in L1 cache, you're better
off using a faster algorithm anyway for O(N^2) reasons.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

glen herrmannsfeldt

 03-16-2013
Nobody <(E-Mail Removed)> wrote:
> On Sat, 16 Mar 2013 13:43:17 +0000, glen herrmannsfeldt wrote:

>> There is some discussion about Radix sort, and whether it is really O(N).
>> It is for fixed word size, but O() is supposed to describe the limit as N
>> goes to infinity.

> Radix sort is O(n.log(N)), where n is the number of elements in the list
> and N is the number of elements in the set from which the list elements
> are drawn (i.e. log(N) is a measure of the size of an element).

> Whether radix sort is O(n) boils down to whether you treat N as a constant.

Yes. Some people believe that N should increase in proportion to n.
In practical cases, that rarely happens.

As far as I know, most radix sorting was done on punched-card
sorting machines.

As cards have a fixed width, there is a limit on N, while there is no
well-defined limit on n. (Possibly MTBF, but even then it is easy to
restart a sort after a failure.)

> E.g. you could write a function to radix sort strings, with the (worst
> case) number of passes proportional to the length of the longest string.
> Such a function would be O(m.n) where m is the length of the longest
> string, i.e. O(n.log(N)) where N = (2^m) = the cardinality of the set of
> strings of length m.

-- glen

Eric Sosman

 03-17-2013
On 3/16/2013 7:21 PM, glen herrmannsfeldt wrote:
> Eric Sosman <(E-Mail Removed)> wrote:
>
> (snip)
>
>>> for(i=1; i<N; i++)
>>>     for(j=i; j>0; j--)
>>>     {
>>>         if (a[j-1] > a[j])
>>>             swap(a[j-1], a[j]);
>>>         else
>>>             break;
>>>     }

>
>> A simple change will speed this up significantly on most
>> machines. The problem is with the swap() operation: It will
>> exchange (for example) items [10] and [9], then [9] and [8],
>> then [8] and [7] -- six memory writes to slide the original
>> [10] element three places to the left. By holding the "moving"
>> element in a temporary throughout, you can avoid almost half
>> the writes:

>
> Yes, but the cache might also figure that out. Or it might not.
>
> It is usual to study sort algorithms based on comparisons
> and stores, but that might not accurately indicate the time
> on real hardware. Much sort research was done before caching
> was common in processors, and even now caching isn't easy to
> take into account.

That's why nice academic treatises on the asymptotic behavior
of this or that algorithm are not as useful as their authors
would have you believe.

People count comparisons (or swaps) because it's easy, not
because it's relevant. As you say, it used to be relevant, maybe
twenty or twenty-five years ago -- but if you're using such counts
to make performance decisions today, well, you're in the wrong
century. "Memory is the new disk," says Cliff Click, pointing out
that cache misses have supplanted page faults as the performance
bugaboos of modern systems. As far as I know (which ain't all *that*
far), modern computer scientists have not yet figured out how to
treat the Memory Gotcha mathematically. Not satisfactorily, anyhow.

Meanwhile, what's a practical programmer to do? You can write
N different sort routines and measure their speed on the target
hardware, but it'll turn out tomorrow's hardware favors a different
implementation than today's. That's a losing game: "It takes all
the running you can do, to keep in the same place." An approach
with less certainty but (perhaps) greater hope is to try to put the
least pressure on those parts of the system you imagine may be prone
to slowness under stress. In the case at hand, we can hope that a
cache (or caches, or write buffers, or magic) will deal with what
looks like a lot of memory writes, or we can make fewer of them to
begin with. Although there's no certainty, the latter approach seems
to my mind to be less likely to come a cropper.

--
Eric Sosman
(E-Mail Removed)

Eric Sosman

 03-17-2013
On 3/16/2013 7:11 PM, glen herrmannsfeldt wrote:
> David Brown <(E-Mail Removed)> wrote:
>
> (snip, someone wrote)
>>> but insert sort is as simple *and* better

>
>> Insert sort is not "as simple /and/ better" than bubblesort. To do an
>> efficient insertion sort (and it has to be efficient, or it is not
>> "better"), you need to use a binary search to find the insertion point
>> for each element, and you need to hold your elements in a linked list
>> rather than a static array (otherwise you need to move half your data
>> for every insertion, which is not "better").

>
> I haven't looked at the details recently, but the faster implementations
> of quicksort don't sort subarrays smaller than a given (small) size,
> then do an insertion sort on the whole array at the end. [...]

That was once true, but apparently is not true any more on many
modern machines. Sorting the small sub-arrays as they appear instead
of making one Grand Pass at the end exploits the fact that the arrays'
data is likely to be in a close-to-the-CPU cache; leaving everything
for a Final Sweep virtually guarantees that much of the data will have
been evicted from cache before the Sweep touches it.

--
Eric Sosman
(E-Mail Removed)