Alan Krueger <(E-Mail Removed)> wrote:

> John C. Bollinger wrote:

> > TreeSet has to maintain its internal order, including keeping its tree

> > balanced, with every addition. This can be tremendously slower than

> > putting the same objects in a List and sorting them once, and we have

> > seen it to be so, given enough elements in the set.

>

> Not sure how this can be true. TreeSet Javadoc says, "This

> implementation provides guaranteed log(n) time cost for the basic

> operations (add, remove and contains)." This implies O(N log N) time to

> insert N items. Collections.sort Javadoc says, "This algorithm offers

> guaranteed n log(n) performance."
The difference is in the definition of "n". The time to perform some

set of add/remove/contains operations on a TreeSet would actually be on

the order of m * log(n), where 'm' is the total number of operations and

n is the average number of elements in the TreeSet. If all operations

are add operations as in your example, then this turns out to be the

same as n * log(n), but only because we can demonstrate that m == n / 2.

However, if m is not on the order of n, then the fact doesn't hold.

Furthermore, in reality, the number of operations on a long-lived data

structure is rarely on the order of the average size, but is instead

much greater. If the average size of the tree is on the order of the

square root of the number of operations, then the total time of the task

becomes O(n^2 * log(n)), which is looking considerably worse. If the

number of ordered traversals is constant or otherwise doesn't scale with

the size of the problem, then an explicit sort per traversal is still

only O(n * log(n)), and the TreeSet is looking pretty bad by comparison.

> Doubtful, since adding an item to a TreeSet would take O(log N) time;

> adding to a sorted list would be no faster and the add-and-resort for an

> unsorted list would certainly be slower.
Adding to a sorted list is definitely O(log(n)), but adding to an

unsorted list can be constant-time. Sorting at need can then be cheaper

than working with the SortedSet data structure, as explained above.

--

www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer

MindIQ Corporation