Why is Quicksort that quick?

Quicksort has a reputation for speed. In school, you learn that it is O(n log n) … the optimum for all single-threaded, comparison-based sorts. In fact, when combined with median-of-three pivot selection, tiny-partition insertion-sort, and special equal-keys handling (qv Top 3 Quicksort optimizations) it is faster than its closest comparison-based competitors: merge sort and heap sort. (That doesn’t apply to near-sorted input, but that’s another story). Timsort makes the best of merge sort, with similar extensions, to cover more edge cases than the above quicksort add-ons.

But it puzzled me that Quicksort was often much faster than, say, a heap-with-hole sort — an optimization of heap sort. Then, a data-oriented-design concept rang a bell: real-world keys are often complex data structures: pointers to key-parts scattered across the heap. Getting all the parts from RAM to cache is a dominating cost. And here are some pertinent features of Quicksort:

  • all comparisons are against one (pivot) key. No matter how scattered the pivot key parts are (in RAM) it is guaranteed they are in L1 cache.
  • as Quicksort recurses into smaller and smaller partitions, it eventually reaches one small enough that all scattered key parts are in cache. From there on down, every child recursion will be using keys completely in cache. This is repeatedly true as Quicksort works its way down through the sizes of L3/L2/L1 cache, then the bottom-level tiny insertion sorts.

Next up: how radix sort can beat Quicksort on its home (cache) turf, for real-world (complex) keys.

Advertisements

About mischasan

I've had the privilege to work in a field where abstract thinking has concrete value. That applies at the macro level --- optimizing actions on terabyte database --- or the micro level --- fast parallel string searches in memory. You can find my documents on production-system radix sort (NOT just for academics!) and some neat little tricks for developers, on my blog https://mischasan.wordpress.com My e-mail sig (since 1976): Engineers think equations approximate reality. Physicists think reality approximates the equations. Mathematicians never make the connection.
This entry was posted in algorithm, sort, Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s