Re: Why do we still perform a check for pre-sorted input within qsort variants?

From: Greg Stark <stark(at)mit(dot)edu>
To: Dann Corbit <DCorbit(at)connx(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why do we still perform a check for pre-sorted input within qsort variants?
Date: 2013-03-10 01:15:44
Message-ID: CAM-w4HN9w+fD=91WcVOaSHjpM8uf_PO9Xmu3CM+3vPhmovjgSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 9, 2013 at 10:22 PM, Dann Corbit <DCorbit(at)connx(dot)com> wrote:
> Yes, you are right. I knew of a median of medians technique for pivot selection and I mistook that for the median of medians median selection algorithm (which it definitely isn't).
> I was not aware of a true linear time selection of the median algorithm {which is what median of medians accomplishes). The fastest median selection algorithm that I was aware of was quickselect, which is only linear on average.
> I think that you analysis is correct, at any rate.

Hm, I was using the terminology differently than the Wikipedia page. I
was referring to the recursive median of 5s used as the pivot
selection as "median of medians". And I still called Quicksort or
Quickselect using that pivot Quicksort or Quickselect with that
specific pivot choice algorithm.

When using that pivot choice Quicksort is O(n*log(n)) and Quickselect
(Median of Medians on Wikipedia) is O(n). But the constant factor
becomes larger than if the pivot choice algorithm is O(1). I suppose
it's more interesting in the case of Quickselect since there's no
other alternative algorithms that could be used that have better
constant factors whereas for sorting we have other options.

I wonder if it makes sense to use timsort as the fallback for
quicksort if the partition sizes are skewed. Timsort is specifically
designed to handle presorted inputs well. On the other hand it is
itself a hybrid sort so it might be getting overly complex to make it
part of a hybrid algorithm.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dann Corbit 2013-03-10 01:45:43 Re: Why do we still perform a check for pre-sorted input within qsort variants?
Previous Message Thom Brown 2013-03-10 00:21:04 Re: Call for Google Summer of Code mentors, admins