From: | Matthew Wakeling <matthew(at)flymine(dot)org> |
---|---|
To: | pgsql-performance <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Index creation time and distribution |
Date: | 2008-05-22 14:10:06 |
Message-ID: | Pine.LNX.4.64.0805221500310.16756@aragorn.flymine.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Thu, 22 May 2008, Tom Lane wrote:
> Do you have maintenance_work_mem set large enough that the index
> creation sort is done in-memory? 8.1 depends on the platform's qsort
> and a lot of them are kinda pessimal for input like this.
Looking at the fact that other indexes on the same table are created
quickly, it seems that the maintenance_work_mem isn't the issue - the sort
algorithm is.
Having lots of elements the same value is a worst-case-scenario for a
naive quicksort. I am in the middle of testing sorting algorithms for a
performance lecture I'm going to give, and one of the best algorithms I
have seen yet is that used in Java's java.util.Arrays.sort(). I haven't
been able to beat it with any other comparison sort yet (although I have
beaten it with a bucket sort, but I wouldn't recommend such an algorithm
for a database).
From the JavaDoc:
> The sorting algorithm is a tuned quicksort, adapted from Jon L. Bentley
> and M. Douglas McIlroy's "Engineering a Sort Function",
> Software-Practice and Experience, Vol. 23(11) P. 1249-1265 (November
> 1993). This algorithm offers n*log(n) performance on many data sets that
> cause other quicksorts to degrade to quadratic performance.
Matthew
--
First law of computing: Anything can go wro
sig: Segmentation fault. core dumped.
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Marlowe | 2008-05-22 16:50:33 | Re: Index creation time and distribution |
Previous Message | Guillaume Smet | 2008-05-22 13:38:25 | Re: Index creation time and distribution |