From: | "Dann Corbit" <DCorbit(at)connx(dot)com> |
---|---|
To: | "Josh Burdick" <jburdick(at)gradient(dot)cis(dot)upenn(dot)edu> |
Cc: | <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: finding medians |
Date: | 2002-05-30 20:48:28 |
Message-ID: | D90A5A6C612A39408103E6ECDD77B82906F460@voyager.corporate.connx.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
ACK! Sorting to find a median is criminal.
"Introduction to Algorithms" by Thomas H. Cormen, Charles E. Leiserson,
Ronald L. Rivest
ISBN: 0262031418
explains the better algorithm very well.
Here is a freely available C++ template (written by me) for a bunch of
statistics (everything *but* the selection problem):
ftp://cap.connx.com/pub/tournament_software/STATS.HPP
It uses this template for improved summation accuracy:
ftp://cap.connx.com/pub/tournament_software/Kahan.Hpp
Here is an outline for selection. I wrote it in C++, but a rewrite to C
is trivial:
// Quickselect: find Kth smallest of first N items in array A
// recursive routine finds Kth smallest in A[Low..High]
// Etype: must have copy constructor, oeprator=, and operator<
// Nonrecursive driver is omitted.
template < class Etype >
void
QuickSelect (Etype A[], int Low, int High, int k)
{
if (Low + Cutoff > High)
InsertionSort (&A[Low], High - Low + 1);
else
{
// Sort Low, Middle, High
int Middle = (Low + High) / 2;
if (A[Middle] < A[Low])
Swap (A[Low], A[Middle]);
if (A[High] < A[Low])
Swap (A[Low], A[High]);
if (A[High] < A[Middle])
Swap (A[Middle], A[High]);
// Place pivot at Position High-1
Etype Pivot = A[Middle];
Swap (A[Middle], A[High - 1]);
// Begin partitioning
int i, j;
for (i = Low, j = High - 1;;)
{
while (A[++i] < Pivot);
while (Pivot < A[--j]);
if (i < j)
Swap (A[i], A[j]);
else
break;
}
// Restore pivot
Swap (A[i], A[High - 1]);
// Recurse: only this part changes
if (k < i)
QuickSelect (A, Low, i - 1, k);
else if (k > i)
QuickSelect (A, i + 1, High, k);
}
}
template < class Etype >
void
QuickSelect (Etype A[], int N, int k)
{
QuickSelect (A, 0, N - 1, k - 1);
}
If you want to use this stuff to improve statistics with vacuum, be my
guest.
From | Date | Subject | |
---|---|---|---|
Next Message | Dann Corbit | 2002-05-30 21:03:26 | Re: finding medians |
Previous Message | Josh Berkus | 2002-05-30 20:30:09 | Re: finding medians |