From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Dann Corbit" <DCorbit(at)connx(dot)com> |
Cc: | "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Jerry Sievers" <jerry(at)jerrysievers(dot)com> |
Subject: | Re: qsort, once again |
Date: | 2006-03-17 05:27:05 |
Message-ID: | 25294.1142573225@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
"Dann Corbit" <DCorbit(at)connx(dot)com> writes:
>> So my feeling is we should just remove the swap_cnt code and return
>> to the original B&M algorithm.
> Even if "hunks" of the input are sorted, the test is a very good idea.
Yah know, guys, Bentley and McIlroy are each smarter than any five of
us, and I'm quite certain it occurred to them to try prechecking for
sorted input. If that case is not in their code then it's probably
because it's a net loss. Unless you have reason to think that sorted
input is *more* common than other cases for the Postgres environment,
which is certainly a fact not in evidence.
(Bentley was my thesis adviser for awhile before he went to Bell Labs,
so my respect for him is based on direct personal experience. McIlroy
I only know by reputation, but he's sure got a ton of that.)
> Imagine also a table that was clustered but for which we have not
> updated statistics. Perhaps it is 98% sorted. Checking for order in
> our partitions is probably a good idea.
If we are using the sort code rather than the recently-clustered index
for such a case, then we have problems elsewhere. This scenario is not
a good argument that the sort code needs to be specialized to handle
this case at the expense of other cases; the place to be fixing it is
the planner or the statistics-management code.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Dann Corbit | 2006-03-17 05:28:52 | Re: qsort, once again |
Previous Message | Qingqing Zhou | 2006-03-17 01:51:21 | Re: Separate BLCKSZ for data and logging |