From: | Gregory Stark <stark(at)enterprisedb(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Limited Sort |
Date: | 2006-09-18 13:26:31 |
Message-ID: | 87fyepuwzc.fsf@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
So I have a quick prototype of this and in fact it handles the common use case
of a paged web page sorted on non-indexed columns very well. If you have only
a small limit like most web pages often avoids external sorts and produces
results 10-20x faster or more. Obviously by raising the size of the input data
set and lowering the limit you can get an arbitrarily large speedup.
The changes in tuplesort are really minimal. I refactored heapify since both
the regular external sort and the bounded sort needed it, but if it hadn't
been for that I think only about 25 lines would actually be added to tuplesort
and only a handful changed.
The part that's still bugging me is how nodeLimit communicates to nodeSort.
What I've done for now is below. It bothers me that nodeLimit is peeking into
nodeSort though. In particular it seems short sighted since nodeSort is far
from the only node that could benefit from this information. If there's a
HashAggregate for example it could stop creating new hash entries when the
limit is reached. So it seems worthwhile having some sort of abstract API
where other nodes could handle the information if they want without having to
teach nodeLimit about every other node type that can.
Index: src/backend/executor/nodeLimit.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeLimit.c,v
retrieving revision 1.27
diff -c -r1.27 nodeLimit.c
*** src/backend/executor/nodeLimit.c 26 Jul 2006 19:31:50 -0000 1.27
--- src/backend/executor/nodeLimit.c 14 Sep 2006 14:53:39 -0000
***************
*** 270,275 ****
--- 270,295 ----
/* Reset position to start-of-scan */
node->position = 0;
node->subSlot = NULL;
+
+ /* This is a bit of a kluge
+ *
+ * I'm not aware of any abstract way to communicate between the two nodes.
+ * Perhaps I should add a new method in nodeSort like ExecAdviseSort and
+ * have a dispatcher in ExecNode that only has one branch. It isn't likely
+ * to be abstract enough to handle other forms of "advice" though.
+ */
+ if (!node->noCount && (IsA(outerPlanState(node), SortState)))
+ {
+ SortState *sortState = (SortState*)outerPlanState(node);
+ int64 tuples_needed = node->count + node->offset;
+
+ /* check for overflow */
+ if (tuples_needed < 0)
+ return;
+
+ sortState->bounded = true;
+ sortState->bound = tuples_needed;
+ }
}
I also think it might be easy to do what Martin suggested and trim the tapes
to the limit as well. I can't see a big use case for this since you would have
to have an extremely large input set to make it kick in before the final merge
but there's no reason not to do it either. I don't think it adds any
significant complexity.
And lastly I find the idea of putting attention into OLAP functionality
interesting. Does anyone have any ideas on where to start with that?
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2006-09-18 13:30:43 | Re: Interesting CREATE TABLE AS misbehavior |
Previous Message | Peter Eisentraut | 2006-09-18 12:59:17 | Re: 8.2 beta blockers |