From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT |
Date: | 2014-01-22 03:53:27 |
Message-ID: | 26684.1390362807@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> writes:
> A rough summary of the patch follows:
> - a GUC variable enables or disables this capability
> - in nodeAgg.c, eliding duplicate tuples is enabled if the number of
> distinct columns is equal to the number of sort columns (and both are
> greater than zero).
> - in createplan.c, eliding duplicate tuples is enabled if we are
> creating a unique plan which involves sorting first
> - ditto planner.c
> - all of the remaining changes are in tuplesort.c, and consist of:
> + a new macro, DISCARDTUP, and a new structure member, discardtup, are
> both defined and operate similarly to COMPARETUP, COPYTUP, etc...
> + in puttuple_common, when state is TSS_BUILDRUNS, we *may* simply
> throw out the new tuple if it compares as identical to the tuple at
> the top of the heap. Since we're already performing this comparison,
> this is essentially free.
> + in mergeonerun, we may discard a tuple if it compares as identical
> to the *last written tuple*. This is a comparison that did not take
> place before, so it's not free, but it saves a write I/O.
> + We perform the same logic in dumptuples
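
For readers following along, the merge-time step in the summary above can be sketched roughly as follows. This is a toy model only, not the patch: tuples are modeled as plain ints, and `merge_runs_elide` is a hypothetical name standing in for the logic described for mergeonerun. The one extra comparison against the last written value is the part that "did not take place before".

```c
#include <stddef.h>

/*
 * Hypothetical illustration: merge two sorted runs into out[], skipping
 * any value equal to the last value written. Returns the number of values
 * written. out[] must have room for na + nb entries.
 */
static size_t
merge_runs_elide(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, n = 0;

    while (i < na || j < nb)
    {
        int next;

        /* Ordinary two-way merge step: take the smaller head element. */
        if (j >= nb || (i < na && a[i] <= b[j]))
            next = a[i++];
        else
            next = b[j++];

        /* Extra comparison: costs CPU, but saves writing a duplicate. */
        if (n > 0 && out[n - 1] == next)
            continue;

        out[n++] = next;
    }
    return n;
}
```

Merging the runs {1, 2, 2, 5} and {2, 3, 5, 5} this way writes four values (1, 2, 3, 5) instead of eight.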
[ raised eyebrow ... ] And what happens if the planner drops the
unique step and then the sort doesn't actually go to disk?
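
One way to read the concern, as a toy model under stated assumptions: if duplicates are only elided on the external-sort path, then whether the output is distinct depends on whether the input happened to spill. The sketch below is not tuplesort.c; `toy_sort` and `mem_slots` (a stand-in for a memory budget such as work_mem) are hypothetical names, and the "external" path is simulated in memory.

```c
#include <stdlib.h>
#include <stddef.h>

static int
cmp_int(const void *x, const void *y)
{
    int a = *(const int *) x;
    int b = *(const int *) y;

    return (a > b) - (a < b);
}

/*
 * Hypothetical toy model: sort vals[] in place and return the number of
 * output values. Duplicates are dropped only when the input exceeds
 * mem_slots, i.e. only on the simulated "goes to disk" path.
 */
static size_t
toy_sort(int *vals, size_t n, size_t mem_slots)
{
    size_t i, out = 0;

    qsort(vals, n, sizeof(int), cmp_int);

    if (n <= mem_slots)
        return n;               /* in-memory path: nothing is elided */

    for (i = 0; i < n; i++)     /* simulated external path: elide dups */
    {
        if (out == 0 || vals[out - 1] != vals[i])
            vals[out++] = vals[i];
    }
    return out;
}
```

With input {3, 1, 3, 2}, a budget of 8 slots returns all four values, duplicate included, while a budget of 2 slots returns three. A plan that dropped the unique step would be correct only in the second case.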
regards, tom lane