From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Ross J(dot) Reedstrom" <reedstrm(at)rice(dot)edu> |
Cc: | Andrew Perrin <clists(at)perrin(dot)socsci(dot)unc(dot)edu>, Bruno Wolff III <bruno(at)wolff(dot)to>, Tim Lynch <admin+pgsqladmin(at)thirdage(dot)com>, pgsql-admin(at)postgresql(dot)org |
Subject: | Re: OT: seeking query help, where? |
Date: | 2003-01-17 16:24:19 |
Message-ID: | 12454.1042820659@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
"Ross J. Reedstrom" <reedstrm(at)rice(dot)edu> writes:
> On Fri, 17 Jan 2003, Bruno Wolff III wrote:
>> The union operator already removes duplicates.
> Right, which means everyone's UNION queries have too many DISTINCTs
> in them:
Relevant to this thread: if you don't want duplicate removal, write
UNION ALL. This is considerably cheaper than UNION, so it's a good
thing to keep in mind.
Examples:
regression=# explain analyze select unique1 from tenk1 union all select unique2 from tenk1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..916.00 rows=20000 width=4) (actual time=0.20..1416.60 rows=20000 loops=1)
-> Subquery Scan "*SELECT* 1" (cost=0.00..458.00 rows=10000 width=4) (actual time=0.18..549.34 rows=10000 loops=1)
-> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (actual time=0.15..279.58 rows=10000 loops=1)
-> Subquery Scan "*SELECT* 2" (cost=0.00..458.00 rows=10000 width=4) (actual time=0.14..548.90 rows=10000 loops=1)
-> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (actual time=0.11..278.33 rows=10000 loops=1)
Total runtime: 1570.02 msec
(6 rows)
regression=# explain analyze select unique1 from tenk1 union select unique2 from tenk1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=2344.77..2444.77 rows=20000 width=4) (actual time=2881.85..3477.51 rows=10000 loops=1)
-> Sort (cost=2344.77..2394.77 rows=20000 width=4) (actual time=2881.82..3169.29 rows=20000 loops=1)
Sort Key: unique1
-> Append (cost=0.00..916.00 rows=20000 width=4) (actual time=0.21..1590.55 rows=20000 loops=1)
-> Subquery Scan "*SELECT* 1" (cost=0.00..458.00 rows=10000 width=4) (actual time=0.19..570.31 rows=10000 loops=1)
-> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (actual time=0.15..293.18 rows=10000 loops=1)
-> Subquery Scan "*SELECT* 2" (cost=0.00..458.00 rows=10000 width=4) (actual time=0.15..698.43 rows=10000 loops=1)
-> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (actual time=0.12..419.75 rows=10000 loops=1)
Total runtime: 3574.98 msec
(9 rows)
The sort-and-unique phases are what implement duplicate removal, and as
you can see they add a good deal to the cost of the query.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Adam Witney | 2003-01-17 16:29:37 | Installing on Sun machine |
Previous Message | Ross J. Reedstrom | 2003-01-17 15:54:50 | Re: OT: seeking query help, where? |