From: | Anuradha Ratnaweera <ARatnaweera(at)virtusa(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Superfluous merge/sort |
Date: | 2003-02-25 04:11:43 |
Message-ID: | 20030225041143.GA4886@aratnaweera.virtusa.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Question in brief: does the planner/optimizer take into account the
foreign key constraints?
If the answer is "no", please stop reading here.
Here comes the details. Let me give a simple example to illustrate the
situation.
1. We have two tables t1 and t2.
create table t1 (
id integer primary key,
dummy integer
);
create table t2 (
id integer,
dummy integer
);
2. We create indexes on all the non-pkey fields.
create index t1_dummy_idx on t1(dummy);
create index t2_id_idx on t2(id);
create index t2_dummy_idx on t2(dummy);
3. We make t2(id) a foreign key of t1(id).
alter table t2 add constraint t2_fkey foreign key (id) references t1(id);
4. Populate "t1" with unique "id"s from 0 to 19999 with a dummy value.
copy "t1" from stdin;
0 654
1 86097
...
19998 93716
19999 9106
\.
5. Populate "t2" with 50000 "id"s with a normal distribution.
copy "t2" from stdin;
8017 98659
11825 5946
...
8202 35994
8436 19729
\.
Now we are ready to go ...
- First query is to find the "dummy" values with highest frequency.
=> explain select dummy from t2 group by dummy order by count(*) desc limit 10;
NOTICE: QUERY PLAN:
Limit (cost=2303.19..2303.19 rows=10 width=4)
-> Sort (cost=2303.19..2303.19 rows=5000 width=4)
-> Aggregate (cost=0.00..1996.00 rows=5000 width=4)
-> Group (cost=0.00..1871.00 rows=50000 width=4)
-> Index Scan using t2_dummy_idx on t2 (cost=0.00..1746.00 rows=50000 width=4)
EXPLAIN
- Second query is esseitially the same, but we do a merge with "t1" on
the foreign key (just for the sake of illustrating the point).
=> explain select t2.dummy from t1, t2 where t1.id = t2.id group by t2.dummy order by count(*) desc limit 10;
NOTICE: QUERY PLAN:
Limit (cost=7643.60..7643.60 rows=10 width=12)
-> Sort (cost=7643.60..7643.60 rows=5000 width=12)
-> Aggregate (cost=7086.41..7336.41 rows=5000 width=12)
-> Group (cost=7086.41..7211.41 rows=50000 width=12)
-> Sort (cost=7086.41..7086.41 rows=50000 width=12)
-> Merge Join (cost=0.00..3184.00 rows=50000 width=12)
-> Index Scan using t1_pkey on t1 (cost=0.00..638.00 rows=20000 width=4)
-> Index Scan using t2_id_idx on t2 (cost=0.00..1746.00 rows=50000 width=8)
EXPLAIN
Does this mean that the planner/optimizer doesn't take into account the
foreign key constraint?
Anuradha
--
Debian GNU/Linux (kernel 2.4.21-pre4)
Don't worry about the world coming to an end today. It's already tomorrow
in Australia.
-- Charles Schulz
From | Date | Subject | |
---|---|---|---|
Next Message | Mark Halliwell | 2003-02-25 08:03:40 | Query not using the index |
Previous Message | Tom Lane | 2003-02-24 20:50:35 | Re: slow query |