From: | Brandon Craig Rhodes <brandon(at)oit(dot)gatech(dot)edu> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | improving a badly optimized query |
Date: | 2002-11-19 18:09:06 |
Message-ID: | w6lm3pe8jh.fsf@guinness.ts.gatech.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
A query has surprised me by the amount of time it takes, and it seems
that PostgreSQL is performing insufficient optimization. To make the
example simple, consider the following database:
CREATE TABLE role_keys (
role SERIAL PRIMARY KEY
);
CREATE TABLE role_person (
role INTEGER UNIQUE NOT NULL REFERENCES role_keys,
person INTEGER NOT NULL
);
CREATE INDEX role_person_index ON role_person (person);
CREATE VIEW roles AS SELECT
role_keys.role, person
FROM role_keys NATURAL LEFT JOIN role_person;
Having populated these tables, I attempted the following query:
SELECT * FROM roles WHERE person = 28389;
It turns out that this query - equivalent to query (a) shown below -
takes more than ten times the amount of time required by query (b),
despite being guaranteed to give exactly the same result!
(a) (slow)
SELECT * FROM role_keys NATURAL LEFT JOIN role_person
WHERE person = 28389;
(b) (fast)
SELECT * FROM role_keys NATURAL JOIN role_person
WHERE person = 28389;
Apparently PostgreSQL does not realize that the rows created for
unmatched role_keys rows by the LEFT JOIN are guaranteed to be thrown
out by the WHERE clause (their `person' fields will be null). Because
of this it reads through the entire role_keys table:
(a) (when run with EXPLAIN)
Merge Join (cost=0.00..3990.83 rows=67524 width=12)
-> Index Scan using role_keys_pkey on role_keys
(cost=0.00..1280.67 rows=67524 width=4)
-> Index Scan using role_person_role_key on role_person
(cost=0.00..1359.68 rows=67525 width=8)
(b) (when run with EXPLAIN)
Nested Loop (cost=0.00..6.91 rows=1 width=12)
-> Index Scan using role_person_index on role_person
(cost=0.00..3.02 rows=1 width=8)
-> Index Scan using role_keys_pkey on role_keys
(cost=0.00..3.01 rows=1 width=4)
It is not obvious to me where in PostgreSQL's optimization routine to
insert the intelligence to reduce this from a `LEFT JOIN' to a `JOIN'.
Has anyone else had to deal with this case?
The VIEW itself must be a LEFT JOIN because I need all roles to appear
when I query the view; but I will frequently need to do queries like
the above, and would like to avoid either (a) having to create a
separate view for each combination of fields on which I might search,
or (b) querying using the raw database tables since I would like the
actualy design hidden from my business logic.
Thanks for any ideas,
--
Brandon Craig Rhodes http://www.rhodesmill.org/brandon
Georgia Tech brandon(at)oit(dot)gatech(dot)edu
From | Date | Subject | |
---|---|---|---|
Next Message | Medi Montaseri | 2002-11-19 18:15:21 | Re: stability of pg library usage |
Previous Message | Medi Montaseri | 2002-11-19 17:48:30 | Re: help in starting up / shutting down postgres as another |