Query planner unaware of possibly best plan

From: Dániel Dénes <panther-d(at)freemail(dot)hu>
To: pgsql-general(at)postgresql(dot)org
Subject: Query planner unaware of possibly best plan
Date: 2007-09-20 17:09:13
Message-ID: freemail.20070820190913.56817@fm03.freemail.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,
I think the query planner is unaware of the possibly best plan in some
situations. See the test case below:

-- --------------------------------------------------- --

CREATE TABLE tparent (
id integer NOT NULL,
ord integer NOT NULL,
CONSTRAINT par_pkey_id PRIMARY KEY (id),
CONSTRAINT par_uniq_ord UNIQUE (ord)
);

CREATE TABLE tchild (
par_id integer NOT NULL,
ord integer NOT NULL,
CONSTRAINT chi_pkey_parid_ord PRIMARY KEY (par_id, ord),
CONSTRAINT chi_fkey FOREIGN KEY (par_id) REFERENCES tparent(id)
);

INSERT INTO tparent VALUES (1, 3);
INSERT INTO tparent VALUES (2, 1);
INSERT INTO tparent VALUES (3, 4);
INSERT INTO tparent VALUES (4, 5);
INSERT INTO tparent VALUES (5, 2);

INSERT INTO tchild VALUES (1, 2);
INSERT INTO tchild VALUES (1, 1);
INSERT INTO tchild VALUES (2, 1);
INSERT INTO tchild VALUES (2, 3);
INSERT INTO tchild VALUES (2, 2);
INSERT INTO tchild VALUES (3, 1);
INSERT INTO tchild VALUES (3, 2);
INSERT INTO tchild VALUES (4, 1);
INSERT INTO tchild VALUES (5, 2);
INSERT INTO tchild VALUES (5, 1);

ANALYZE tparent;
ANALYZE tchild;

SET enable_seqscan TO false;
SET enable_bitmapscan TO false;
SET enable_hashjoin TO false;
SET enable_mergejoin TO false;
SET enable_sort TO false;

EXPLAIN ANALYZE
SELECT *
FROM tparent JOIN tchild ON tchild.par_id = tparent.id
WHERE tparent.ord BETWEEN 1 AND 4
ORDER BY tparent.ord, tchild.ord;

-- --------------------------------------------------- --

Sort
(cost=100000132.10..100000140.10 rows=8 width=16)
(actual time=0.440..0.456 rows=9 loops=1)
Sort Key: tparent.ord, tchild.ord

-> Nested Loop
(cost=0.00..84.10 rows=8 width=16)
(actual time=0.179..0.270 rows=9 loops=1)

-> Index Scan using par_uniq_ord on tparent
(cost=0.00..20.40 rows=4 width=8)
(actual time=0.089..0.098 rows=4 loops=1)
Index Cond: ((ord >= 1) AND (ord <= 4))

-> Index Scan using chi_pkey_parid_ord on tchild
(cost=0.00..9.93 rows=2 width=8)
(actual time=0.023..0.028 rows=2 loops=4)
Index Cond: (tchild.par_id = "outer".id)

-- --------------------------------------------------- --

Even though I forced the nested loop plan using both indexes (that
returns the rows in the correct order), there is a needless sort step on
the top, consuming half of the time even on such small tables.
Now it's clear why the planner did not choose this plan, why I had to
force it: because it isn't the best if the sort is still there.

The first time I posted this
( http://archives.postgresql.org/pgsql-general/2007-05/msg01306.php )
and read Tom's answer I was convinced that this is rarely a problem,
but now I don't think so, since I ran into it for the third time.

Can that sort step somehow be eliminated if the NestLoop's outer
table is being scanned via a unique index? If not, how can I rewrite my
indexes/query in such a way that it's still safe (the rows always come in
the order I want), but I don't have to wait for that needless sort?

I'm still using PostgreSQL 8.1.8.

Thanks in advance,
Denes Daniel

ÁÁÁÁÁ Ugye nem félsz?
Fotózz és Pályázz a The Witcher ajándékokért!!!!
____________________________________________________
http://freemail.hu/fototar/palyazat/palyazat.jspv?id=1768650

Browse pgsql-general by date

  From Date Subject
Next Message Bill Moran 2007-09-20 17:22:56 Re: autovacuum
Previous Message Collin 2007-09-20 17:04:25 Migration from PervasiveSQL