From: | Frank van Vugt <ftm(dot)van(dot)vugt(at)foxi(dot)nl> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | why is a constraint not 'pushed down' into a subselect when this subselect is using a 'group by' ?? |
Date: | 2004-06-15 06:49:29 |
Message-ID: | 200406150849.29470.ftm.van.vugt@foxi.nl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi all,
I noticed the following. Given two tables, just simply articles and their
packages:
article(id int)
package( id int, article_id int, amount)
When taking the minimum package for articles given some constraint on the
article table
select
article.id,
p_min
from
article,
(select
article_id,
min(amount) as p_min
from
package
group by
article_id
) as foo
where
article.id = foo.article_id and
article.id < 50;
The query plan shows
QUERY PLAN
----------------------------------------------------------------------------------------
Hash Join (cost=730.11..808.02 rows=1 width=8)
Hash Cond: ("outer".article_id = "inner".id)
-> Subquery Scan foo (cost=726.10..781.74 rows=4451 width=8)
-> HashAggregate (cost=726.10..737.23 rows=4451 width=8)
-> Seq Scan on package (cost=0.00..635.40 rows=18140 width=8)
-> Hash (cost=4.01..4.01 rows=1 width=4)
-> Index Scan using article_pkey on article (cost=0.00..4.01 rows=1
width=4)
Index Cond: (id < 50)
So, a seq scan on the complete package table is done, although it seems the
planner could have deduced that only the part that has 'article_id < 50'
would be relevant, since 'article.id < 50' and 'article.id = foo.article_id'
If I ditch the group by, then this contraint does get pushed into the
subselect :
select
article.id,
p_min
from
article,
(select
article_id,
amount as p_min
from
package
) as foo
where
article.id = foo.article_id and
article.id < 50;
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=0.00..14.22 rows=5 width=8)
-> Index Scan using article_pkey on article (cost=0.00..4.01 rows=1
width=4)
Index Cond: (id < 50)
-> Index Scan using package_idx3 on package (cost=0.00..10.16 rows=4
width=8)
Index Cond: ("outer".id = package.article_id)
So it seems the math-knowledge is there ;)
I'm wondering about what I'm overlooking here. When I take the first query and
add the article constraint 'manually', I get:
select
article.id,
p_min
from
article,
(select
article_id,
min(amount) as p_min
from
package
where
article_id < 50
group by
article_id
) as foo
where
article.id = foo.article_id and
article.id < 50;
QUERY PLAN
----------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..9.65 rows=1 width=8)
Join Filter: ("outer".id = "inner".article_id)
-> Index Scan using article_pkey on article (cost=0.00..4.01 rows=1
width=4)
Index Cond: (id < 50)
-> Subquery Scan foo (cost=0.00..5.62 rows=1 width=8)
-> GroupAggregate (cost=0.00..5.61 rows=1 width=8)
-> Index Scan using package_idx3 on package (cost=0.00..5.60
rows=2 width=8)
Index Cond: (article_id < 50)
which would have been a nice plan for the first query ;)
Could someone shine a light on this, please?
--
Best,
Frank.
From | Date | Subject | |
---|---|---|---|
Next Message | Jeremy Dunn | 2004-06-15 12:43:49 | Re: 50 000 000 Table entries and I have to do a keyword search HELP NEEDED |
Previous Message | Neeraj | 2004-06-15 06:03:40 | Interpreting OSDB Results |