From: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
---|---|
To: | Postgresql General <pgsql-general(at)postgresql(dot)org> |
Subject: | Planning problem: pushing conditions through GROUP BY |
Date: | 2009-11-22 12:09:03 |
Message-ID: | 20091122120903.GC4341@svana.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hoi,
I've having a smaller planner problem (server version 8.3.5). I have a
table with revisions (revs) and I have a view on top of that gives the
latest revision of each rule. When I request a specific rule, it works
fine.
# explain select * from maxrevs where rul_id=(select rul_id from rules where rul_sid=15895);
QUERY PLAN
-----------------------------------------------------------------------------------------
GroupAggregate (cost=2.93..159.77 rows=2 width=8)
InitPlan
-> Index Scan using rules_index_rul_sid on rules (cost=0.00..4.27 rows=1 width=4)
Index Cond: (rul_sid = 15895)
-> Bitmap Heap Scan on revs (cost=2.93..159.32 rows=86 width=8)
Recheck Cond: (rul_id = $0)
-> Bitmap Index Scan on revs_rul_id (cost=0.00..2.91 rows=86 width=0)
Index Cond: (rul_id = $0)
(8 rows)
But when I want to join on this view, it all goes pear shaped, as can
be seen in the following (completely equivalent) query:
# explain select * from maxrevs where rul_id=ANY(select rul_id from rules where rul_sid=15895);
QUERY PLAN
---------------------------------------------------------------------------------------------
Hash IN Join (cost=6996.32..7080.72 rows=1 width=8)
Hash Cond: (revs.rul_id = rules.rul_id)
-> HashAggregate (cost=6992.04..7032.23 rows=3215 width=8)
-> Seq Scan on revs (cost=0.00..5610.36 rows=276336 width=8)
-> Hash (cost=4.27..4.27 rows=1 width=4)
-> Index Scan using rules_index_rul_sid on rules (cost=0.00..4.27 rows=1 width=4)
Index Cond: (rul_sid = 15895)
(7 rows)
As you can see, it correctly realises that there is only one rule that
matches, but fails to push this information through to the hash
aggregate. I would have expected a Nested Loop with a Bitmap Scan under
a GroupAggreate, as in the first case.
Switching between DISTINCT ON() and GROUP BY doesn't help, apparently
the planner is smart enough to see they are the same in this case.
If I manually rewrite the query to put a distinct at the outer level,
it all works fine. But the real situation is somewhat more complicated.
Is the kind of thing I can expect 8.4 to handle better? Or is there
some trick I can use to get the result I want?
Thanks in advance,
-- table definitions --
Table "test.revs"
Column | Type | Modifiers
-------------+---------+-----------
rev_id | integer |
rul_id | integer |
rev_content | text |
Indexes:
"revs_rev_id" btree (rev_id)
"revs_rul_id" btree (rul_id)
View "test.maxrevs"
Column | Type | Modifiers
--------+---------+-----------
rul_id | integer |
rev_id | integer |
View definition:
SELECT revs.rul_id, max(revs.rev_id) AS rev_id
FROM revs
GROUP BY revs.rul_id;
Table "public.rules"
Column | Type | Modifiers
-------------------+---------+-----------
rul_id | integer | not null
rul_sid | integer | not null
rul_repository_id | integer | not null
Indexes:
"rules_rul_id_key" UNIQUE, btree (rul_id)
"rules_index_rul_sid" btree (rul_sid)
Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
From | Date | Subject | |
---|---|---|---|
Next Message | Alban Hertroys | 2009-11-22 12:09:07 | Re: How to get RTREE performance from GIST index? |
Previous Message | Clive Page | 2009-11-22 12:08:08 | Re: How to get RTREE performance from GIST index? |