From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Why will hashed SubPlan not use multiple batches |
Date: | 2013-01-25 20:50:48 |
Message-ID: | CAMkU=1zoV1wqLrrs7a_PAC8aAe8uDbiCachNQ1FmdBYAKcBKcg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
A hashed SubPlan will not be used if it would need more than one
batch. Is there a fundamental reason for that, or just that no one
got around to adding it?
A small decrease in work_mem leads to a 38000 fold change in estimated
query execution (and that might be accurate, as the actual change in
execution is too large to measure)
I have no control over the real query itself (otherwise changing it
from NOT IN to NOT EXISTS would fix it, because that hash plan will
use multiple batches).
I have temporarily fixed it by increasing work_mem, but it would be
better if the planner did the best with the resources it had.
This example works with default settings on "PostgreSQL 9.2.2 on
x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305 (Red
Hat 4.4.6-4), 64-bit". Same behavior on 9.1.7 and 9.3dev.
Is this a Todo item?
test case below.
create table foo as select (random()*10000)::integer as bar from
generate_series(1,100000);
create table foo2 as select (random()*10000)::integer as bar2 from
generate_series(1,100000);
analyze;
set work_mem TO 3300;
explain select foo.bar from foo where bar not in (select bar2 from foo2);
QUERY PLAN
--------------------------------------------------------------------
Seq Scan on foo (cost=1693.00..3386.00 rows=50000 width=4)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on foo2 (cost=0.00..1443.00 rows=100000 width=4)
(4 rows)
set work_mem TO 3100;
explain select foo.bar from foo where bar not in (select bar2 from foo2);
QUERY PLAN
--------------------------------------------------------------------------
Seq Scan on foo (cost=0.00..129201693.00 rows=50000 width=4)
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..2334.00 rows=100000 width=4)
-> Seq Scan on foo2 (cost=0.00..1443.00 rows=100000 width=4)
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Kohei KaiGai | 2013-01-25 21:09:39 | Re: replace plugins directory with GUC |
Previous Message | Bruce Momjian | 2013-01-25 20:40:57 | Re: setting per-database/role parameters checks them against wrong context |