From: | Kószó József <k(dot)joe(at)freemail(dot)hu> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | pg 9.3 exists subselect with limit brakes query plan |
Date: | 2014-03-18 00:36:57 |
Message-ID: | 53279529.2070902@freemail.hu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Hello,
We plan to upgrade from PostgreSQL 8.3 to 9.3 and experience some
planner-related problems in our applications with subselects using
EXISTS and LIMIT keywords. We have two similar environment with these
postgres versions, in 8.3 does not matter, if the subselect using LIMIT
or not, but in 9.3 the LIMIT keyword causes odd behavior, enforces seq
scans. In 8.3 always get similar plans to both case, but in 9.3 the
LIMIT keyword slows down the query. Additionally without index to
subselect id field the plan and running time worse with one magnitude in
9.3 than 8.3.
The two test case:
SELECT ... WHERE EXISTS (SELECT ...);
SELECT ... WHERE EXISTS (SELECT ... LIMIT 1);
We try to make some application-independent, reproducible tests to show
the difference between versions:
===== Initializations =====
# createdb limit1test -U postgres
# pgbench -i -s 10 limit1test -U postgres
limit1test=# select count(*) from pgbench_branches;
count
-------
10
limit1test=# select count(*) from pgbench_accounts;
count
---------
1000000
limit1test=# select B.bid, count(*) from pgbench_branches B join
pgbench_accounts using (bid) group by 1 order by 1;
bid | count
-----+--------
1 | 100000
2 | 100000
3 | 100000
4 | 100000
5 | 100000
6 | 100000
7 | 100000
8 | 100000
9 | 100000
10 | 100000
===== PostgreSQL 9.3 =====
# explain analyze select bid from pgbench_branches B where exists
(select bid from pgbench_accounts A where A.bid=B.bid);
---------------------------------------------------------------------
Nested Loop Semi Join (cost=0.00..26402.35 rows=10 width=4) (actual
time=0.012..4972.868 rows=10 loops=1)
Join Filter: (b.bid = a.bid)
Rows Removed by Join Filter: 4500000
-> Seq Scan on pgbench_branches b (cost=0.00..1.10 rows=10
width=4) (actual time=0.003..0.013 rows=10 loops=1)
-> Seq Scan on pgbench_accounts a (cost=0.00..26394.00
rows=1000000 width=4) (actual time=0.002..242.321 rows=450001 loops=10)
Total runtime: 4972.904 ms
limit1test=# explain analyze select bid from pgbench_branches B where
exists (select bid from pgbench_accounts A where A.bid=B.bid limit 1);
---------------------------------------------------------------------
Seq Scan on pgbench_branches b (cost=0.00..3.99 rows=5 width=4)
(actual time=0.018..382.774 rows=10 loops=1)
Filter: (SubPlan 1)
SubPlan 1
-> Limit (cost=0.00..0.29 rows=1 width=4) (actual
time=38.269..38.269 rows=1 loops=10)
-> Seq Scan on pgbench_accounts a (cost=0.00..28894.00
rows=100000 width=4) (actual time=38.261..38.261 rows=1 loops=10)
Filter: (bid = b.bid)
Rows Removed by Filter: 450000
Total runtime: 382.816 ms
# create index pgbench_accounts_bid on pgbench_accounts(bid);
# explain analyze select bid from pgbench_branches B where exists
(select bid from pgbench_accounts A where A.bid=B.bid);
--------------------------------------------------------------------
Nested Loop Semi Join (cost=0.42..2739.24 rows=10 width=4) (actual
time=0.030..0.131 rows=10 loops=1)
-> Seq Scan on pgbench_branches b (cost=0.00..1.10 rows=10
width=4) (actual time=0.003..0.011 rows=10 loops=1)
-> Index Only Scan using pgbench_accounts_bid on pgbench_accounts a
(cost=0.42..2733.62 rows=100000 width=4) (actual time=0.010..0.010
rows=1 loops=10)
Index Cond: (bid = b.bid)
Heap Fetches: 0
Total runtime: 0.169 ms
# explain analyze select bid from pgbench_branches B where exists
(select bid from pgbench_accounts A where A.bid=B.bid limit 1);
--------------------------------------------------------------------
Seq Scan on pgbench_branches b (cost=0.00..3.99 rows=5 width=4)
(actual time=0.016..384.974 rows=10 loops=1)
Filter: (SubPlan 1)
SubPlan 1
-> Limit (cost=0.00..0.29 rows=1 width=4) (actual
time=38.490..38.490 rows=1 loops=10)
-> Seq Scan on pgbench_accounts a (cost=0.00..28894.00
rows=100000 width=4) (actual time=38.482..38.482 rows=1 loops=10)
Filter: (bid = b.bid)
Rows Removed by Filter: 450000
Total runtime: 385.015 ms
# explain analyze select bid from pgbench_branches B where exists
(select bid from pgbench_accounts A where A.bid=B.bid*100);
--------------------------------------------------------------------
Nested Loop Semi Join (cost=0.43..2989.31 rows=10 width=4) (actual
time=0.055..0.055 rows=0 loops=1)
-> Seq Scan on pgbench_branches b (cost=0.00..1.10 rows=10
width=4) (actual time=0.004..0.009 rows=10 loops=1)
-> Index Only Scan using pgbench_accounts_bid on pgbench_accounts a
(cost=0.43..2983.63 rows=100000 width=4) (actual time=0.002..0.002
rows=0 loops=10)
Index Cond: (bid = (b.bid * 100))
Heap Fetches: 0
Total runtime: 0.087 ms
# explain analyze select bid from pgbench_branches B where exists
(select bid from pgbench_accounts A where A.bid=B.bid*100 limit 1);
-------------------------------------------------------------------
Seq Scan on pgbench_branches b (cost=0.00..4.24 rows=5 width=4)
(actual time=1019.139..1019.139 rows=0 loops=1)
Filter: (SubPlan 1)
Rows Removed by Filter: 10
SubPlan 1
-> Limit (cost=0.00..0.31 rows=1 width=4) (actual
time=101.907..101.907 rows=0 loops=10)
-> Seq Scan on pgbench_accounts a (cost=0.00..31394.00
rows=100000 width=4) (actual time=101.898..101.898 rows=0 loops=10)
Filter: (bid = (b.bid * 100))
Rows Removed by Filter: 1000000
Total runtime: 1019.166 ms
# explain analyze select bid from pgbench_branches B where exists
(select bid from pgbench_accounts A where A.bid=B.bid offset 1);
------------------------------------------------------------------
Seq Scan on pgbench_branches b (cost=0.00..6.06 rows=5 width=4)
(actual time=0.042..0.166 rows=10 loops=1)
Filter: (SubPlan 1)
SubPlan 1
-> Limit (cost=0.46..3542.39 rows=99997 width=4) (actual
time=0.014..0.014 rows=1 loops=10)
-> Index Only Scan using pgbench_accounts_bid on
pgbench_accounts a (cost=0.42..3542.39 rows=99998 width=4) (actual
time=0.011..0.012 rows=2 loops=10)
Index Cond: (bid = b.bid)
Heap Fetches: 0
Total runtime: 0.200 ms
===== PostgreSQL 8.3 =====
# explain analyze select bid from branches B where exists (select bid
from accounts A where A.bid=B.bid);
---------------------------------------------------------------------
Seq Scan on branches b (cost=0.00..3.99 rows=5 width=4) (actual
time=0.020..436.306 rows=10 loops=1)
Filter: (subplan)
SubPlan
-> Seq Scan on accounts a (cost=0.00..28894.00 rows=100000
width=4) (actual time=43.618..43.618 rows=1 loops=10)
Filter: (bid = $0)
Total runtime: 436.348 ms
# explain analyze select bid from branches B where exists (select bid
from accounts A where A.bid=B.bid limit 1);
----------------------------------------------------------------------
Seq Scan on branches b (cost=0.00..3.99 rows=5 width=4) (actual
time=0.019..442.392 rows=10 loops=1)
Filter: (subplan)
SubPlan
-> Limit (cost=0.00..0.29 rows=1 width=4) (actual
time=44.231..44.231 rows=1 loops=10)
-> Seq Scan on accounts a (cost=0.00..28894.00 rows=100000
width=4) (actual time=44.223..44.223 rows=1 loops=10)
Filter: (bid = $0)
Total runtime: 442.440 ms
# create index accounts_bid on accounts(bid);
# explain analyze select bid from branches B where exists (select bid
from accounts A where A.bid=B.bid);
----------------------------------------------------------------------
Seq Scan on branches b (cost=0.00..1.55 rows=5 width=4) (actual
time=0.049..0.211 rows=10 loops=1)
Filter: (subplan)
SubPlan
-> Index Scan using accounts_bid on accounts a
(cost=0.00..4493.36 rows=100000 width=4) (actual time=0.017..0.017
rows=1 loops=10)
Index Cond: (bid = $0)
Total runtime: 0.258 ms
# explain analyze select bid from branches B where exists (select bid
from accounts A where A.bid=B.bid limit 1);
---------------------------------------------------------------------
Seq Scan on branches b (cost=0.00..1.55 rows=5 width=4) (actual
time=0.059..0.180 rows=10 loops=1)
Filter: (subplan)
SubPlan
-> Limit (cost=0.00..0.04 rows=1 width=4) (actual
time=0.015..0.015 rows=1 loops=10)
-> Index Scan using accounts_bid on accounts a
(cost=0.00..4493.36 rows=100000 width=4) (actual time=0.014..0.014
rows=1 loops=10)
Index Cond: (bid = $0)
Total runtime: 0.211 ms
# explain analyze select bid from branches B where exists (select bid
from accounts A where A.bid=B.bid*100);
---------------------------------------------------------------------
Seq Scan on branches b (cost=0.00..1.57 rows=5 width=4) (actual
time=0.050..0.050 rows=0 loops=1)
Filter: (subplan)
SubPlan
-> Index Scan using accounts_bid on accounts a
(cost=0.00..4493.36 rows=100000 width=4) (actual time=0.003..0.003
rows=0 loops=10)
Index Cond: (bid = ($0 * 100))
Total runtime: 0.082 ms
# explain analyze select bid from branches B where exists (select bid
from accounts A where A.bid=B.bid*100 limit 1);
---------------------------------------------------------------------
Seq Scan on branches b (cost=0.00..1.57 rows=5 width=4) (actual
time=0.066..0.066 rows=0 loops=1)
Filter: (subplan)
SubPlan
-> Limit (cost=0.00..0.05 rows=1 width=4) (actual
time=0.005..0.005 rows=0 loops=10)
-> Index Scan using accounts_bid on accounts a
(cost=0.00..4493.36 rows=100000 width=4) (actual time=0.003..0.003
rows=0 loops=10)
Index Cond: (bid = ($0 * 100))
Total runtime: 0.095 ms
# explain analyze select bid from branches B where exists (select bid
from accounts A where A.bid=B.bid offset 1);
----------------------------------------------------------------------
Seq Scan on branches b (cost=0.00..2.00 rows=5 width=4) (actual
time=0.038..0.181 rows=10 loops=1)
Filter: (subplan)
SubPlan
-> Limit (cost=0.04..4493.36 rows=99999 width=4) (actual
time=0.016..0.016 rows=1 loops=10)
-> Index Scan using accounts_bid on accounts a
(cost=0.00..4493.36 rows=100000 width=4) (actual time=0.013..0.014
rows=2 loops=10)
Index Cond: (bid = $0)
Total runtime: 0.213 ms
===== Postgres settings =====
PostgreSQL 9.3:
version
----------------------------------------------------------------------
PostgreSQL 9.3.3 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
4.7.2-5) 4.7.2, 64-bit
work_mem
----------
128MB
shared_buffers
----------------
1536MB
effective_cache_size
----------------------
3GB
default_statistics_target
---------------------------
100
PostgreSQL 8.3:
version
-----------------------------------------------------------------------
PostgreSQL 8.3.14 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real
(Debian 4.3.2-1.1) 4.3.2
work_mem
----------
128MB
shared_buffers
----------------
1GB
effective_cache_size
----------------------
2GB
default_statistics_target
---------------------------
100
*_cost settings are all the same (default values), and vacuum, analyze
or reindex does not make any difference.
Any suggestions, what is causes this difference, and how to resolve it
globally? Or are these examples are too special/simple, and seqscan plan
wins in 9.3?
# explain analyze select bid from pgbench_branches B where exists
(select bid from pgbench_accounts A where A.bid=B.bid limit 1);
-------------------------------------------------------------------
Seq Scan on pgbench_branches b (cost=0.00..3.99 rows=5 width=4)
(actual time=0.028..379.426 rows=10 loops=1)
Filter: (SubPlan 1)
SubPlan 1
-> Limit (cost=0.00..0.29 rows=1 width=4) (actual
time=37.934..37.934 rows=1 loops=10)
-> Seq Scan on pgbench_accounts a (cost=0.00..28894.00
rows=100000 width=4) (actual time=37.927..37.927 rows=1 loops=10)
Filter: (bid = b.bid)
Rows Removed by Filter: 450000
Total runtime: 379.464 ms
# set enable_seqscan=off;
# explain analyze select bid from pgbench_branches B where exists
(select bid from pgbench_accounts A where A.bid=B.bid limit 1);
-------------------------------------------------------------------
Index Only Scan using pgbench_branches_pkey on pgbench_branches b
(cost=0.14..12.82 rows=5 width=4) (actual time=0.039..0.145 rows=10 loops=1)
Filter: (SubPlan 1)
Heap Fetches: 0
SubPlan 1
-> Limit (cost=0.42..0.45 rows=1 width=4) (actual
time=0.012..0.012 rows=1 loops=10)
-> Index Only Scan using pgbench_accounts_bid on
pgbench_accounts a (cost=0.42..2850.43 rows=100000 width=4) (actual
time=0.011..0.011 rows=1 loops=10)
Index Cond: (bid = b.bid)
Heap Fetches: 0
Total runtime: 0.178 ms
Thanks,
József Kószó
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Frost | 2014-03-18 00:48:36 | Re: BUG #9606: pg_class relhaspkey column not updated on removal of primary key |
Previous Message | Venkata Balaji Nagothi | 2014-03-17 23:21:10 | Re: BUG #9604: Unable to access table remotely |