From: | André Hänsel <andre(at)webkr(dot)de> |
---|---|
To: | <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Unworkable plan above certain row count |
Date: | 2022-04-28 00:52:57 |
Message-ID: | 035001d85a9a$4dcfd840$e96f88c0$@webkr.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
I noticed an issue in a simple query with WHERE NOT IN (SELECT ...). I am
aware that anti-joins with NOT IN are currently not optimized and should be
rewritten as WHERE NOT EXISTS (SELECT ...), so if this is irrelevant please
just ignore it.
Here is a setup that works:
CREATE TABLE a
(
a_id serial NOT NULL,
PRIMARY KEY (a_id)
);
CREATE TABLE b
(
b_id serial NOT NULL,
a_id int NOT NULL,
PRIMARY KEY (b_id)
);
INSERT INTO a(a_id) SELECT generate_series(1, 20000);
INSERT INTO b(b_id, a_id) SELECT generate_series(1, 500000), floor(random()
* 22000 + 1)::int;
ANALYZE a;
ANALYZE b;
EXPLAIN SELECT count(*) FROM b WHERE a_id NOT IN (SELECT a_id FROM a);
Finalize Aggregate (cost=7596.23..7596.24 rows=1 width=8)
-> Gather (cost=7596.12..7596.23 rows=1 width=8)
Workers Planned: 1
-> Partial Aggregate (cost=6596.12..6596.13 rows=1 width=8)
-> Parallel Seq Scan on b (cost=339.00..6228.47 rows=147059
width=0)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on a (cost=0.00..289.00 rows=20000
width=4)
Fiddle:
https://dbfiddle.uk/?rdbms=postgres_14&fiddle=497ab1d5eec6e02d4d1c0f6630b6f1
f1
Now if you change
INSERT INTO a(a_id) SELECT generate_series(1, 20000);
to
INSERT INTO a(a_id) SELECT generate_series(1, 200000);
i.e. add a zero, the plan becomes this:
Finalize Aggregate (cost=759860198.41..759860198.42 rows=1 width=8)
-> Gather (cost=759860198.29..759860198.40 rows=1 width=8)
Workers Planned: 1
-> Partial Aggregate (cost=759859198.29..759859198.30 rows=1
width=8)
-> Parallel Seq Scan on b (cost=0.00..759858830.65
rows=147059 width=0)
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..4667.00 rows=200000
width=4)
-> Seq Scan on a (cost=0.00..2885.00
rows=200000 width=4)
Fiddle:
https://dbfiddle.uk/?rdbms=postgres_14&fiddle=bec018196195635cb6ec05ccae3213
7c
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2022-04-28 02:08:14 | Re: Unworkable plan above certain row count |
Previous Message | Mickael van der Beek | 2022-04-27 14:54:35 | Re: Array of integer indexed nested-loop semi join |