From: | André Hänsel <andre(at)webkr(dot)de> |
---|---|
To: | <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Bad estimate with partial index |
Date: | 2022-04-19 11:25:21 |
Message-ID: | 01fb01d853e0$28bc5300$7a34f900$@webkr.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi list,
I have a case where Postgres chooses the wrong index and I'm not sure what
to do about it:
https://dbfiddle.uk/?rdbms=postgres_14&fiddle=f356fd56a920ea8a93c192f5a8c16b
1c
Setup:
CREATE TABLE t (
filename int,
cropped bool not null default false,
resized bool not null default false,
create_date date not null default '1970-01-01'
);
INSERT INTO t
SELECT generate_series(1, 1000000);
UPDATE t SET cropped = true, resized = true
WHERE filename IN (SELECT filename FROM t ORDER BY random() LIMIT 900000);
UPDATE t SET resized = false
WHERE filename IN (SELECT filename FROM t WHERE cropped = true ORDER BY
random() LIMIT 1000);
VACUUM FULL t;
ANALYZE t;
Data now looks like this:
SELECT cropped, resized, count(*)
FROM t
GROUP BY 1,2;
I create two indexes:
CREATE INDEX idx_resized ON t(resized) WHERE NOT resized;
CREATE INDEX specific ON t(cropped,resized) WHERE cropped AND NOT resized;
And then run my query:
EXPLAIN ANALYZE
SELECT count(*) FROM t WHERE cropped AND NOT resized AND create_date <
CURRENT_DATE;
Aggregate (cost=4001.25..4001.26 rows=1 width=8) (actual
time=478.557..478.558 rows=1 loops=1)
-> Index Scan using idx_resized on t (cost=0.29..3777.71 rows=89415
width=0) (actual time=478.177..478.480 rows=1000 loops=1)
Filter: (cropped AND (create_date < CURRENT_DATE))
Rows Removed by Filter: 100000
It takes 478 ms on dbfiddle.uk (on my machine it's faster but the difference
is still visible).
Now I delete an index:
DROP INDEX idx_resized;
and run the same query again and I get a much better plan:
Aggregate (cost=11876.27..11876.28 rows=1 width=8) (actual
time=0.315..0.316 rows=1 loops=1)
-> Bitmap Heap Scan on t (cost=35.50..11652.73 rows=89415 width=0)
(actual time=0.054..0.250 rows=1000 loops=1)
Recheck Cond: (cropped AND (NOT resized))
Filter: (create_date < CURRENT_DATE)
Heap Blocks: exact=6
-> Bitmap Index Scan on specific (cost=0.00..13.15 rows=89415
width=0) (actual time=0.040..0.040 rows=1000 loops=1)
which uses the index specific and completes in less than a ms on both
dbfiddle.uk and my machine.
Additional mystery - when I set the values not with an UPDATE but with a
DEFAULT, then the correct index is chosen. What is going on?
https://dbfiddle.uk/?rdbms=postgres_14&fiddle=dc7d8aea14e90f08ab6537a855f34d
8c
Regards,
André
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2022-04-19 11:39:57 | Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403) |
Previous Message | Alvaro Herrera | 2022-04-19 11:05:23 | Re: typos |