From: Jan Kort <jan(dot)kort(at)genetics(dot)nl>
To: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Query with straightforward plan changes and becomes 6520 times slower after VACUUM ANALYZE
Date: 2021-05-18 12:46:24
Message-ID: AM0PR0502MB3620D231418BE0B7B8A9E3E09B2C9@AM0PR0502MB3620.eurprd05.prod.outlook.com
Lists: pgsql-bugs
Hi,
I tried this on both PostgreSQL 12.7 and 13.3, with both VACUUM ANALYZE and VACUUM FULL ANALYZE. Both change a straightforward plan into a reversed, more complex plan that makes the query slow. The only thing that restores the fast plan is a pg_dump + pg_restore cycle, but that is not feasible for us.
It seems to depend on the specific data: I tried generating other variations of around 1M records, but so far the attached DB (also generated) is the only one where I can reproduce it. I can't figure out what is so special about this data.
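Concretely, the statements that trigger it are just the bare commands, no options (either form flips the plan here):

VACUUM ANALYZE;        -- plain vacuum plus statistics refresh
VACUUM FULL ANALYZE;   -- full table rewrite plus statistics refresh, same effect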
Given the following record counts:
GFO_ZAKEN: 1048587
GFO_ZAKEN_KOSTEN: 3
GFO_ZAKEN_TYPECODE: 4
GFO_ZAKEN.ZAAKTYPECODE_ID is non-NULL in 9 rows; the rest are NULL (see the queries below)
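These counts can be verified with simple queries; note that count(column) counts only non-NULL values:

SELECT count(*) FROM TRIAL.GFO_ZAKEN;                -- 1048587
SELECT count(*) FROM TRIAL.GFO_ZAKEN_KOSTEN;         -- 3
SELECT count(*) FROM TRIAL.GFO_ZAKEN_TYPECODE;       -- 4
SELECT count(ZAAKTYPECODE_ID) FROM TRIAL.GFO_ZAKEN;  -- 9 non-NULL, rest NULL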
When I run this query:
EXPLAIN ANALYZE
SELECT GFO_ZAKEN_KOSTEN.ID AS GFO_ZAKEN_KOSTEN_ID,
GFO_ZAKEN.ID AS GFO_ZAKEN_ID,
GFO_ZAKEN_TYPECODE.OMSCHRIJVING AS ZAAKTYPECODE_ID
FROM TRIAL.GFO_ZAKEN_KOSTEN
JOIN TRIAL.GFO_ZAKEN ON GFO_ZAKEN_KOSTEN.GFO_ZAKEN_ID = GFO_ZAKEN.ID
JOIN TRIAL.GFO_ZAKEN_TYPECODE ON GFO_ZAKEN.ZAAKTYPECODE_ID = GFO_ZAKEN_TYPECODE.ID
WHERE GFO_ZAKEN_KOSTEN.ID = 13;
It takes 500 ms or so, which I associate with a full table scan, even though these are just simple referential constraints with corresponding indexes:
CREATE TABLE TRIAL.GFO_ZAKEN_TYPECODE (ID INTEGER PRIMARY KEY, OMSCHRIJVING CHARACTER VARYING(4000));
CREATE TABLE TRIAL.GFO_ZAKEN (ID INTEGER PRIMARY KEY, ZAAKTYPECODE_ID INTEGER,
CONSTRAINT ZAAKTYPECODE_IDC1 FOREIGN KEY (ZAAKTYPECODE_ID) REFERENCES TRIAL.GFO_ZAKEN_TYPECODE (ID));
CREATE INDEX GFO_ZAKENO18 ON TRIAL.GFO_ZAKEN USING BTREE (ZAAKTYPECODE_ID);
CREATE TABLE TRIAL.GFO_ZAKEN_KOSTEN (ID INTEGER PRIMARY KEY, GFO_ZAKEN_ID INTEGER,
CONSTRAINT GFO_ZAKEN_IDC1 FOREIGN KEY (GFO_ZAKEN_ID) REFERENCES TRIAL.GFO_ZAKEN (ID));
CREATE INDEX GFO_ZAKEN_KOSTENO14 ON TRIAL.GFO_ZAKEN_KOSTEN USING BTREE (GFO_ZAKEN_ID ASC NULLS LAST);
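For completeness, after restoring the dump the indexes can be verified with the standard pg_indexes view (just a sanity check; unquoted identifiers fold to lowercase, hence 'trial'):

SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'trial';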
After pg_restore it gives a straightforward plan, starting with the gfo_zaken_kosten primary key and continuing with the join on gfo_zaken_kosten.gfo_zaken_id:
Nested Loop  (cost=0.56..3.64 rows=1 width=524) (actual time=0.036..0.037 rows=1 loops=1)
  ->  Nested Loop  (cost=0.43..3.48 rows=1 width=12) (actual time=0.030..0.030 rows=1 loops=1)
        ->  Seq Scan on gfo_zaken_kosten  (cost=0.00..1.04 rows=1 width=8) (actual time=0.016..0.016 rows=1 loops=1)
              Filter: (id = 13)
              Rows Removed by Filter: 2
        ->  Index Scan using gfo_zakenp on gfo_zaken  (cost=0.43..2.45 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=1)
              Index Cond: (id = gfo_zaken_kosten.gfo_zaken_id)
  ->  Index Scan using gfo_zaken_typecodep on gfo_zaken_typecode  (cost=0.13..0.15 rows=1 width=520) (actual time=0.005..0.005 rows=1 loops=1)
        Index Cond: (id = gfo_zaken.zaaktypecode_id)
Planning Time: 1.538 ms
Execution Time: 0.095 ms
After VACUUM ANALYZE the plan becomes inefficient again and no longer starts with the gfo_zaken_kosten primary key; instead it starts at the wrong end, with an index scan over 1M rows:
Merge Join  (cost=1.48..1.59 rows=1 width=159) (actual time=619.374..619.376 rows=1 loops=1)
  Merge Cond: (gfo_zaken.id = gfo_zaken_kosten.gfo_zaken_id)
  ->  Nested Loop  (cost=0.43..96503.47 rows=1048587 width=155) (actual time=0.022..619.359 rows=9 loops=1)
        Join Filter: (gfo_zaken.zaaktypecode_id = gfo_zaken_typecode.id)
        Rows Removed by Join Filter: 4194316
        ->  Index Scan using gfo_zakenp on gfo_zaken  (cost=0.43..33587.23 rows=1048587 width=8) (actual time=0.006..141.167 rows=1048587 loops=1)
        ->  Materialize  (cost=0.00..1.06 rows=4 width=155) (actual time=0.000..0.000 rows=4 loops=1048587)
              ->  Seq Scan on gfo_zaken_typecode  (cost=0.00..1.04 rows=4 width=155) (actual time=0.011..0.012 rows=4 loops=1)
  ->  Sort  (cost=1.05..1.05 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=1)
        Sort Key: gfo_zaken_kosten.gfo_zaken_id
        Sort Method: quicksort  Memory: 25kB
        ->  Seq Scan on gfo_zaken_kosten  (cost=0.00..1.04 rows=1 width=8) (actual time=0.008..0.008 rows=1 loops=1)
              Filter: (id = 13)
              Rows Removed by Filter: 2
Planning Time: 69.151 ms
Execution Time: 619.410 ms
So it is 6520 times slower after VACUUM ANALYZE than before (619.410 ms vs. 0.095 ms).
Doing VACUUM ANALYZE again doesn't make it better; only pg_dump + pg_restore brings back the original plan, and then it breaks again on the first VACUUM ANALYZE.
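I haven't been able to see what ANALYZE changes that flips the plan; the per-column statistics can be compared before and after with the standard pg_stats view, with something like:

SELECT tablename, attname, null_frac, n_distinct, correlation
FROM pg_stats
WHERE schemaname = 'trial'
  AND tablename IN ('gfo_zaken', 'gfo_zaken_kosten', 'gfo_zaken_typecode');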
I attached an 800K test DB with all sensitive data removed.
I tried both the default config without changes, and the default config with our settings appended:
max_connections = 1000
shared_buffers = 512MB
effective_cache_size = 6GB
work_mem = 10485kB
maintenance_work_mem = 512MB
min_wal_size = 1GB
max_wal_size = 2GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1
wal_sync_method = open_datasync
fsync = on
synchronous_commit = off
Doing a VACUUM ANALYZE shouldn’t change a straightforward plan.
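As a diagnostic only (not something we can ship), disabling the merge join for one transaction shows whether the planner can still find the fast nested-loop plan; a sketch using the standard planner setting:

BEGIN;
SET LOCAL enable_mergejoin = off;  -- planner toggle, scoped to this transaction
EXPLAIN ANALYZE
SELECT GFO_ZAKEN_KOSTEN.ID
FROM TRIAL.GFO_ZAKEN_KOSTEN
JOIN TRIAL.GFO_ZAKEN ON GFO_ZAKEN_KOSTEN.GFO_ZAKEN_ID = GFO_ZAKEN.ID
JOIN TRIAL.GFO_ZAKEN_TYPECODE ON GFO_ZAKEN.ZAAKTYPECODE_ID = GFO_ZAKEN_TYPECODE.ID
WHERE GFO_ZAKEN_KOSTEN.ID = 13;
ROLLBACK;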
Regards,
Jan Kort
Attachment: trialdb.zip (application/x-zip-compressed, 807.0 KB)