From: | James Thompson <james(at)jthompson(dot)dev> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, laurenz(dot)albe(at)cybertec(dot)at, dgrowleyml(at)gmail(dot)com |
Subject: | Re: Please help! Query jumps from 1s -> 4m |
Date: | 2020-05-05 21:10:18 |
Message-ID: | CABoe=cR48w59g7PgVRFacsocWcfdgRejiWomQkxYr3U7OV0b9w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
I've managed to replicate this now with prepared statements. Thanks for all
the guidance so far.
The slowness occurs when the prepared statement changes to a generic plan.
Initial plan:
-> Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
table1alias2 (cost=0.56..2549.70 rows=70 width=36) (actual
time=1.901..45.256 rows=65000 loops=1)
Output: table1alias2.uniqueid
Index Cond: ((table1alias2.col20 = '12345'::bigint) AND (table1alias2.
col8 = ANY ('{c5986b02-3a02-4639-8147-f286972413ba,...
98ed24b1-76f5-4b0e-bb94-86cf13a4809c}'::text[])))
Heap Fetches: 10
Buffers: shared hit=5048
after 5 executions of the statement:
-> Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
table1alias2 (cost=0.56..17.23 rows=1 width=36) (actual
time=125.344..126877.822 rows=65000 loops=1)
Output: table1alias2.uniqueid
Index Cond: (table1alias2.col20 = $1001)
Filter: ((table1alias2.col8)::text = ANY ((ARRAY[$1, ...,
$1000])::text[]))
Rows Removed by Filter: 2670023
Heap Fetches: 428
Buffers: shared hit=45933 read=42060 dirtied=4
The second plan looks worse to me as it's applying a filter rather than
using an index condition? I don't understand why it's not part of the
condition and also why this is so much slower though.
If I force a retrieval of all index rows for col20 = '12345' using an
ad-hoc query (below, which in my mind is what the 'bad' plan is doing),
that only takes 2s (2.7 mil rows). Where's the difference?
EXPLAIN (ANALYZE, BUFFERS, TIMING) SELECT COUNT(DISTINCT id) FROM table1
WHERE datatype='TypeA' AND col20 = 12345;
-> Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
(cost=0.56..2762.95 rows=55337 width=8) (actual time=3.976..1655.645
rows=2735023 loops=1)
Index Cond: (col20 = 12345)
Heap Fetches: 417
Buffers: shared hit=43843 read=44147 dirtied=8
>You could do: SET track_io_timing TO on;
I've not tried this yet, and haven't used it before - sounds like there's
some risks associated with running it on a production server / clocks going
backwards?
>Also, you should be able to check if that's the problem by doing either:
>plan_cache_mode = force_generic_plan;
>Or (I would think) DISCARD PLANS;
I think plan_cache_mode is just for pg12+ unfortunately? We're on 11
currently.
Just tested DISCARD PLANS locally, it didn't switch back from the generic
plan. Was that your expectation?
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2020-05-05 21:35:54 | Re: Please help! Query jumps from 1s -> 4m |
Previous Message | Justin Pryzby | 2020-05-05 19:28:36 | Re: Please help! Query jumps from 1s -> 4m |