From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Nur Agus <nuragus(dot)linux(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Query Slow After 2018 |
Date: | 2018-01-28 17:51:10 |
Message-ID: | 20180128175110.GA18115@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Mon, Jan 29, 2018 at 12:32:59AM +0700, Nur Agus wrote:
> The following query run in just 9 ms:
> "distrib_reports"."month" = 1 AND
> "distrib_reports"."year" = 2017 AND
> "distrib_reports"."state" = 'SUBMITTED' AND
> "distrib_report_groups"."distrib_report_group_type_id" =
> '559a5fdc-418d-4494-aebf-80ecf8743d35'
> The explain analyze of the 2 queries are resulting on really different
> query plan, here are the links to depesz:
> 2017 --> explain result on postgres-9: https://explain.depesz.com/s/qJF1
> 2018 --> explain result on postgres-9: https://explain.depesz.com/s/pT0y
> The question is, why the query planner choose such very different path just
> by changing one parameter?
Looks like this badly underestimates its rowcount:
Index Scan using index_distrib_reports_on_year on distrib_reports (cost=0.42..40.62 rows=8 width=32) (actual time=0.034..50.452 rows=17,055 loops=1)
Index Cond: (year = 2018)
Filter: ((month = 1) AND ((state)::text = 'SUBMITTED'::text))
Rows Removed by Filter: 1049
Maybe because "if year==2018" then, month=1 does essentialy nothing ..
..but postgres thinks it'll filters out some 90% of the rows.
And possibly the same for SUBMITTED (?)
You should probably use timestamp column rather than integer year+month.
On PG10, you could probably work around it using "CREATE STATISTICS".
> This behaviour is *not-reproducable* on postgres-10. On postgres-10, the
> query plan are consistent, and both have very acceptable time:
> 2017 --> explain result on postgres-10: https://explain.depesz.com/s/N9r5
> 2018 --> --> explain result on postgres-10:
> https://explain.depesz.com/s/Tf5K
..I think default max_parallel_workers_per_gather=3 by chance causes the plan
to be the same.
I think there's still a underestimate rowcount with PG10 (without CREATE
STATISTICS), but it's masked by "rows=1 actual rows=0" roundoff error with
high loop count.
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | David G. Johnston | 2018-01-28 21:40:44 | Re: Query Slow After 2018 |
Previous Message | Nur Agus | 2018-01-28 17:32:59 | Query Slow After 2018 |