Re: Improving Performance of Query ~ Filter by A, Sort by B

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Lincoln Swaine-Moore <lswainemoore(at)gmail(dot)com>
Cc: legrand legrand <legrand_legrand(at)hotmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Improving Performance of Query ~ Filter by A, Sort by B
Date: 2018-07-12 03:10:18
Message-ID: 23355.1531365018@sss.pgh.pa.us
Lists: pgsql-performance

Lincoln Swaine-Moore <lswainemoore(at)gmail(dot)com> writes:
> Here's the result (I turned off the timeout and got it to finish):
> ...

I think the core of the problem here is bad rowcount estimation. We can't
tell from your output how many rows really match

> WHERE "a"."parent_id" IN (
> 49188,14816,14758,8402
> )

but the planner is guessing there are 5823 of them. In the case with
only three IN items, we have

> -> Index Scan using a_parent_id_idx1 on a_partition1 a (cost=0.43..4888.37 rows=4367 width=12) (actual time=5.581..36.270 rows=50 loops=1)
> Index Cond: (parent_id = ANY ('{19948,21436,41220}'::integer[]))

so the planner thinks there are 4367 matching rows but there are only 50,
an overestimate of nearly 90x. Anytime you've got an estimation error
approaching a factor of 100, you're going to be really lucky if you get
a decent plan.
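
One way to find out how many rows really match the four-item list is to
count them under EXPLAIN ANALYZE and compare the estimated "rows=" figure
with the actual one. This is only a sketch: it assumes "a" in the quoted
WHERE clause is the real table name rather than a query alias, so adjust
it to your schema.

```sql
-- Assumption: the table is named "a" (taken from the quoted WHERE clause).
-- On the scan node, compare the planner's "rows=" estimate with the
-- "actual ... rows=" figure for the same node.
EXPLAIN (ANALYZE)
SELECT count(*)
FROM a
WHERE a.parent_id IN (49188, 14816, 14758, 8402);
```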

I suggest increasing the statistics target for the parent_id column
in hopes of getting better estimates for the number of matches.
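
A minimal sketch of that change follows. The table names "a" and
"a_partition1" are inferred from the quoted query and plan, and the
target of 1000 is only an illustrative value (the default is 100), so
treat both as assumptions to adapt.

```sql
-- Raise the statistics target for parent_id. Without ONLY, ALTER TABLE
-- also applies to child tables of "a" (assumed table name).
ALTER TABLE a ALTER COLUMN parent_id SET STATISTICS 1000;  -- illustrative value

-- The new target has no effect until statistics are gathered again.
ANALYZE a;
ANALYZE a_partition1;  -- child table name taken from the quoted plan

-- Optionally inspect what the planner now knows about the column.
SELECT tablename, inherited, n_distinct, most_common_vals
FROM pg_stats
WHERE attname = 'parent_id'
  AND tablename IN ('a', 'a_partition1');
```

After that, rerun EXPLAIN ANALYZE on the original query and check whether
the estimated row counts have moved closer to the actual ones.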

regards, tom lane
