Re: Ye olde slow query

From: "Murphy, Kevin" <MURPHYKE(at)email(dot)chop(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Ye olde slow query
Date: 2014-04-14 13:55:59
Message-ID: DF23153C-3DFD-4A76-B9A2-346E6B127752@email.chop.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Sorry for the delay; back on this, and thanks for the response.

On Mar 11, 2014, at 6:23 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Murphy, Kevin" <MURPHYKE(at)email(dot)chop(dot)edu> writes:
>> Synopsis: 8-table join with one "WHERE foo IN (...)" condition; works OK with fewer
>> than 5 items in the IN list, but at N=5, the planner starts using a compound index
>> for the first time that completely kills performance (5-6 minutes versus 0-12 seconds).
>> […]
>
> FWIW, I think the right question here is not "why is the slow query
> slow?", but "why is the fast query fast?”.
> […]
> It looks like the trouble spot is this intermediate nested loop:
>
>> -> Nested Loop (cost=4.32..283545.98 rows=80929 width=12) (actual time=163.609..571.237 rows=102 loops=1)
>> Buffers: shared hit=419 read=63
>> -> Nested Loop (cost=4.32..3426.09 rows=471 width=4) (actual time=93.595..112.404 rows=85 loops=1)
>> ...
>> -> Index Scan using sample_result_variant_id on sample_result (cost=0.00..593.01 rows=172 width=8) (actual time=5.147..5.397 rows=1 loops=85)
>> Index Cond: (variant_id = variant_effect.variant_id)
>> Buffers: shared hit=400 read=42
>
> which is creating the bulk of the estimated cost for the whole plan,
> but execution is actually pretty cheap. There seem to be two components
> to the misestimation: one is that the sub-nested loop is producing about a
> fifth as many rows as expected,

This may be because 3 out of the 4 user-supplied gene symbols were not present in the gene table at all. Restricting to valid genes prior to the query is probably a good idea.

> and the other is that the probes into
> sample_result are producing (on average) 1 row, not the 172 rows the
> planner expects. If you could get the latter estimate to be even within
> one order of magnitude of reality, the planner would certainly see this
> plan as way cheaper than the other.

I’m not sure about how to improve this. The stats were 5K globally and up to date, and I made them better, with no change. I tried increasing the stats on the foreign keys involved to 10K (and analyzing), but the same costs and plan are in play. I know the stats are updated now because I dumped and restored on new hardware and did a vacuum analyze. I previously mentioned that some of the vanilla n_distinct values were way off for the (790M row) sample_result table, so I have taken to coercing n_distinct using negative multipliers. This data doesn’t change very often (it hasn’t in many weeks).

There are 6M variants, but only 7.5% of them map to the sample_result table. Presumably the planner knows this because of the n_distinct value on sample_result.variant_id? Each variant maps to zero or sample_result records, but often very few, and never more than the number of samples (currently 1129).

>
> So I'm wondering if the stats on sample_result and variant_effect are up
> to date. If they are, you might try increasing the stats targets for the
> variant_id columns.

The stats were up to date and were at 5K globally. I tried increasing the stats on the foreign keys involved to 10K (and analyzing!), but the same costs and plan are in play. I know the stats are updated now because I dumped and restored on new hardware and did a vacuum analyze. I previously mentioned that some of the vanilla n_distinct values were way off for the (790M row) sample_result table, so I have taken to coercing n_distinct using negative multipliers.

Regards,
Kevin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2014-04-14 14:14:24 Re: SSI slows down over time
Previous Message Ryan Johnson 2014-04-14 12:59:02 Re: SSI slows down over time