Re: SV: bad plan using nested loops

From: Johan Fredriksson <eskil(at)kth(dot)se>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: SV: bad plan using nested loops
Date: 2018-02-02 09:02:07
Message-ID: 1517562127.315.1.camel@kth.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

tor 2018-02-01 klockan 20:34 +0000 skrev Johan Fredriksson:
> > Johan Fredriksson <eskil(at)kth(dot)se> writes:
> > > Bad plan: https://explain.depesz.com/s/avtZ
> > > Good plan: https://explain.depesz.com/s/SJSt
> > > Any suggestions on how to make the planner make better decisions
> > > for
> > > this query?
> >
> > Core of the problem looks to be the misestimation here:
> >
> >        Index Only Scan using shredder_cgm1 on
> > public.cachedgroupmembers cachedgroupmembers_4
> > (cost=0.43..2.33 rows=79 width=8) (actual time=0.020..0.903
> > rows=1492 loops=804)
> >          Output: cachedgroupmembers_4.memberid,
> > cachedgroupmembers_4.groupid,
> > cachedgroupmembers_4.disabled
> >          Index Cond: ((cachedgroupmembers_4.memberid =
> > principals_1.id) AND
> > (cachedgroupmembers_4.disabled = 0))
> >          Heap Fetches: 5018
> >
> > Probably, memberid and disabled are correlated but the planner
> > doesn't
> > know that, so it thinks the index condition is way more selective
> > than it
> > actually is.  In PG 10, you could very possibly fix that by
> > installing
> > extended statistics on that pair of columns.  See
> >
> > https://www.postgresql.org/docs/current/static/planner-stats.html#P
> > LANNER-STATS-EXTENDED
>
> I'm not sure what you mean by correlated, but there are only a
> handful (164 when I check it) disabled groupmembers out of total 7.5
> million.
> I'll give CREATE STATISTICS on those columns a shot and see if it
> gets any better.

It looks like you are right, Tom. There actually exists full
correlation between memberid, groupid and disabled.

rt4=# SELECT stxname, stxkeys, stxdependencies FROM pg_statistic_ext;
 stxname  | stxkeys |   stxdependencies    
-----------+---------+----------------------
 cgm_stat2 | 2 6     | {"2
=> 6": 1.000000}
 cgm_stat1 | 3 6     | {"3 => 6": 1.000000}
(2 rows)

However, this does not help the planner. It still picks the bad plan.

/ Eskil

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Laurenz Albe 2018-02-02 09:36:40 Re: Query optimiser is not using 'not null' constraint when 'order by nulls last' clause is used
Previous Message Johan Fredriksson 2018-02-01 20:34:24 SV: bad plan using nested loops