Re: Inaccurate Rows estimate for "Bitmap And" causes Planner to choose wrong join

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Steve Pritchard <steve(dot)pritchard(at)bto(dot)org>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Inaccurate Rows estimate for "Bitmap And" causes Planner to choose wrong join
Date: 2020-05-06 16:34:57
Message-ID: 20200506163457.GM28974@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, May 06, 2020 at 05:19:48PM +0100, Steve Pritchard wrote:
> Version: Postgres 9.6.3 production system (but also tested on Postgres 12)
>
> For my query the Planner is sometimes choosing an execution plan that uses
> "Bitmap And" (depending on the parameters):
>
> The Planner then carries this estimate of "1 row" through the rest of the
> query (which is quite complex), and then makes poor choices about joins.
> e.g. uses "Nested Loop Left Join" because it's only expecting one row,
> whereas in practice it has to do 15636 loops which is very slow.

> Note that in cases where the Planner selects a single Index Scan for this
> query (with different parameters), the Planner makes an accurate estimate
> of the number of rows and then makes sensible selections of joins (i.e.
> quick).
> i.e. the issue seems to be with the "Bitmap And".
>
> I don't have an index with both user_id & loc_id, as this is one of several
> different combinations that can arise (it would require quite a few indexes
> to cover all the possible combinations). However if I did have such an
> index, the planner would presumably be able to use the statistics for
> user_id and loc_id to estimate the number of rows.
>
> So why can't it make an accurate estimate of the rows with a "Bitmap And" &
> " Bitmap Heap Scan"? (as above)

It probably *has* statistics for user_id and loc_id, but doesn't have stats for
(user_id,loc_id).

Presumbly the conditions are partially redundant, so loc_id => user_id
(strictly implies or just correlated) or the other way around.

In pg10+ you can use "CREATE STATISTICS (dependencies)" to improve that.
https://www.postgresql.org/docs/devel/sql-createstatistics.html

Otherwise you can use the "CREATE TYPE / CREATE INDEX" trick Tomas described here:
https://www.postgresql.org/message-id/20190424003633.ruvhbv5ro3fawo67%40development

--
Justin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2020-05-06 18:24:55 Re: Inaccurate Rows estimate for "Bitmap And" causes Planner to choose wrong join
Previous Message Steve Pritchard 2020-05-06 16:19:48 Inaccurate Rows estimate for "Bitmap And" causes Planner to choose wrong join