From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Steve Pritchard <steve(dot)pritchard(at)bto(dot)org> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Inaccurate Rows estimate for "Bitmap And" causes Planner to choose wrong join |
Date: | 2020-05-06 16:34:57 |
Message-ID: | 20200506163457.GM28974@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Wed, May 06, 2020 at 05:19:48PM +0100, Steve Pritchard wrote:
> Version: Postgres 9.6.3 production system (but also tested on Postgres 12)
>
> For my query the Planner is sometimes choosing an execution plan that uses
> "Bitmap And" (depending on the parameters):
>
> The Planner then carries this estimate of "1 row" through the rest of the
> query (which is quite complex), and then makes poor choices about joins.
> e.g. uses "Nested Loop Left Join" because it's only expecting one row,
> whereas in practice it has to do 15636 loops which is very slow.
> Note that in cases where the Planner selects a single Index Scan for this
> query (with different parameters), the Planner makes an accurate estimate
> of the number of rows and then makes sensible selections of joins (i.e.
> quick).
> i.e. the issue seems to be with the "Bitmap And".
>
> I don't have an index with both user_id & loc_id, as this is one of several
> different combinations that can arise (it would require quite a few indexes
> to cover all the possible combinations). However if I did have such an
> index, the planner would presumably be able to use the statistics for
> user_id and loc_id to estimate the number of rows.
>
> So why can't it make an accurate estimate of the rows with a "Bitmap And" &
> " Bitmap Heap Scan"? (as above)
It probably *has* statistics for user_id and loc_id, but doesn't have stats for
(user_id,loc_id).
Presumbly the conditions are partially redundant, so loc_id => user_id
(strictly implies or just correlated) or the other way around.
In pg10+ you can use "CREATE STATISTICS (dependencies)" to improve that.
https://www.postgresql.org/docs/devel/sql-createstatistics.html
Otherwise you can use the "CREATE TYPE / CREATE INDEX" trick Tomas described here:
https://www.postgresql.org/message-id/20190424003633.ruvhbv5ro3fawo67%40development
--
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2020-05-06 18:24:55 | Re: Inaccurate Rows estimate for "Bitmap And" causes Planner to choose wrong join |
Previous Message | Steve Pritchard | 2020-05-06 16:19:48 | Inaccurate Rows estimate for "Bitmap And" causes Planner to choose wrong join |