Re: Inaccurate Rows estimate for "Bitmap And" causes Planner to choose wrong join

From: Steve Pritchard <steve(dot)pritchard(at)bto(dot)org>
To: Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Subject: Re: Inaccurate Rows estimate for "Bitmap And" causes Planner to choose wrong join
Date: 2020-05-13 09:33:35
Message-ID: CAF7AqmyicqVv99XjO5vgS1N4+dSUWrgB_DMp6meHLNktzBpesg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Many thanks Justin & Jeff for your replies.

Presumbly the conditions are partially redundant, so loc_id => user_id

Yes you're right. I had overlooked this.

I've done some further testing and this confirms what you say: if the WHERE
columns are independent, then the Planner makes a reasonable estimate of
the number of rows. irrespective of whether it uses a single index or a
"Bitmap And" of two indexes.

I've also tested "create statistics" on Postgres 12:

- gives good estimate with WHERE user_id = 'USER123' and loc_id =
'LOC12345678'
- but Plan Rows = 5 with WHERE user_id = 'USER123' and loc_id = ANY('{
LOC12345678 }'::text[])
- Note: if I omit the user_id condition then it gives a good estimate,
i.e. with WHERE loc_id = ANY('{ LOC12345678 }'::text[])

So statistics objects don't seem to be able to handle the combination of
dependencies and arrays (at least in 12.2).

Steve

On Wed, 6 May 2020 at 19:25, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Wed, May 6, 2020 at 12:20 PM Steve Pritchard <steve(dot)pritchard(at)bto(dot)org>
> wrote:
>
>> Version: Postgres 9.6.3 production system (but also tested on Postgres 12)
>>
>> For my query the Planner is sometimes choosing an execution plan that
>> uses "Bitmap And" (depending on the parameters):
>>
>> -> Bitmap Heap Scan on observation (cost=484.92..488.93 rows=1
>> width=203) (actual time=233.129..330.886 rows=15636 loops=1)
>> Recheck Cond: (((user_id)::text = 'USER123'::text) AND ((loc_id)::text
>> = ANY ('{LOC12345678}'::text[])))
>>
>
> If you change " = ANY(array_of_one)" to " = scalar", does that change
> anything? You might be able to fix this (in v12) using CREATE STATISTICS,
> but I don't know if that mechanism can see through the ANY(array_of_one)
> wrapper.
>
>
>> Note that in cases where the Planner selects a single Index Scan for this
>> query (with different parameters), the Planner makes an accurate estimate
>> of the number of rows and then makes sensible selections of joins (i.e.
>> quick).
>> i.e. the issue seems to be with the "Bitmap And".
>>
>
>
> I don't know if this nitpick matters, but I don't think that that is how
> the planner works. The row estimates work from the top down, not the
> bottom up. The row estimate of 1 is based on what conditions the bitmap
> heap scan implements, it is not arrived at by combining the estimates from
> the index scans below it. If it were to change to a different type of node
> but implemented the same conditions, I think it would have the same row
> estimate.
>
>
>>
>> I don't have an index with both user_id & loc_id, as this is one of
>> several different combinations that can arise (it would require quite a few
>> indexes to cover all the possible combinations).
>>
>
> Are you actually experiencing problems with those other combinations as
> well? If not, I wouldn't worry about solving hypothetical problems. If
> those other combinations are actually problems and you go with CREATE
> STATISTICS, then you would have to be creating a lot of different
> statistics. That would still be ugly, but at least the overhead for
> statistics is lower than for indexes.
>
>
>> However if I did have such an index, the planner would presumably be able
>> to use the statistics for user_id and loc_id to estimate the number of rows.
>>
>
> Indexes on physical columns do not have statistics, so making that index
> would not help with the estimation. (Expressional indexes do have
> statistics, but I don't see that helping you here). So while this node
> would execute faster with that index, it would still be kicking the unshown
> nested loop left join 15,636 times when it thinks it will be doing it
> once, and so would still be slow. The most robust solution might be to
> make the outer part of that nested loop left join faster, so that your
> system would be more tolerant of statistics problems.
>
>
>>
>> So why can't it make an accurate estimate of the rows with a "Bitmap And"
>> & " Bitmap Heap Scan"? (as above)
>>
>
> In the absence of custom statistics, it assumes the selectivities of user_id
> = 'USER123', of loc_id = ANY ('{LOC12345678}'::text[]), and of taxa =
> 'Birds' are all independent of each other and can be multiplied to arrive
> at the overall selectivity. But clearly that is not the case. Bird
> watchers mostly watch near where they live, not in random other places.
>
> Cheers,
>
> Jeff
>

--
Steve Pritchard
Database Developer

British Trust for Ornithology, The Nunnery, Thetford, Norfolk IP24 2PU, UK
Tel: +44 (0)1842 750050, fax: +44 (0)1842 750030
Registered Charity No 216652 (England & Wales) No SC039193 (Scotland)
Company Limited by Guarantee No 357284 (England & Wales)

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message James Thompson 2020-05-13 14:17:03 Re: Please help! Query jumps from 1s -> 4m
Previous Message github kran 2020-05-12 15:40:25 Re: AutoVacuum and growing transaction XID's