Re: Why does query planner choose slower BitmapAnd ?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Seamus Abshere <seamus(at)abshere(dot)net>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Why does query planner choose slower BitmapAnd ?
Date: 2016-02-22 17:30:18
Message-ID: CAMkU=1yV7WQLetrCVPqn=dTPdNzW3JD29ZsK0zJmgzO2tdcx-Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Feb 22, 2016 at 8:20 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>
> Also agreed here, but I've seen field evidence (with reasonable
> configurations) that definitely shows that we're a bit too happy to go
> with a BitmapAnd scan across two indexes where one returns an order of
> magnitude (or less) pages to consider than the other and most of the
> time we spend on the overall query is in going through the index to find
> a bunch of pages we're just going to throw away when we do the AND.
>
> In one specific case which I can recall offhand (having seen it quite
> recently), there was a btree index and a gist index (PostGIS geometry)
> where the btree index pulled back perhaps 100k rows but the gist index
> returned nearly everything (the bounding box included in the query
> covering almost the entire table). Dropping the gist index greatly
> improved *that* query, but, of course, destroyed the performance of more
> selective queries bounding box queries which didn't include a constraint
> on the column with the btree index (forcing a sequential scan of the
> table).
>
> I've not looked into the specific costing here to see why the BitmapAnd
> ended up being chosen over just doing an index scan with the btree and
> then filtering, but I do believe it to be a problem area that would be
> good to try and improve. The first question is probably- are we
> properly accounting for the cost of scanning the index vs the cost of
> scanning one index and then applying the filter?

I looked into this before as well, and I think it is vastly
underestimating the cost of adding a bit into the bitmap, near this
comment:

/*
* Charge a small amount per retrieved tuple to reflect the costs of
* manipulating the bitmap. This is mostly to make sure that a bitmap
* scan doesn't look to be the same cost as an indexscan to retrieve a
* single tuple.
*/

It charges 0.1 CPU_operator_cost, while reality seemed to be more like
6 CPU_operator_cost.

The assumption seems to be that this estimate doesn't need to be
accurate, because the cost estimate when the bitmap is *used* will
make up for it. But when ANDing together bitmaps of very unequal
size, that doesn't happen.

I've been wanting to revisit this in the 9.6 code base, but can't find
my code and notes, so this all from memory.

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Seamus Abshere 2016-02-22 17:43:46 Re: Why does query planner choose slower BitmapAnd ?
Previous Message Tom Lane 2016-02-22 17:25:28 Re: Why does query planner choose slower BitmapAnd ?