Re: Too many BitmapAnds in the wild

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Seamus Abshere <seamus(at)abshere(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Too many BitmapAnds in the wild
Date: 2018-09-17 22:11:52
Message-ID: 28692.1537222312@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Seamus Abshere <seamus(at)abshere(dot)net> writes:
> hey,
> We see a fair number of incorrectly chosen BitmapAnd plans in the wild at Faraday... enough that googling the problem ends up at our old posts to this mailing list 😀. An attractive solution was proposed by Jeff Janes [1]

> - *cost += 0.1 * cpu_operator_cost * path->rows;
> + *cost += 6 * cpu_operator_cost * path->rows;

> It appears this constant hasn't been changed for 7 years [2].

AFAICT you are pointing at this:

cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
{
if (IsA(path, IndexPath))
{
*cost = ((IndexPath *) path)->indextotalcost;
*selec = ((IndexPath *) path)->indexselectivity;

/*
* Charge a small amount per retrieved tuple to reflect the costs of
* manipulating the bitmap. This is mostly to make sure that a bitmap
* scan doesn't look to be the same cost as an indexscan to retrieve a
* single tuple.
*/
*cost += 0.1 * cpu_operator_cost * path->rows;
}

which does not seem like the place to be putting your thumb on the scale
if you want to discourage bitmap ANDs. That would increase the estimate
for plain bitmap scans as well as ANDs. Moreover, there's no plausible
reasoning for this adjustment being more than a minimal one.

There's a different fudge factor in cost_bitmap_and_node (also in
cost_bitmap_or_node) that would probably be more plausible to twiddle:

* The runtime cost of the BitmapAnd itself is estimated at 100x
* cpu_operator_cost for each tbm_intersect needed. Probably too small,
* definitely too simplistic?

if (l != list_head(path->bitmapquals))
totalCost += 100.0 * cpu_operator_cost;

I'd be the first to say that that number has no experimental basis,
plus modeling it as a quasi-constant is theoretically wrong; surely
it ought to vary depending on how big we think the bitmap might be.

I'm not, however, very enamored of just replacing the "100" with some
other random constant without any evidence to back up the change.
Let's see some test cases, at least.

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Michael Paquier 2018-09-18 00:33:07 Re: scram-sha-256 authentication broken in FIPS mode
Previous Message marcelo 2018-09-17 20:42:23 Re: Logical locking beyond pg_advisory