Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Chris Wilson <chris(dot)wilson(at)cantabcapital(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction
Date: 2019-06-20 15:56:47
Message-ID: CANP8+jKwFR2=Jjb30XQxVZaDo4knP4XbmS26sZVZ8yW9iaFV+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, 20 Jun 2019 at 16:13, Chris Wilson <chris(dot)wilson(at)cantabcapital(dot)com>
wrote:

> Dear Postgres performance experts,
>
>
>
> I noticed that when I added a BRIN index to a very large table, attempting
> to make a particular query faster, it became much slower instead. While
> trying to understand this, I noticed that the actual number of rows in the
> EXPLAIN ANALYZE output was much higher than I expected. I was able to
> produce a repeatable test case for this. I’m not sure if this is actually a
> bug, or simply that the “number of rows” means something different than I
> expected.
>
>
>
> This reproducible test case is not especially slow, because I wanted to
> make it easy and fast to run and understand. Right now I’d just like to
> understand why it behaves this way.
>
>
>
> The SQL is to create the test case is:
>
>
>
> *drop* *table* brin_test;
>
> *create* *table* brin_test *AS* *SELECT* *generate_series* *as* id,
> *generate_series* % 100 *as* r *from* *generate_series*(1,100000);
>
> *create* *index* idx_brin_test_brin *on* brin_test *using* brin (id, r)
> *with* (pages_per_range = 32);
>

You've created the index on (id,r) rather than just (id)

> *vacuum* *analyze* brin_test;
>
>
>
> And here are two queries to compare:
>
>
>
> *explain* *analyze* *select* * *from* brin_test *where* id >= 90000;
>
> *explain* *analyze* *select* * *from* brin_test *where* id >= 90000 *and*
> r *in* (1,3);
>
>
>
> With the following results:
>
>
>
> testing=# explain analyze select * from brin_test where id >= 90000;
>
> QUERY PLAN
>
>
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Bitmap Heap Scan on brin_test (cost=8.55..630.13 rows=10146 width=8)
> (actual time=0.474..1.796 rows=10001 loops=1)
>
> Recheck Cond: (id >= 90000)
>
> Rows Removed by Index Recheck: 3215
>
> Heap Blocks: lossy=59
>
> -> Bitmap Index Scan on idx_brin_test_brin (cost=0.00..6.02
> rows=14286 width=0) (actual time=0.026..0.026 rows=640 loops=1)
>
> Index Cond: (id >= 90000)
>
> Planning Time: 0.155 ms
>
> Execution Time: 2.133 ms
>
> (8 rows)
>
>
>
> testing=# explain analyze select * from brin_test where id >= 90000 and r
> in (1,3);
>
> QUERY PLAN
>
>
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Bitmap Heap Scan on brin_test (cost=6.06..556.21 rows=219 width=8)
> (actual time=6.101..23.927 rows=200 loops=1)
>
> Recheck Cond: ((id >= 90000) AND (r = ANY ('{1,3}'::integer[])))
>
> Rows Removed by Index Recheck: 13016
>
> Heap Blocks: lossy=59
>
> -> Bitmap Index Scan on idx_brin_test_brin (cost=0.00..6.01 rows=7143
> width=0) (actual time=0.038..0.038 rows=1280 loops=1)
>
> Index Cond: ((id >= 90000) AND (r = ANY ('{1,3}'::integer[])))
>
> Planning Time: 0.071 ms
>
> Execution Time: 23.954 ms
>
> (8 rows)
>
>
>
> Note that introducing a disjunction (set of possible values) into the
> query doubles the number of actual rows returned, and increases the number
> removed by the index recheck.
>

Strange, yes.

> It looks to me as though perhaps the BRIN index does not completely
> support queries with a set of possible values, and executes the query
> multiple times (try adding more values of R to see what I mean).
>

That doesn't appear to be happening.

> The execution time also increases massively.
>
>
>
> Could anyone help me to understand what’s going on here, and whether
> there’s a bug or limitation of BRIN indexes? If it’s a limitation, then the
> query planner does not seem to account for it, and chooses this plan even
> when it’s a bad one (much worse than removing result rows using a filter).
>

The second column changes the way the index is defined. It appears there
is very little locality for the r column, so try removing it.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Solutions for the Enterprise

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Chris Wilson 2019-06-20 16:00:43 RE: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction
Previous Message Chris Wilson 2019-06-20 15:12:18 EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction