Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Chris Wilson <chris(dot)wilson(at)cantabcapital(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction
Date: 2019-06-20 16:30:44
Message-ID: 20190620163044.GU16019@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Jun 20, 2019 at 05:18:33PM +0100, Simon Riggs wrote:
> On Thu, 20 Jun 2019 at 17:01, Chris Wilson <chris(dot)wilson(at)cantabcapital(dot)com>
> wrote:
>
>
> > I deliberately included r in the index, to demonstrate the issue that I’m
> > seeing. I know that there is very little locality in this particular,
> > dummy, arbitrary test case. I can try to produce a test case that has some
> > locality, but I expect it to show exactly the same results, i.e. that the
> > BRIN index performs much worse when we try to query on this column as well.
> >
>
> I'm suggesting that adding the second column to the index is the source of
> your problem, not adding the column to the query.

But it *is* odd that the index returns more rows with a strictly tighter
conditions, right ?

Note, it's not an issue of rowcount estimate being confused by redundant
conditions, but real rowcount, and it returns more rows even when the
conditions are duplicative. Compare:

postgres=# explain analyze select * from brin_test where id >= 90000 and r in (1);
...
-> Bitmap Index Scan on brin_test_id_r_idx (cost=0.00..12.03 rows=28125 width=0) (actual time=0.136..0.137 rows=37120 loops=1)
Index Cond: ((id >= 90000) AND (r = 1))

postgres=# explain analyze select * from brin_test where id >= 90000 and r in (1,1);
...
-> Bitmap Index Scan on brin_test_id_r_idx (cost=0.00..12.03 rows=28125 width=0) (actual time=0.263..0.263 rows=74240 loops=1)
Index Cond: ((id >= 90000) AND (r = ANY ('{1,1}'::integer[])))

postgres=# explain analyze select * from brin_test where id >= 90000 and r in (1,1,1);
...
-> Bitmap Index Scan on brin_test_id_r_idx (cost=0.00..12.03 rows=28125 width=0) (actual time=0.387..0.387 rows=111360 loops=1)
Index Cond: ((id >= 90000) AND (r = ANY ('{1,1,1}'::integer[])))

Note, the docs say:
https://www.postgresql.org/docs/devel/indexes-multicolumn.html
|A multicolumn BRIN index can be used with query conditions that involve any
|subset of the index's columns. Like GIN and unlike B-tree or GiST, index search
|effectiveness is the same regardless of which index column(s) the query
|conditions use. The only reason to have multiple BRIN indexes instead of one
|multicolumn BRIN index on a single table is to have a different pages_per_range
|storage parameter.

Justin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Simon Riggs 2019-06-20 16:51:26 Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction
Previous Message Simon Riggs 2019-06-20 16:18:33 Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction