Postgres using inefficient query plan when using OR filter, uses correct indices when using IN clause

From: Sabit Nepal <gta0004(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Postgres using inefficient query plan when using OR filter, uses correct indices when using IN clause
Date: 2019-05-02 17:14:31
Message-ID: CALa=TVpP4SFQHDDH7q2pkZ-c0oPGqBZKuqvmQ664231=fVdZTA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I have a query that is being ran on PGSQL, and when queried at a fast rate
for large data sets, it is taking a long time to run because it isn't
making use of the available indexes. I found that changing the filter from
multiple OR's to an IN clause causes the right index to be used. Is there a
way I can force the index to be used even when using OR's?

Query with Disjunction:

SELECT field1, field2,..., fieldN
> FROM table1 WHERE
> ((((field9='val1' OR field9='val2') OR field9='val3') OR field9='val4')
> AND (field6='val5'));

Query Plan:

"Bitmap Heap Scan on table1 (cost=18.85..19.88 rows=1 width=395) (actual
> time=0.017..0.017 rows=0 loops=1)"
> " Recheck Cond: (((field6)::text = 'val5'::text) AND (((field9)::text =
> 'val1'::text) OR ((field9)::text = 'val2'::text) OR ((field9)::text =
> 'val3'::text) OR ((field9)::text = 'val4'::text)))"
> " -> BitmapAnd (cost=18.85..18.85 rows=1 width=0) (actual
> time=0.016..0.016 rows=0 loops=1)"
> " -> Bitmap Index Scan on idx_field6_field9 (cost=0.00..9.01
> rows=611 width=0) (actual time=0.015..0.015 rows=0 loops=1)"
> " Index Cond: ((field6)::text = 'val5'::text)"
> " -> BitmapOr (cost=9.59..9.59 rows=516 width=0) (never executed)"
> " -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40
> rows=129 width=0) (never executed)"
> " Index Cond: ((field9)::text = 'val1'::text)"
> " -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40
> rows=129 width=0) (never executed)"
> " Index Cond: ((field9)::text = 'val2'::text)"
> " -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40
> rows=129 width=0) (never executed)"
> " Index Cond: ((field9)::text = 'val3'::text)"
> " -> Bitmap Index Scan on idx_id_field9 (cost=0.00..2.40
> rows=129 width=0) (never executed)"
> " Index Cond: ((field9)::text = 'val4'::text)"
> "Planning time: 0.177 ms"
> "Execution time: 0.061 ms"

Query with IN

SELECT field1, field2,..., fieldN
> FROM table1
> WHERE
> ((field9 IN ('val1', 'val2', 'val3', 'val4'))
> AND (field6='val5'));

Query Plan:

"Index Scan using idx_field6_field9 on table1 (cost=0.43..6.77 rows=1
> width=395) (actual time=0.032..0.032 rows=0 loops=1)"
> " Index Cond: (((field6)::text = 'val5'::text) AND ((field9)::text = ANY
> ('{val1,val2,val3,val4}'::text[])))"
> "Planning time: 0.145 ms"
> "Execution time: 0.055 ms"

There is an index on field 6 and field 9 which the second query uses as
expected, which the first one also should. Field9 is also kind of like a
state field, so its cardinality is extremely low - there's only like 9
different values across the whole table. Unfortunately, it isn't
straightforward to change the query to use an IN clause, so getting PG to
use the right plan would be ideal, even if it means using a different type
of index.

Thank you

Browse pgsql-general by date

  From Date Subject
Next Message rihad 2019-05-02 17:45:25 Re: Upgrading locale issues
Previous Message Melvin Davidson 2019-05-02 16:57:48 Re: Back Slash \ issue