BUG #14711: Inconsistent statistics on expression with partial index

From: noskov(dot)d(at)gmail(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14711: Inconsistent statistics on expression with partial index
Date: 2017-06-18 12:57:51
Message-ID: 20170618125751.1544.95074@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14711
Logged by: Dzmitry Naskou
Email address: noskov(dot)d(at)gmail(dot)com
PostgreSQL version: 9.6.1
Operating system: Debian 6.2.0-10
Description:

[PostgreSQL 9.6.1 on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.2.0-10)
6.2.0 20161027, 64-bit]

I have a table with timestamp ranges:

create table testing.test as
select tsrange(d, null) ts from
generate_series(timestamp '2000-01-01', timestamp '2018-01-01', interval '1
minute') s(d);
I need to run the following query:

select *
from testing.test
where lower(ts)> '2017-06-17 20:00:00'::timestamp and upper_inf(ts)
Explain analyze result for table without indexes:

Seq Scan on test (cost=0.00..72482.26 rows=1052013 width=14) (actual
time=2165.477..2239.781 rows=283920 loops=1)
Filter: (upper_inf(ts) AND (lower(ts) > '2017-06-17 20:00:00'::timestamp
without time zone))
Rows Removed by Filter: 9184081
Planning time: 0.046 ms
Execution time: 2250.221 ms
Next I'm going to add a following partial index:

create index lower_rt_inf ON testing.test using btree(lower(ts)) where
upper_inf(ts);
analyze testing.test;
Explain analyze result for table with partial index:

Index Scan using lower_rt_inf on test (cost=0.04..10939.03 rows=1051995
width=14) (actual time=0.037..52.083 rows=283920 loops=1)
Index Cond: (lower(ts) > '2017-06-17 20:00:00'::timestamp without time
zone)
Planning time: 0.156 ms
Execution time: 62.900 ms

And:

SELECT null_frac, n_distinct, correlation FROM pg_catalog.pg_stats WHERE
tablename = 'lower_rt_inf'

null_frac |n_distinct |correlation |
----------|-----------|------------|
0 |-1 |1 |
Then I create an index similar to the previous one, but without partial
condition:

create index lower_rt_full ON testing.test using btree(lower(ts));
analyze testing.test;
And now the same index is used, but the cost/rows are different:

Index Scan using lower_rt_inf on test (cost=0.04..1053.87 rows=101256
width=14) (actual time=0.029..58.613 rows=283920 loops=1)
Index Cond: (lower(ts) > '2017-06-17 20:00:00'::timestamp without time
zone)
Planning time: 0.280 ms
Execution time: 71.794 ms

And a bit more: select * from testing.test where lower(ts)> '2017-06-17
20:00:00'::timestamp

Index Scan using lower_rt_full on test (cost=0.04..3159.52 rows=303767
width=14) (actual time=0.036..64.208 rows=283920 loops=1)
Index Cond: (lower(ts) > '2017-06-17 20:00:00'::timestamp without time
zone)
Planning time: 0.099 ms
Execution time: 78.759 ms

How can I effectively use partial indexes for expressions?

Browse pgsql-bugs by date

  From Date Subject
Next Message Andrew Gierth 2017-06-18 22:43:59 Re: BUG #14709: inconsistent answers with foreign data wrappers to mysql
Previous Message Peter Eisentraut 2017-06-17 12:56:06 Re: BUG #14710: Function now() and logical replication