Re: Secondary index access optimizations

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Secondary index access optimizations
Date: 2017-09-05 08:10:38
Message-ID: f0f0df40-c90e-fcce-5ddb-c7b371f4920f@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 05.09.2017 04:02, Amit Langote wrote:
> Like Thomas, I'm not so sure about the whole predtest.c patch. The core
> logic in operator_predicate_proof() should be able to conclude that, say,
> k < 21 is implied by k <= 20, which you are trying to address with some
> special case code. If there is still problem you think need to be fixed
> here, a better place to look at would be somewhere around get_btree_test_op().

Frankly speaking I also do not like this part of my patch.
I will be pleased if you or somebody else can propose better solution.
I do not understand how get_btree_test_op() can help here.

Yes, k < 21 is implied by k <= 20. It is based on generic properties of
< and <= operators.
But I need to proof something different: having table partition
constraint (k < 21) I want to remove predicate (k <= 20) from query.
In other words, operator_predicate_proof() should be able to conclude
that (k <= 20) is implied by (k < 21).
But it is true only for integer types, not for floating point types. And
Postgres operator definition
doesn't provide some way to determine that user defined type is integer
type: has integer values for which such conclusion is true.

Why I think that it is important? Certainly, it is possible to rewrite
query as (k < 21) and no changes in operator_predicate_proof() are needed.
Assume the most natural use case: I have some positive integer key and I
wan to range partition table by such key, for example with interval 10000.
Currently standard PostgreSQL partitioning mechanism requires to specify
intervals with open high boundary.
So if I want first partition to contain interval [1,10000], second -
[10001,20001],... I have to create partitions in such way:

create table bt (k integer, v integer) partition by range (k);
create table dt1 partition of bt for values from (1) to (10001);
create table dt2 partition of bt for values from (10001) to (20001);
...

If I want to write query inspecting data of the particular partition,
then most likely I will use BETWEEN operator:

SELECT * FROM t WHERE k BETWEEN 1 and 10000;

But right now operator_predicate_proof() is not able to conclude that
predicate (k BETWEEN 1 and 10000) transformed to (k >= 1 AND k <= 10000)
is equivalent to (k >= 1 AND k < 10001)
which is used as partition constraint.

Another very popular use case (for example mentioned in PostgreSQL
documentation of partitioning:
https://www.postgresql.org/docs/10/static/ddl-partitioning.html)
is using date as partition key:

CREATE TABLE measurement (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
) PARTITION BY RANGE (logdate);

CREATE TABLE measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')

Assume that now I want to get measurements for March:

There are three ways to write this query:

select * from measurement where extract(month from logdate) = 3;
select * from measurement where logdate between '2006-03-01' AND
'2006-03-31';
select * from measurement where logdate >= '2006-03-01' AND logdate <
'2006-04-01';

Right now only for the last query optimal query plan will be constructed.
Unfortunately my patch is not covering date type.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2017-09-05 08:14:33 Re: show "aggressive" or not in autovacuum logs
Previous Message Amit Langote 2017-09-05 07:46:15 Re: Partition-wise join for join between (declaratively) partitioned tables