From: | Philippe Pepiot <phil(at)philpep(dot)org> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Range partitioning query performance with date_trunc (vs timescaledb) |
Date: | 2023-09-11 12:21:30 |
Message-ID: | ZP8GSpzTAbXZql4z@bezout.in.philpep.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 29/08/2023, David Rowley wrote:
> On Tue, 29 Aug 2023 at 19:40, Philippe Pepiot <phil(at)philpep(dot)org> wrote:
> > I'm trying to implement some range partitioning on timeseries data. But it
> > looks some queries involving date_trunc() doesn't make use of partitioning.
> >
> > BEGIN;
> > CREATE TABLE test (
> > time TIMESTAMP WITHOUT TIME ZONE NOT NULL,
> > value FLOAT NOT NULL
> > ) PARTITION BY RANGE (time);
> > CREATE INDEX test_time_idx ON test(time DESC);
> > CREATE TABLE test_y2010 PARTITION OF test FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');
> > CREATE TABLE test_y2011 PARTITION OF test FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');
> > CREATE VIEW vtest AS SELECT DATE_TRUNC('year', time) AS time, SUM(value) AS value FROM test GROUP BY 1;
> > EXPLAIN (COSTS OFF) SELECT * FROM vtest WHERE time >= TIMESTAMP '2021-01-01';
> > ROLLBACK;
> >
> > The plan query all partitions:
>
> > I wonder if there is a way with a reasonable amount of SQL code to achieve this
> > with vanilla postgres ?
>
> The only options I see for you are
>
> 1) partition by LIST(date_Trunc('year', time)), or;
> 2) use a set-returning function instead of a view and pass the date
> range you want to select from the underlying table via parameters.
>
> I imagine you won't want to do #1. However, it would at least also
> allow the aggregation to be performed before the Append if you SET
> enable_partitionwise_aggregate=1.
>
> #2 isn't as flexible as a view as you'd have to create another
> function or expand the parameters of the existing one if you want to
> add items to the WHERE clause.
>
> Unfortunately, date_trunc is just a black box to partition pruning, so
> it's not able to determine that DATE_TRUNC('year', time) >=
> '2021-01-01' is the same as time >= '2021-01-01'. It would be
> possible to make PostgreSQL do that, but that's a core code change,
> not something that you can do from SQL.
Ok I think I'll go for Set-returning function since
LIST or RANGE on (date_trunc('year', time)) will break advantage of
partitioning when querying with "time betwen x and y".
Thanks!
From | Date | Subject | |
---|---|---|---|
Next Message | Frits Hoogland | 2023-09-11 13:54:59 | Re: Planning time is time-consuming |
Previous Message | David Rowley | 2023-09-11 10:17:09 | Re: Planning time is time-consuming |