From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Proposal: Automatic partition creation |
Date: | 2020-07-06 14:59:47 |
Message-ID: | 20200706145947.GX4107@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jul 06, 2020 at 01:45:52PM +0300, Anastasia Lubennikova wrote:
> The previous discussion of automatic partition creation [1] has addressed
> static and dynamic creation of partitions and ended up with several syntax
> proposals.
...
> where partition_auto_create_clause is
>
> CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec
> - IMMEDIATE| DEFERRED is optional, DEFERRED is not implemented yet
> I wonder, is it worth placing a stub for dynamic partitioning, or we can
> rather add these keywords later.
I understand by "deferred" you mean that the partition isn't created at the
time CREATE TABLE is run but rather deferred until needed by INSERT.
For deferred, range partitioned tables, I think maybe what you'd want to
specify (and store) is the INTERVAL. If the table is partitioned by day, then
we'd date_trunc('day', time) and dynamically create that day. But if it was
partitioned by month, we'd create the month. I think you'd want to have an
ALTER command for that (we would use that to change tables between
daily/monthly based on their current size). That should also support setting
the MODULUS of a HASH partitioned table, to allow changing the size of its
partitions (currently, the user would have to more or less recreate the table
and move all its data into different partitions, but that's not ideal).
I don't know if it's important for anyone, but it would be interesting to think
about supporting sub-partitioning: partitions which are themselvese partitioned.
Like something => something_YYYY => something_YYYY_MM => something_YYYY_MM_DD.
You'd need to specify how to partition each layer of the heirarchy. In the
most general case, it could be different partition strategy.
If you have a callback function for partition renaming, I think you'd want to
pass it not just the current name of the partition, but also the "VALUES" used
in partition creation. Like (2020-04-05)TO(2020-05-06). Maybe instead, we'd
allow setting a "format" to use to construct the partition name. Like
"child.foo_bar_%Y_%m_%d". Ideally, the formats would be fixed-length
(zero-padded, etc), so failures with length can happen at "parse" time of the
statement and not at "run" time of the creation. You'd still have to handle
the case that the name already exists but isn't a partition (or is a partition
by doesn't handle the incoming tuple for some reason).
Also, maybe your "configuration" syntax would allow specifying other values.
Maybe including a retention period (as an INTERVAL for RANGE tables). That's
useful if you had a command to PRUNE the oldest partitions, like ALTER..PRUNE.
--
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2020-07-06 15:05:46 | Re: Ideas about a better API for postgres_fdw remote estimates |
Previous Message | Alvaro Herrera | 2020-07-06 14:56:53 | Re: Cache lookup errors with functions manipulation object addresses |