Re: Partitioning, Identity and Uniqueness (given pg 16 changes)

From: Ketan Popat <ketanpostgres(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Darryl Green <darryl(dot)green(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Partitioning, Identity and Uniqueness (given pg 16 changes)
Date: 2024-02-19 08:08:10
Message-ID: CAKx+mvZf9xsr1EtdtRm=wiY5vpb3tZMyw9yyemiQw9_=4UvpDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>
> 1) Is my attempt to improve performance of insert AND (as the number of
> sources goes up, as it has) querying which is invariably by id but it is
> easy to include "and src = x" in the majority of usage (the other common
> case is to query across all sources, asking for a set is unusual) a
> reasonable approach?
>
> 2) It would be nice to be able to specify the id as pk on the table being
> partitioned (as it was in the non-partitioned definition of the table) once
> to document and enforce that the partitions simply inherit the id pk. This
> would seem only to need the "partition by" validation to allow a column not
> mentioned in partition by clause to be defined as pk or unique if and only
> if the pk/unique column is an identity column. Not a big deal but is this
> practical/valid?
>
> One of the potential solutions could be to include src into your
partitioning key as follows.

create table tpart (id int generated always as identity not null, src int,
data varchar, primary key (id, src)) partition by hash(src) ;
create table tpart_p001 partition of ttest_part for values with (modulus
4, remainder 0);
...
create table tpart_p032 partition of ttest_part for values with (modulus
32, remainder 31);

App side changes required:

- Ensure all queries come with src as a filter, otherwise it could
backfire performance due to partition pruning overhead.

Here is how it could help:

- "generated always as identity" for id would ensure unique id and src
can be used for hash partitioning.
- insert performance may improve if you are seeing resource contention
related wait events during insert.
- it would greatly help with the access pattern querying time series for
single src

Thanks,
Ketan

On Sun, Feb 18, 2024 at 8:23 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Mon, 19 Feb 2024 at 16:32, Darryl Green <darryl(dot)green(at)gmail(dot)com> wrote:
> > 2) It would be nice to be able to specify the id as pk on the table
> being partitioned (as it was in the non-partitioned definition of the
> table) once to document and enforce that the partitions simply inherit the
> id pk. This would seem only to need the "partition by" validation to allow
> a column not mentioned in partition by clause to be defined as pk or unique
> if and only if the pk/unique column is an identity column. Not a big deal
> but is this practical/valid?
>
> Unfortunately, it's not as easy as you think. It's not a matter of
> dropping the check that requires all PRIMARY KEY columns are present
> in the PARTITION BY clause. For this to work a *single* index (i.e.
> non-partitioned index) would have to index all partitions. Otherwise,
> how would Postgres ensure that the value being inserted doesn't exist
> in some other partition?
>
> The problem with a single index is that it kinda defeats the purpose
> of partitioning, i.e., "my table is large and I want to split it up".
> Operations such as DETACH PARTITION would have to become more than
> just a metadata operation when you consider having to trawl through
> the index and remove all records belonging to a single partition.
>
> It may be possible to still have it work by doing a speculative record
> in the index for the target table then go and check all of the other
> indexes before marking the speculative entry as valid. I think it
> would be very tricky to make it work well, however. Imagine how
> expensive ATTACH PARTITION would be! There are probably other race
> conditions I've not thought about too. Likely, we'd get more
> complaints about this being a terrible feature than we do due to the
> fact that it's unsupported.
>
> David
>
>
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Darryl Green 2024-02-19 09:07:44 Re: Partitioning, Identity and Uniqueness (given pg 16 changes)
Previous Message Peter Eisentraut 2024-02-19 07:31:35 Re: Partitioning, Identity and Uniqueness (given pg 16 changes)