Re: multicolumn partitioning help

From: James Robertson <james(at)jsrobertson(dot)net>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: multicolumn partitioning help
Date: 2023-03-14 23:33:11
Message-ID: CAOA7M6u20MTMsRiWeKdtHUW4d15iEBiPN9GRBcZXyayFVhnBdQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Laurenz, Justin,
Thank you both for thinking of this problem.

Laurenz your solution is how I thought I would work around my (lack of)
understanding of partitioning. (nested partitions).
I was hesitant because I didn't know what sort of performance problems I
would create for myself.

If we have true multi-column don't we get the benefit of:

TopLevelTable
|
|----> worker-thread 1
|
|----> worker-thread 2
|
|----> worker-thread n

Doesn't that give me more performance than:

TopLevelTable
|
|----> worker-thread 1
........|----> sub-table 1.1
........|----> sub-table 1.2
........|----> sub-table 1.n
|
|----> worker-thread 2
........|----> sub-table 2.1
........|----> sub-table 2.2
........|----> sub-table 2.n

or do we get?

TopLevelTable
|
|----> worker-thread 1 (default catch)
........|----> worker thread 2 -> sub-table 1.1
........|----> worker thread 3 -> sub-table 1.2
........|----> worker thread 4 -> sub-table 1.n
|
|----> worker-thread 5 (default catch)
........|----> worker thread 6 -> sub-table 2.1
........|----> worker thread 7 -> sub-table 2.2
........|----> worker thread 8 -> sub-table 2.n

Summary:
1) if we create nested partitions, do we create performance issues:
2) if nested partitions are the solutions, what is the point of
multi-column partitioning?

wish list) wouldn't it be neat if we can do mult-mode multi-column? like
PARTITION BY RANGE (EXTRACT(YEAR FROM dob)) LIST (SUBSTRING(hash, 1, 1));

On Tue, Mar 14, 2023 at 5:41 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
wrote:

> On Sun, 2023-03-12 at 13:59 -0400, James Robertson wrote:
> > I am having issues with multicolumn partitioning. For reference I am
> using the following link as my guide:
> > https://www.postgresql.org/docs/devel/sql-createtable.html
> >
> > To demonstrate my problem, I created a simple table called humans. I
> want to partition by the year
> > of the human birth and then the first character of the hash. So for each
> year I'll have year*16 partitions. (hex)
> >
> > CREATE TABLE humans (
> > hash bytea,
> > fname text,
> > dob date
> > )PARTITION BY RANGE (EXTRACT(YEAR FROM dob),substring(hash::text, 1,
> 1));
> >
> > Reading the documentation: "When creating a range partition, the lower
> bound specified with
> > FROM is an inclusive bound, whereas the upper bound specified with TO is
> an exclusive bound".
> >
> > However I can't insert any of the following after the first one, because
> it says it overlaps.
> > Do I need to do anything different when defining multi-column partitions?
> >
> >
> > This works:
> > CREATE TABLE humans_1968_0 PARTITION OF humans FOR VALUES FROM (1968,
> '0') TO (1969, '1');
> >
> >
> > These fail:
> > CREATE TABLE humans_1968_1 PARTITION OF humans FOR VALUES FROM (1968,
> '1') TO (1969, '2');
>
> Justin has explained what the problem is, let me supply a solution.
>
> I think you want subpartitioning, like
>
> CREATE TABLE humans (
> hash bytea,
> fname text,
> dob date
> ) PARTITION BY LIST (EXTRACT (YEAR FROM dob));
>
> CREATE TABLE humans_2002
> PARTITION OF humans FOR VALUES IN (2002)
> PARTITION BY HASH (hash);
>
> CREATE TABLE humans_2002_0
> PARTITION OF humans_2002 FOR VALUES WITH (MODULUS 26, REMAINDER 0);
>
> [...]
>
> CREATE TABLE humans_2002_25
> PARTITION OF humans_2002 FOR VALUES WITH (MODULUS 26, REMAINDER 25);
>
> and so on for the other years.
>
> Yours,
> Laurenz Albe
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Laurenz Albe 2023-03-15 07:37:39 Re: multicolumn partitioning help
Previous Message Laurenz Albe 2023-03-14 21:41:32 Re: multicolumn partitioning help