Re: [RFC] [PATCH] Flexible "partition pruning" hook

From: Mike Palmiotto <mike(dot)palmiotto(at)crunchydata(dot)com>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] [PATCH] Flexible "partition pruning" hook
Date: 2019-02-26 18:06:31
Message-ID: CAMN686Ec3RM3HQVGz8Sk=KW4m29+mGRu81svrtCPB20hsHRzPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 26, 2019 at 1:55 AM Tsunakawa, Takayuki
<tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:
>
> From: Mike Palmiotto [mailto:mike(dot)palmiotto(at)crunchydata(dot)com]
> > Attached is a patch which attempts to solve a few problems:
> >
> > 1) Filtering out partitions flexibly based on the results of an external
> > function call (supplied by an extension).
> > 2) Filtering out partitions from pg_inherits based on the same function
> > call.
> > 3) Filtering out partitions from a partitioned table BEFORE the partition
> > is actually opened on-disk.
>
> What concrete problems would you expect this patch to solve? What kind of extensions do you imagine? I'd like to hear about the examples. For example, "PostgreSQL 12 will not be able to filter out enough partitions when planning/executing SELECT ... WHERE ... statement. But an extension like this can extract just one partition early."

My only application of the patch thus far has been to apply an RLS
policy driven by the extension's results. For example:

CREATE TABLE test.partpar
(
a int,
b text DEFAULT (extension_default_bfield('test.partpar'::regclass::oid))
) PARTITION BY LIST (extension_translate_bfield(b));

CREATE POLICY filter_select on test.partpar for SELECT
USING (extension_filter_by_bfield(b));

CREATE POLICY filter_select on test.partpar for INSERT
WITH CHECK (extension_generate_insert_bfield('test.partpar'::regclass::oid)
= b);

CREATE POLICY filter_update on test.partpar for UPDATE
USING (extension_filter_by_bfield(b))
WITH CHECK (extension_filter_by_bfield(b));

CREATE POLICY filter_delete on test.partpar for DELETE
USING (extension_filter_by_bfield(b));

The function would filter based on some external criteria relating to
the username and the contents of the b column.

The desired effect would be to have `SELECT * from test.partpar;`
return check only the partitions where username can see any row in the
table based on column b. This is applicable, for instance, when a
partition of test.partpar (say test.partpar_b2) is given a label with
`SECURITY LABEL on TABLE test.partpar_b2 IS 'foo';` which is exactly
the same as the b column for every row in said partition. Using this
hook, we can simply check the table label and kick the entire
partition out early on. This should greatly improve performance for
the case where you can enforce that the partition SECURITY LABEL and
the b column are the same.

>
> Would this help the following issues with PostgreSQL 12?
>
> * UPDATE/DELETE planning takes time in proportion to the number of partitions, even when the actually accessed partition during query execution is only one.
>
> * Making a generic plan takes prohibitably long time (IIRC, about 12 seconds when the number of partitoons is 1,000 or 8,000.)

In theory, we'd be checking fewer items (the labels of the partitions,
instead of the b column for every row), so it may indeed help with
performance in these cases.

Admittedly, I haven't looked at either of these very closely. Do you
have any specific test cases I can try out on my end to verify?
--
Mike Palmiotto
Software Engineer
Crunchy Data Solutions
https://crunchydata.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-02-26 18:25:29 A note about recent ecpg buildfarm failures
Previous Message Andres Freund 2019-02-26 17:56:57 Re: pgbench MAX_ARGS