Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Date: 2024-08-22 17:25:07
Message-ID: CA+TgmoY0=bT_xBP8csR=MFE=FxGE2n2-me2-31jBOgEcLvW7ug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 22, 2024 at 12:43 PM Alexander Korotkov
<aekorotkov(at)gmail(dot)com> wrote:
> Thank you for your feedback. Yes, it seems that there is not enough
> time to even carefully analyze all the issues in these features. The
> rule of thumb I can get from this experience is "think multiple times
> before accessing something already opened by its name". I'm going to
> revert these features during next couple days.

Thanks, and sorry about that. I would say even "think multiple times"
is possibly not strong enough -- it might almost be "just don't ever
do it". Even if (in some particular case) the invalidation mechanism
seems to protect you from getting wrong answers, there are often holes
in that, specifically around search_path = foo, bar and you're
operating on an object in schema bar and an identically-named object
is created in schema foo at just the wrong time. Sometimes there are
problems even when search_path is not involved, but when it is, there
are more.

Here, aside from the name lookup issues, there are also problems with
expression evaluation: we can't split partitions without reindexing
rows that those partitions contain, and it is critical to think
through which is going to do the evaluation and make sure it's
properly sandboxed. I think we might need
SECURITY_RESTRICTED_OPERATION here.

Another thing I want to highlight if you do have another go at this
patch is that it's really critical to think about where every single
property of the newly-created tables comes from. The original patch
didn't consider relpersistence or tableam, and here I just discovered
that owner is also an issue that probably needs more consideration,
but it goes way beyond that. For example, I was surprised to discover
that if I put per-partition constraints or triggers on a partition and
then split it, they were not duplicated to the new partitions. Now,
maybe that's actually the behavior we want -- I'm not 100% positive --
but it sure wasn't what I was expecting. If we did duplicate them when
splitting, then what's supposed to happen when merging occurs? That is
not at all obvious, at least to me, but it needs careful thought. ACLs
and rules and default values and foreign keys (both outbond and
inbound) all need to be considered too, along with 27 other things
that I'm sure I'm not thinking about right now. Some of this behavior
should probably be explicitly documented, but all of it should be
considered carefully enough before commit to avoid surprises later. I
say that both from a security point of view and also just from a user
experience point of view. Even if things aren't insecure, they can
still be annoying, but it's not uncommon in cases like this for
annoying things to turn out to also be insecure.

Finally, if you do revisit this, I believe it would be a good idea to
think a bit harder about how data is moved around. My impression (and
please correct me if I am mistaken) is that currently, any split or
merge operation rewrites all the data in the source partition(s). If a
large partition is being split nearly equally, I think that has a good
chance of being optimal, but I think that might be the only case. If
we're merging partitions, wouldn't it be better to adjust the
constraints on the first partition -- or perhaps the largest partition
if we want to be clever -- and insert the data from all of the others
into it? Maybe that would even have syntax that puts the user in
control of which partition survives, e.g. ALTER TABLE tab1 MERGE
PARTITION part1 WITH part2, part3, .... That would also make it really
obvious to the user what all of the properties of part1 will be after
the merge: they will be exactly the same as they were before the
merge, except that the partition constraint will have been adjusted.
You basically dodge everything in the previous paragraph in one shot,
and it seems like it would also be faster. Splitting there's no
similar get-out-of-jail free card, at least not that I can see. Even
if you add syntax that splits a partition by using INSERT/DELETE to
move some rows to a newly-created partition, you still have to make at
least one new partition. But possibly that syntax is worth having
anyway, because it would be a lot quicker in the case of a highly
asymmetric split. On the other hand, maybe even splits are much more
likely and we don't really need it. I don't know.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-08-22 17:27:31 Re: Detailed release notes
Previous Message Jacob Champion 2024-08-22 17:22:47 Re: Feature-test macros for new-in-v17 libpq features