From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, amul sul <sulamul(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key |
Date: | 2018-04-02 15:26:38 |
Message-ID: | CA+TgmobX6dhR2wn4Gji9+vQ5Bvnx_TTTAwzJvvOPv-4k5xqiGA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Mar 28, 2018 at 2:12 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> How will it break it? They'll see an invalid ctid and conclude that the
> tuple is dead? Without any changes that's already something that can
> happen if a later tuple in the chain has been pruned away. Sure, that
> code won't realize it should error out because the tuple is now in a
> different partition, but neither would a infomask bit.
>
> I think my big problem is that I just don't see what the worst that can
> happen is. We'd potentially see a broken ctid chain, something that very
> commonly happens, and consider the tuple to be invisible. That seems
> pretty sane behaviour for unadapted code, and not any worse than other
> potential solutions.
This is more or less my feeling as well. I think it's better to
conserve our limited supply of infomask bits as much as we can, and I
do think that we should try to reclaimed HEAP_MOVED_IN and
HEAP_MOVED_OFF in the future instead of defining the combination of
the two of them to mean something now.
The only scenario in which I can see this patch really leading to
disaster is if there's some previous release out there where the bit
pattern chosen by this patch has some other, incompatible meaning. As
far as we know, that's not the case: this bit pattern was previously
unused. Code seeing that bit pattern could potentially therefore (1)
barf on the valid CTID, but the whole point of this is to throw an
ERROR anyway, so if that happens then we're getting basically the
right behavior with the wrong error message or (2) just treat it as a
broken CTID link, in which case the result should be pretty much the
same as if this patch hadn't been committed in the first place.
Where the multixact patch really caused us a lot of trouble is that
the implications weren't just for the heap itself -- the relevant
SLRUs became subject to new retention requirements which in turn
affected vacuum, autovacuum, and checkpoint behavior. There is no
similar problem here -- the flag indicating the problematic situation,
however it ends up being stored, doesn't point to any external data.
Now, that doesn't mean there can't be some other kind of problem with
this patch, but I don't think that we should block the patch on the
theory that it might have an undiscovered problem that destroys the
entire PostgreSQL ecosystem with no theory as to what that problem
might actually be. Modulo implementation quality, I think the risk
level of this patch is somewhat but not vastly higher than
37484ad2aacef5ec794f4dd3d5cf814475180a78, which similarly defined a
previously-unused bit pattern in the tuple header. The reason I think
this one might be somewhat riskier is because AFAICS it's not so easy
to make sure we've found all the code, even in core, that might care,
as it was in that case; and also because updates happen more than
freezing.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2018-04-02 15:39:57 | Re: check_ssl_key_file_permissions should be in be-secure-common.c |
Previous Message | Craig Ringer | 2018-04-02 15:07:17 | Re: Feature Request - DDL deployment with logical replication |