From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Jeremy Schneider <schnjere(at)amazon(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code) |
Date: | 2020-08-26 19:15:51 |
Message-ID: | 20200826191551.GA19705@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2020-Aug-20, Jeremy Schneider wrote:
> While working with Nathan Bossart on an extension, we came across an
> interesting quirk and possible inconsistency in the PostgreSQL code
> around infomask flags. I'd like to know if there's something I'm
> misunderstanding here or if this really is a correctness/robustness gap
> in the code.
>
> At the root of it is the relationship between the XMAX_LOCK_ONLY and
> XMAX_COMMITTED infomask bits.
I spent a lot of time wondering about XMAX_COMMITTED. It seemed to me
that it would make no sense to have xacts that were lock-only yet have
XMAX_COMMITTED set; so I looked hard to make sure no place would ever
cause such a situation to arise. However, I still made my best effort
to make the code cope with such a combination correctly if it ever
showed up.
And I may have missed assumptions such as this one, even if they seem
obvious in retrospect, such as in compute_new_xmax_infomask (which, as
you'll notice, changed considerably from what was initially committed):
> But then there's one place in heapam.c where an assumption appears that
> this state will never happen - the compute_new_xmax_infomask() function:
>
> https://github.com/postgres/postgres/blob/9168793d7275b4b318c153d607fba55d14098c19/src/backend/access/heap/heapam.c#L4800
>
> else if (old_infomask & HEAP_XMAX_COMMITTED)
> {
> ...
> if (old_infomask2 & HEAP_KEYS_UPDATED)
> status = MultiXactStatusUpdate;
> else
> status = MultiXactStatusNoKeyUpdate;
> new_status = get_mxact_status_for_lock(mode, is_update);
> ...
> new_xmax = MultiXactIdCreate(xmax, status, add_to_xmax, new_status);
>
> When that code sees XMAX_COMMITTED, it assumes the xmax can't possibly
> be LOCK_ONLY and it sets the status to something sufficiently high
> enough to guarantee that ISUPDATE_from_mxstatus() returns true. That
> means that when you try to update this tuple and
> compute_new_xmax_infomask() is called with an "update" status, two
> "update" statuses are sent to MultiXactIdCreate() which then fails
> (thank goodness) with the error "new multixact has more than one
> updating member".
>
> https://github.com/postgres/postgres/blob/cd5e82256de5895595cdd99ecb03aea15b346f71/src/backend/access/transam/multixact.c#L784
Have you ever observed this error case hit? I've never seen a report of
that.
> The UpdateXmaxHintBits() code to always set the INVALID bit wasn't in
> any patches on the mailing list but it was committed and it seems to
> have worked very well. As a result it seems nearly impossible to get
> into the state where you have both XMAX_LOCK_ONLY and XMAX_COMMITTED
> bits set; thus it seems we've avoided problems in
> compute_new_xmax_infomask().
Phew.
(I guess I should fully expect that somebody, given sufficient time,
would carefully inspect the committed code against submitted patches ...
especially given that I do such inspections myself.)
> But nonetheless it seems worth making the code more robust by having the
> compute_new_xmax_infomask() function handle this state correctly just as
> the visibility code does. It should only require a simple patch like
> this (credit to Nathan Bossart):
>
> diff --git a/src/backend/access/heap/heapam.c
> b/src/backend/access/heap/heapam.c
> index d881f4cd46..371e3e4f61 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -4695,7 +4695,9 @@ compute_new_xmax_infomask(TransactionId xmax,
> uint16 old_infomask,
> l5:
> new_infomask = 0;
> new_infomask2 = 0;
> - if (old_infomask & HEAP_XMAX_INVALID)
> + if (old_infomask & HEAP_XMAX_INVALID ||
> + (old_infomask & HEAP_XMAX_COMMITTED &&
> + HEAP_XMAX_IS_LOCKED_ONLY(old_infomask)))
> {
> /*
> * No previous locker; we just insert our own TransactionId.
We could do this in stable branches, if there were any reports that
this inconsistency is happening in real world databases.
> Alternatively, if we don't want to take this approach, then I'd argue
> that we should update README.tuplock to explicitly state that
> XMAX_LOCK_ONLY and XMAX_COMMITTED are incompatible (just as it already
> states for HEAP_XMAX_IS_MULTI and HEAP_XMAX_COMMITTED) and clean up the
> code in heapam_visibility.c for consistency.
Yeah, I like this approach better for the master branch; not just clean
up as in remove the cases that handle it, but also actively elog(ERROR)
if the condition ever occurs (hopefully with some known way to fix the
problem; maybe by "WITH tup AS (DELETE FROM tab WHERE .. RETURNING *)
INSERT * INTO tab FROM tup" or similar.)
> Might be worth adding a note to README.tuplock either way about
> valid/invalid combinations of infomask flags. Might help avoid some
> confusion as people approach the code base.
Absolutely.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2020-08-26 20:12:22 | Re: Issue with past commit: Allow fractional input values for integer GUCs ... |
Previous Message | Alvaro Herrera | 2020-08-26 18:15:32 | Re: Handing off SLRU fsyncs to the checkpointer |