Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "Wood, Dan" <hexpert(at)amazon(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-committers(at)postgresql(dot)org, "Wong, Yi Wen" <yiwong(at)amazon(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Date: 2017-12-07 03:08:38
Message-ID: CAB7nPqSd6vRpo_0GutiQuEuUh==5V7nyMWn0iKDfpbCXMS1d_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On Thu, Dec 7, 2017 at 1:21 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Put together, I propose the attached delta for 0001.

I have been looking at Andres' 0001 and your tweaks here for some time
since yesterday...

I have also performed sanity checks using all the scripts that have
accumulated on my archives for this stuff. This looks solid to me. I
have not seen failures with broken hot chains, REINDEX, etc.

> This way, an xmax that has exactly the OldestXmin value will return
> RECENTLY_DEAD rather DEAD, which seems reasonable to me (since
> OldestXmin value itself is supposed to be still possibly visible to
> somebody). Also, this way it is consistent with the other comparison to
> OldestXmin at the bottom of the function. There is no reason for the
> "else" or the extra braces.

+1. It would be nice to add a comment in the patched portion
mentioning that the new code had better match what is at the bottom of
the function.

+ else if (!MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), false))
+ {
+ /*
+ * Not in Progress, Not Committed, so either Aborted or crashed.
+ * Mark the Xmax as invalid.
+ */
+ SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, InvalidTransactionId);
}

- /*
- * Not in Progress, Not Committed, so either Aborted or crashed.
- * Remove the Xmax.
- */
- SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, InvalidTransactionId);
return HEAPTUPLE_LIVE;

I would find cleaner if the last "else if" is put into its own
separate if condition, and that for a multixact still running this
refers to an updating transaction aborted so hint bits are not set.

> Your commit message does a poor job of acknowledging prior work on
> diagnosing the problem starting from Dan's initial test case and patch.

(Nit: I have extracted from the test case of Dan an isolation test,
which Andres has reduced to a subset of permutations. Peter G. also
complained about what is visibly the same bug we are discussing here
but without a test case.)
--
Michael

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Amit Khandekar 2017-12-07 07:02:46 Re: pgsql: Support Parallel Append plan nodes.
Previous Message Bossart, Nathan 2017-12-07 02:42:12 Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2017-12-07 03:20:44 Re: Postgres with pthread
Previous Message Bossart, Nathan 2017-12-07 02:42:12 Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i