Re: Inval reliability, especially for inplace updates

From: Noah Misch <noah(at)leadboat(dot)com>
To: Nitin Motiani <nitinmotiani(at)google(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inval reliability, especially for inplace updates
Date: 2024-10-24 02:53:57
Message-ID: 20241024025357.a8.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

With the releases wrapping in 2.5 weeks, I'm ambivalent about pushing this
before the release or after. Pushing before means fewer occurrences of
corruption, but pushing after gives more bake time to discover these changes
were defective. It's hard to predict which helps users more, on a
risk-adjusted basis. I'm leaning toward pushing this week. Opinions?

On Sun, Oct 20, 2024 at 06:41:37PM +0530, Nitin Motiani wrote:
> I tested the patch locally and it works. And I have no other question
> regarding the structure. So this patch looks good to me to commit.

Thanks. While resolving a back-branch merge conflict, I found
AtEOXact_Inval() and AtEOSubXact_Inval() were skipping inplaceInvalInfo tasks
if transInvalInfo==NULL. If one PreInplace_Inval() failed, the session's next
inplace update would get an assertion failure. Non-assert builds left
inplaceInvalInfo pointing to freed memory, but I didn't find a reachable
malfunction. Separately, the xact.c comment edit wasn't reflecting that v4
brought back the transactional inval. v6 fixes those. Regarding the
back-branch alternatives to the WAL format change:

On Tue, Jun 18, 2024 at 08:23:49AM -0700, Noah Misch wrote:
> On Mon, Jun 17, 2024 at 06:57:30PM -0700, Andres Freund wrote:
> > On 2024-06-17 16:58:54 -0700, Noah Misch wrote:
> > > On Sat, Jun 15, 2024 at 03:37:18PM -0700, Noah Misch wrote:
> > > > - heap_xlog_inplace() could set the shared-inval-queue overflow signal on
> > > > every backend. This is more wasteful, but inplace updates might be rare
> > > > enough (~once per VACUUM) to make it tolerable.
> >
> > We already set that surprisingly frequently, as
> > a) The size of the sinval queue is small
> > b) If a backend is busy, it does not process catchup interrupts
> > (i.e. executing queries, waiting for a lock prevents processing)
> > c) There's no deduplication of invals, we often end up sending the same inval
> > over and over.
> >
> > So I suspect this might not be too bad, compared to the current badness.
>
> That is good.

I benchmarked that by hacking 027_stream_regress.pl to run "pgbench
--no-vacuum --client=4 -T 30 -b select-only" on the standby, concurrent with
the primary running the regression tests. Standby clients acted on sinval
resetState ~16k times, and pgbench tps decreased 4.5%. That doesn't
necessarily mean the real-life cost would be unacceptable, but it was enough
of a decrease that I switched to the next choice:

> We might be able to do the overflow signal once at end of
> recovery, like RelationCacheInitFileRemove() does for the init file. That's
> mildly harder to reason about, but it would be cheaper. Hmmm.

I'm attaching the branch-specific patches for that and for the main fix.
Other notes from back-patching:

- All branches change the ABI of PrepareToInvalidateCacheTuple(), a function
catcache.c exports for the benefit of inval.c. No PGXN extension calls
that, and I can't think of a use case in extensions.

- Due to v15 commit 3aafc03, the patch for v14 differs to an unusual degree
from the patch for v15+v16. The difference is mostly mechanical, though.

Thanks,
nm

Attachment Content-Type Size
inplace155-backbranch-inval-v1_14.patch text/plain 5.1 KB
inplace155-backbranch-inval-v1_16.patch text/plain 5.1 KB
inplace155-backbranch-inval-v1_17.patch text/plain 4.6 KB
inplace160-inval-durability-inplace-v6_13.patch text/plain 30.4 KB
inplace160-inval-durability-inplace-v6_14.patch text/plain 30.9 KB
inplace160-inval-durability-inplace-v6_16.patch text/plain 35.2 KB
inplace160-inval-durability-inplace-v6_17.patch text/plain 35.9 KB
inplace160-inval-durability-inplace-v6.patch text/plain 41.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-10-24 04:17:39 Re: Refactor to use common function 'get_publications_str'.
Previous Message Tom Lane 2024-10-24 02:38:59 Re: Using Expanded Objects other than Arrays from plpgsql