Re: Recovering from detoast-related catcache invalidations

From: Noah Misch <noah(at)leadboat(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Xiaoran Wang <fanfuxiaoran(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Recovering from detoast-related catcache invalidations
Date: 2025-01-07 21:56:53
Message-ID: 20250107215653.82.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 24, 2024 at 12:18:09AM +0200, Heikki Linnakangas wrote:
> On 14/12/2024 02:06, Heikki Linnakangas wrote:
> > Ok, I missed that. It does not handle the 2nd scenario though: If a new
> > catalog tuple is concurrently inserted that should be part of the list,
> > it is missed.

> > Attached is an injection point test case to reproduce that. If you
> > change the test so that the function's body is shorter, so that it's not
> > toasted, the test passes.

Nice discovery.

> I'm thinking of the attached to fix this. It changes the strategy for
> detecting concurrent cache invalidations. Instead of the "recheck" mechanism
> that was introduced in commit ad98fb1422, keep a stack of "build
> in-progress" entries, and CatCacheInvalidate() invalidate those
> "in-progress" entries in addition to the actual CatCTup and CatCList
> entries.
>
> My first attempt was to insert the CatCTup or CatCList entry to the catcache
> before starting to build it, marked with a flag to indicate that the entry
> isn't fully built yet. But when I started to write that it got pretty
> invasive, and it felt easier to add another structure to hold the
> in-progress entries instead.

That's similar to how relcache has been doing it (in_progress_list). I see no
problem applying that technique here.

> not sure I want to
> commit the test with the injection point, but it's useful now to demonstrate
> the bug.)

I'd err on the side of including it. Apart from some copied comments, the
test looks ready.

On Wed, Dec 25, 2024 at 11:27:38AM +0200, Heikki Linnakangas wrote:
> +++ b/src/test/modules/test_misc/t/007_bugs.pl

test_misc/t/007_bugs.pl could be home to almost anything. How about naming it
007_catcache_inval.pl?

> @@ -744,6 +770,13 @@ ResetCatalogCache(CatCache *cache)
> #endif
> }
> }
> +
> + /* Also invalidate any entries that are being built */
> + for (CatCCreating *e = catcache_creating_stack; e != NULL; e = e->next)
> + {
> + if (e->cache == cache)
> + e->dead = true;
> + }
> }

With debug_discard_caches=1, "make check" hangs early with some INSERT using
100% CPU. The new test file does likewise. I bet this needs a special case
to short-circuit debug_discard_caches, like RelationCacheInvalidate() has.

> @@ -1665,6 +1698,8 @@ SearchCatCacheList(CatCache *cache,
> HeapTuple ntp;
> MemoryContext oldcxt;
> int i;
> + volatile CatCCreating creating_list;

You could drop the volatile by copying catcache_creating_stack to an
additional var that you reference after longjmp, instead of referencing
creating_list.next after longjmp. Likewise for the instance of volatile in
CatalogCacheCreateEntry().
https://pubs.opengroup.org/onlinepubs/9799919799/functions/longjmp.html has
the rules. Using volatile is fine, though.

> + bool first_iter = true;

This is okay as non-volatile, but it could just as easily move inside the
PG_TRY.

> @@ -2076,38 +2118,33 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, SysScanDesc scandesc,

> + PG_TRY();
> {
> - matches = equalTuple(before, ntp);
> - heap_freetuple(before);
> + dtp = toast_flatten_tuple(ntp, cache->cc_tupdesc);

gcc 4.8.5 warns:

catcache.c: In function ‘CatalogCacheCreateEntry’:
catcache.c:2159:29: warning: ‘dtp’ may be used uninitialized in this function [-Wmaybe-uninitialized]
ct->tuple.t_tableOid = dtp->t_tableOid;

I remember some commit of a gcc 4.x warning fix in a recent year, and we do
have buildfarm representation. Consider silencing it.

The rest looks good. Thanks.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ilia Evdokimov 2025-01-07 22:07:57 Re: Sample rate added to pg_stat_statements
Previous Message Nathan Bossart 2025-01-07 21:36:37 Re: Small patch to use pqMsg_Query instead of `Q`