From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Noah Misch <noah(at)leadboat(dot)com>, Xiaoran Wang <fanfuxiaoran(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Recovering from detoast-related catcache invalidations |
Date: | 2024-12-14 00:06:53 |
Message-ID: | 2234dc98-06fe-42ed-b5db-ac17384dc880@iki.fi |
Lists: | pgsql-hackers |
On 13/12/2024 17:30, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
>> CatalogCacheCreateEntry() can accept catcache invalidations when it
>> opens the toast table, and it now has recheck logic to detect the case
>> that the tuple it's processing (ntp) is invalidated. However, isn't it
>> also possible that it accepts an invalidation message for a tuple that
>> we had processed in an earlier iteration of the loop? Or that a new
>> catalog tuple was inserted that should be part of the list we're building?
>
> The expectation is that the list will be built and returned to the
> caller, but it's already marked as stale so it will be rebuilt
> on next request.
Ah, you mean this at the end:
> /* mark list dead if any members already dead */
> if (ct->dead)
>     cl->dead = true;
OK, I missed that. It does not handle the second scenario, though: if a
new catalog tuple that should be part of the list is inserted
concurrently, it is missed.
I was able to reproduce that by pausing a process with gdb while it's
building the list in SearchCatCacheList():
1. Create a function called foofunc(integer). It must be large so that
its pg_proc tuple is toasted.
2. In one backend, run "SELECT foofunc(1)". It calls
FuncnameGetCandidates(), which calls
"SearchSysCacheList1(PROCNAMEARGSNSP, CStringGetDatum(funcname));". Set
a breakpoint in SearchCatCacheList() just after the systable_beginscan() call.
3. In another backend, create function foofunc() with no args.
4. Continue execution from the breakpoint.
5. Run "SELECT foofunc()" in the first session. It fails to find the
function. The error persists: it keeps failing to find that function on
every retry, until the syscache is invalidated again for some reason.
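
The sequence above can be modelled with a toy, single-slot list cache. This is only a sketch of the race under simplifying assumptions, not PostgreSQL code: every `Toy*` type and `toy_*` function is hypothetical, the "scan" simply sees the rows that existed when it began, and an invalidation can only flag a list that is already cached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

/* Toy stand-ins; none of these are PostgreSQL's real structures. */
typedef struct { const char *name; int nargs; bool toasted; } ToyRow;
typedef struct { ToyRow rows[TOY_MAX]; int nrows; } ToyCatalog;
typedef struct { int nargs[TOY_MAX]; int n; bool dead; } ToyList;
typedef struct { ToyCatalog cat; ToyList list; bool has_list; } ToyBackend;

/* Runs while a toasted row is being processed, i.e. at the point
 * where the list builder can accept invalidation messages. */
typedef void (*ToyHook)(ToyBackend *b);

static void toy_insert(ToyCatalog *cat, const char *name, int nargs, bool toasted)
{
    assert(cat->nrows < TOY_MAX);
    cat->rows[cat->nrows++] = (ToyRow){name, nargs, toasted};
}

/* An invalidation can only flag a list that is already in the cache. */
static void toy_invalidate(ToyBackend *b)
{
    if (b->has_list)
        b->list.dead = true;
}

/* Models the second session: commit foofunc(), then deliver the invalidation. */
static void toy_concurrent_create(ToyBackend *b)
{
    toy_insert(&b->cat, "foofunc", 0, false);
    toy_invalidate(b);          /* no effect: the list under construction is not cached yet */
}

/* Return the cached list if it looks valid, otherwise rebuild it. */
static const ToyList *toy_search_list(ToyBackend *b, const char *name, ToyHook on_detoast)
{
    if (b->has_list && !b->list.dead)
        return &b->list;

    ToyList list = {.n = 0, .dead = false};
    int visible = b->cat.nrows; /* the scan only sees rows present when it began */

    b->has_list = false;
    for (int i = 0; i < visible; i++)
    {
        const ToyRow *row = &b->cat.rows[i];

        if (strcmp(row->name, name) != 0)
            continue;
        if (row->toasted && on_detoast)
            on_detoast(b);      /* window in which the other session commits */
        list.nargs[list.n++] = row->nargs;
    }
    b->list = list;
    b->has_list = true;
    return &b->list;
}

/* Does the list contain a candidate with the given argument count? */
static bool toy_list_has(const ToyList *l, int nargs)
{
    for (int i = 0; i < l->n; i++)
        if (l->nargs[i] == nargs)
            return true;
    return false;
}
```

To walk through it: insert a toasted `foofunc` row with one argument, then call `toy_search_list()` with `toy_concurrent_create` as the hook. The returned list is not marked dead yet lacks the zero-argument entry, and later searches keep reusing it until `toy_invalidate()` is called again, at which point the rebuild finds both functions.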
Attached is an injection point test case to reproduce that. If you
change the test so that the function's body is shorter, so that it's not
toasted, the test passes.
--
Heikki Linnakangas
Neon (https://neon.tech)
Attachment | Content-Type | Size |
---|---|---|
0001-Demonstrate-catcache-list-invalidation-bug.patch | text/x-patch | 5.0 KB |