Re: BUG #18630: Incorrect memory access inside ReindexIsProcessingIndex() on VACUUM

From: Tender Wang <tndrwang(at)gmail(dot)com>
To: exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18630: Incorrect memory access inside ReindexIsProcessingIndex() on VACUUM
Date: 2024-09-25 10:19:20
Message-ID: CAHewXNkKeE3i7aT_nysWphRHhwMnyG1r55Di=RCNtLa5dRSd4Q@mail.gmail.com
Lists: pgsql-bugs

On Wed, Sep 25, 2024 at 17:28, Tender Wang <tndrwang(at)gmail(dot)com> wrote:

>
>
> On Wed, Sep 25, 2024 at 13:35, PG Bug reporting form <noreply(at)postgresql(dot)org> wrote:
>
>> The following bug has been logged on the website:
>>
>> Bug reference: 18630
>> Logged by: Alexander Lakhin
>> Email address: exclusion(at)gmail(dot)com
>> PostgreSQL version: 17rc1
>> Operating system: Ubuntu 22.04
>> Description:
>>
>> The following script:
>> psql -c "SELECT pg_sleep(5)" &
>>
>> echo "
>> SET lock_timeout = '3s';
>> CREATE TABLE t(i int, t text);
>> REINDEX TABLE CONCURRENTLY t;
>> SELECT pg_sleep(3);
>> " | psql
>>
>> psql -c "VACUUM (PROCESS_MAIN FALSE, FULL TRUE) t;"
>>
>> produces:
>> WARNING: cannot reindex invalid index "pg_toast.pg_toast_16384_index_ccnew" on TOAST table, skipping
>>
>> and then a Valgrind-detected error:
>> ==00:00:00:10.727 3193327== Invalid read of size 4
>> ==00:00:00:10.727 3193327== at 0x5A6D80: list_member_oid (list.c:726)
>> ==00:00:00:10.727 3193327== by 0x33FE2F: ReindexIsProcessingIndex (index.c:4083)
>> ==00:00:00:10.727 3193327== by 0x27B43F: systable_beginscan (genam.c:396)
>> ==00:00:00:10.727 3193327== by 0x4CE8F9: vac_update_datfrozenxid (vacuum.c:1723)
>> ==00:00:00:10.727 3193327== by 0x4CCFAB: vacuum (vacuum.c:691)
>> ==00:00:00:10.727 3193327== by 0x4CC910: ExecVacuum (vacuum.c:449)
>> ==00:00:00:10.727 3193327== by 0x7CE082: standard_ProcessUtility (utility.c:859)
>> ==00:00:00:10.727 3193327== by 0x7CD61D: ProcessUtility (utility.c:523)
>> ==00:00:00:10.727 3193327== by 0x7CBE98: PortalRunUtility (pquery.c:1158)
>> ==00:00:00:10.727 3193327== by 0x7CC10F: PortalRunMulti (pquery.c:1316)
>> ==00:00:00:10.727 3193327== by 0x7CB559: PortalRun (pquery.c:791)
>> ==00:00:00:10.727 3193327== by 0x7C3C7A: exec_simple_query (postgres.c:1284)
>> ==00:00:00:10.727 3193327== Address 0x72f4878 is 7,496 bytes inside a recently re-allocated block of size 8,192 alloc'd
>> ==00:00:00:10.727 3193327== at 0x4848899: malloc (vg_replace_malloc.c:381)
>> ==00:00:00:10.727 3193327== by 0x9FDA95: AllocSetContextCreateInternal (aset.c:444)
>> ==00:00:00:10.727 3193327== by 0x2E0BBC: AtStart_Memory (xact.c:1206)
>> ==00:00:00:10.727 3193327== by 0x2E1C56: StartTransaction (xact.c:2143)
>> ==00:00:00:10.727 3193327== by 0x2E2CA8: StartTransactionCommand (xact.c:3050)
>> ==00:00:00:10.727 3193327== by 0x9DF444: InitPostgres (postinit.c:830)
>> ==00:00:00:10.727 3193327== by 0x7C8B3A: PostgresMain (postgres.c:4349)
>> ==00:00:00:10.727 3193327== by 0x7BF5AE: BackendMain (backend_startup.c:107)
>> ==00:00:00:10.727 3193327== by 0x6D1E75: postmaster_child_launch (launch_backend.c:274)
>> ==00:00:00:10.727 3193327== by 0x6D7CE8: BackendStartup (postmaster.c:3420)
>> ==00:00:00:10.727 3193327== by 0x6D539A: ServerLoop (postmaster.c:1653)
>> ==00:00:00:10.727 3193327== by 0x6D4C92: PostmasterMain (postmaster.c:1351)
>> ==00:00:00:10.727 3193327==
>> ...
>> 2024-09-25 02:44:16.496 UTC|||66f378f6.30b9b3|LOG: server process (PID 3193327) exited with exit code 1
>> 2024-09-25 02:44:16.496 UTC|||66f378f6.30b9b3|DETAIL: Failed process was running: VACUUM (PROCESS_MAIN FALSE, FULL TRUE) t;
>>
>> or an assertion failure (when executed without Valgrind):
>> TRAP: failed Assert("IsOidList(list)"), File: "list.c", Line: 726, PID: 3213057
>>
>> Reproduced on REL_16_STABLE (starting from 4211fbd84) .. master.
>>
>>
> Thanks for reporting. I can reproduce this issue.
>
> The statement "REINDEX TABLE CONCURRENTLY t;" fails because of the lock
> timeout. If we then run VACUUM as in this case, processing the TOAST
> relation (toast_relid) of table t returns two index OIDs.
> pg_toast.pg_toast_16384_index_ccnew is invalid because the REINDEX failed.
>
> Currently we only report a warning in reindex_relation(), so
> pg_toast.pg_toast_16384_index_ccnew stays on the pendingReindexedIndexes
> list. After the toast_rel of table t has been processed, the transaction
> commits and the memory behind pendingReindexedIndexes is reset, but the
> pointer itself is not set to NIL. The next call to
> ReindexIsProcessingIndex() therefore triggers the assertion failure.
>
> I think we can remove the invalid index OID from
> pendingReindexedIndexes instead of reporting a warning.
> I tried this, and the assertion failure no longer occurs. See the attached patch.
>
>
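To make the failure mode above easier to follow, here is a toy model in plain C. It is only a sketch and not PostgreSQL code: every toy_* name is hypothetical, the OIDs in the usage note are made up, and the "transaction context" is a static buffer with a liveness flag, so staleness can be detected without actually reading freed memory. The toy also assumes that removing the last entry sets the global back to NULL, which is how I expect an emptied list to behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY 8

/* Toy stand-in for a list of OIDs that lives in per-transaction memory. */
typedef struct ToyOidList
{
    size_t       length;
    unsigned int oids[TOY_CAPACITY];
} ToyOidList;

/* Hypothetical global, playing the role of pendingReindexedIndexes. */
static ToyOidList *toy_pending = NULL;

/* Backing storage for the toy "transaction context" and its liveness. */
static ToyOidList toy_context_storage;
static bool       toy_context_live = false;

/* Begin a toy transaction. The global is cleared here only so that
 * several scenarios can be run back to back in this sketch. */
static void toy_start_transaction(void)
{
    toy_context_storage.length = 0;
    toy_context_live = true;
    toy_pending = NULL;
}

/* Put an OID on the pending list, creating the list on first use. */
static void toy_set_pending(unsigned int oid)
{
    if (toy_pending == NULL)
        toy_pending = &toy_context_storage;
    assert(toy_pending->length < TOY_CAPACITY);
    toy_pending->oids[toy_pending->length++] = oid;
}

/* Remove an OID; an emptied list becomes NULL again (toy assumption). */
static void toy_remove_pending(unsigned int oid)
{
    if (toy_pending == NULL)
        return;
    for (size_t i = 0; i < toy_pending->length; i++)
    {
        if (toy_pending->oids[i] == oid)
        {
            for (size_t j = i + 1; j < toy_pending->length; j++)
                toy_pending->oids[j - 1] = toy_pending->oids[j];
            toy_pending->length--;
            break;
        }
    }
    if (toy_pending->length == 0)
        toy_pending = NULL;
}

/* Commit: the context's memory goes away, the global is left untouched. */
static void toy_end_transaction(void)
{
    toy_context_live = false;
}

/* True when the global still points into a context that no longer exists. */
static bool toy_pending_is_dangling(void)
{
    return toy_pending != NULL && !toy_context_live;
}
```

If two OIDs are set pending and only the valid one is removed (the invalid one is merely skipped with a warning), toy_pending_is_dangling() returns true after toy_end_transaction(). If the skipped OID is removed as well, the global is already NULL at commit and nothing dangles.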
Adding an if (flags & REINDEX_REL_SUPPRESS_INDEX_USE) check before calling
RemoveReindexPending() might look better.
Any thoughts?
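
The shape of the guard I have in mind is roughly the following. This is a standalone toy in plain C, not the patch itself: the toy_* names and the flag value 0x04 are hypothetical, and the array only stands in for the pending list. The idea is that the skipped index is dropped from the pending set only when the caller asked for index use to be suppressed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical value; stands in for REINDEX_REL_SUPPRESS_INDEX_USE. */
#define TOY_SUPPRESS_INDEX_USE 0x04

#define TOY_MAX_PENDING 8

/* Toy pending set: a small fixed array instead of a real OID list. */
static unsigned int toy_pending_oids[TOY_MAX_PENDING];
static size_t       toy_pending_count = 0;

static void toy_add_pending(unsigned int oid)
{
    assert(toy_pending_count < TOY_MAX_PENDING);
    toy_pending_oids[toy_pending_count++] = oid;
}

static bool toy_is_pending(unsigned int oid)
{
    for (size_t i = 0; i < toy_pending_count; i++)
    {
        if (toy_pending_oids[i] == oid)
            return true;
    }
    return false;
}

/* Toy stand-in for RemoveReindexPending(): drop one OID, keep the order. */
static void toy_drop_pending(unsigned int oid)
{
    for (size_t i = 0; i < toy_pending_count; i++)
    {
        if (toy_pending_oids[i] == oid)
        {
            for (size_t j = i + 1; j < toy_pending_count; j++)
                toy_pending_oids[j - 1] = toy_pending_oids[j];
            toy_pending_count--;
            return;
        }
    }
}

/* Called where an invalid index is skipped: touch the pending set only
 * when the suppress-index-use flag is present in flags. */
static void toy_skip_invalid_index(int flags, unsigned int oid)
{
    if (flags & TOY_SUPPRESS_INDEX_USE)
        toy_drop_pending(oid);
}
```

With flags of 0 the call leaves the pending set alone; with TOY_SUPPRESS_INDEX_USE set, the skipped OID is removed and the other entries stay in place.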

--
Thanks,
Tender Wang
https://www.openpie.com/
