Re: BUG #18630: Incorrect memory access inside ReindexIsProcessingIndex() on VACUUM

From: Tender Wang <tndrwang(at)gmail(dot)com>
To: exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18630: Incorrect memory access inside ReindexIsProcessingIndex() on VACUUM
Date: 2024-09-25 09:28:42
Message-ID: CAHewXNnNZXbq_sW=OP-O-1nnMEnpXJX6djbzF52LtDA6tOutiQ@mail.gmail.com
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> wrote on Wed, Sep 25, 2024 at 13:35:

> The following bug has been logged on the website:
>
> Bug reference: 18630
> Logged by: Alexander Lakhin
> Email address: exclusion(at)gmail(dot)com
> PostgreSQL version: 17rc1
> Operating system: Ubuntu 22.04
> Description:
>
> The following script:
> psql -c "SELECT pg_sleep(5)" &
>
> echo "
> SET lock_timeout = '3s';
> CREATE TABLE t(i int, t text);
> REINDEX TABLE CONCURRENTLY t;
> SELECT pg_sleep(3);
> " | psql
>
> psql -c "VACUUM (PROCESS_MAIN FALSE, FULL TRUE) t;"
>
> produces:
> WARNING: cannot reindex invalid index
> "pg_toast.pg_toast_16384_index_ccnew"
> on TOAST table, skipping
>
> and then a Valgrind-detected error:
> ==00:00:00:10.727 3193327== Invalid read of size 4
> ==00:00:00:10.727 3193327== at 0x5A6D80: list_member_oid (list.c:726)
> ==00:00:00:10.727 3193327== by 0x33FE2F: ReindexIsProcessingIndex
> (index.c:4083)
> ==00:00:00:10.727 3193327== by 0x27B43F: systable_beginscan
> (genam.c:396)
> ==00:00:00:10.727 3193327== by 0x4CE8F9: vac_update_datfrozenxid
> (vacuum.c:1723)
> ==00:00:00:10.727 3193327== by 0x4CCFAB: vacuum (vacuum.c:691)
> ==00:00:00:10.727 3193327== by 0x4CC910: ExecVacuum (vacuum.c:449)
> ==00:00:00:10.727 3193327== by 0x7CE082: standard_ProcessUtility
> (utility.c:859)
> ==00:00:00:10.727 3193327== by 0x7CD61D: ProcessUtility (utility.c:523)
> ==00:00:00:10.727 3193327== by 0x7CBE98: PortalRunUtility
> (pquery.c:1158)
> ==00:00:00:10.727 3193327== by 0x7CC10F: PortalRunMulti (pquery.c:1316)
> ==00:00:00:10.727 3193327== by 0x7CB559: PortalRun (pquery.c:791)
> ==00:00:00:10.727 3193327== by 0x7C3C7A: exec_simple_query
> (postgres.c:1284)
> ==00:00:00:10.727 3193327== Address 0x72f4878 is 7,496 bytes inside a
> recently re-allocated block of size 8,192 alloc'd
> ==00:00:00:10.727 3193327== at 0x4848899: malloc
> (vg_replace_malloc.c:381)
> ==00:00:00:10.727 3193327== by 0x9FDA95: AllocSetContextCreateInternal
> (aset.c:444)
> ==00:00:00:10.727 3193327== by 0x2E0BBC: AtStart_Memory (xact.c:1206)
> ==00:00:00:10.727 3193327== by 0x2E1C56: StartTransaction (xact.c:2143)
> ==00:00:00:10.727 3193327== by 0x2E2CA8: StartTransactionCommand
> (xact.c:3050)
> ==00:00:00:10.727 3193327== by 0x9DF444: InitPostgres (postinit.c:830)
> ==00:00:00:10.727 3193327== by 0x7C8B3A: PostgresMain (postgres.c:4349)
> ==00:00:00:10.727 3193327== by 0x7BF5AE: BackendMain
> (backend_startup.c:107)
> ==00:00:00:10.727 3193327== by 0x6D1E75: postmaster_child_launch
> (launch_backend.c:274)
> ==00:00:00:10.727 3193327== by 0x6D7CE8: BackendStartup
> (postmaster.c:3420)
> ==00:00:00:10.727 3193327== by 0x6D539A: ServerLoop (postmaster.c:1653)
> ==00:00:00:10.727 3193327== by 0x6D4C92: PostmasterMain
> (postmaster.c:1351)
> ==00:00:00:10.727 3193327==
> ...
> 2024-09-25 02:44:16.496 UTC|||66f378f6.30b9b3|LOG: server process (PID
> 3193327) exited with exit code 1
> 2024-09-25 02:44:16.496 UTC|||66f378f6.30b9b3|DETAIL: Failed process was
> running: VACUUM (PROCESS_MAIN FALSE, FULL TRUE) t;
>
> or an assertion failure (when executed without Valgrind):
> TRAP: failed Assert("IsOidList(list)"), File: "list.c", Line: 726, PID:
> 3213057
>
> Reproduced on REL_16_STABLE (starting from 4211fbd84) .. master.
>
>
Thanks for reporting. I can reproduce this issue.

The statement "REINDEX TABLE CONCURRENTLY t;" fails because of the lock
timeout, which leaves pg_toast.pg_toast_16384_index_ccnew behind as an
invalid index.
If we then run VACUUM as in this case, processing the toast_relid of table t
gives us two index OIDs, one of which is that invalid index.

Currently we only report a warning for it in reindex_relation(), so
pg_toast.pg_toast_16384_index_ccnew is still on the pendingReindexedIndexes
list. After the toast_rel of table t has been processed, the transaction
commits, and the memory behind pendingReindexedIndexes is reset while the
pointer itself is not set to NIL. So the next call to
ReindexIsProcessingIndex() triggers the assertion failure.

I think we can remove the invalid index OID from
pendingReindexedIndexes rather than only reporting a warning.
I tried this, and the assertion failure no longer occurs. See the attached
patch.
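
To illustrate the idea, here is a minimal toy model in plain C. It is not the
attached patch and not PostgreSQL source: PendingList, ToyIndex,
pending_add(), pending_remove() and simulate_reindex() are hypothetical names,
and a fixed-size array stands in for the pendingReindexedIndexes list. It only
shows that an index which is skipped without being removed stays on the
pending list, whereas removing it at skip time leaves the list empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

#define MAX_PENDING 8

/* Hypothetical stand-in for the pendingReindexedIndexes list. */
typedef struct PendingList
{
    Oid     oids[MAX_PENDING];
    size_t  count;
} PendingList;

/* Hypothetical index descriptor: an OID plus a validity flag. */
typedef struct ToyIndex
{
    Oid     oid;
    bool    is_valid;
} ToyIndex;

/* Append an OID to the pending list. */
static void
pending_add(PendingList *list, Oid oid)
{
    assert(list->count < MAX_PENDING);
    list->oids[list->count++] = oid;
}

/* Remove an OID from the pending list; returns true if it was present. */
static bool
pending_remove(PendingList *list, Oid oid)
{
    for (size_t i = 0; i < list->count; i++)
    {
        if (list->oids[i] == oid)
        {
            /* Order does not matter here, so fill the hole with the last entry. */
            list->oids[i] = list->oids[--list->count];
            return true;
        }
    }
    return false;
}

/*
 * Mark every index as pending, then "reindex" each one in turn.
 * Valid indexes leave the pending list when they are processed.
 * Invalid indexes are skipped; they are only taken off the list
 * when remove_skipped is true.
 * Returns the number of OIDs still pending at the end.
 */
size_t
simulate_reindex(const ToyIndex *indexes, size_t n, bool remove_skipped)
{
    PendingList pending = {0};

    for (size_t i = 0; i < n; i++)
        pending_add(&pending, indexes[i].oid);

    for (size_t i = 0; i < n; i++)
    {
        if (!indexes[i].is_valid)
        {
            /* This is where the WARNING is reported and the index skipped. */
            if (remove_skipped)
                pending_remove(&pending, indexes[i].oid);
            continue;
        }
        pending_remove(&pending, indexes[i].oid);
    }

    return pending.count;
}
```

With one valid and one invalid index (the OIDs are made up), calling
simulate_reindex() with remove_skipped = false leaves one stale entry, and
with remove_skipped = true the list ends up empty. In the backend the
leftover entry matters because the list's memory goes away at commit while
the pointer still looks non-empty.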

--
Thanks,
Tender Wang
https://www.openpie.com/

Attachment Content-Type Size
0001-Remove-failed-REINDEX-index-oid-from-the-pending-lis.patch application/octet-stream 1.0 KB
