Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date: 2021-10-29 14:55:32
Message-ID: 20211029145532.kfwqwlrdekunwoa2@localhost
Lists: pgsql-bugs

> On Fri, Oct 29, 2021 at 07:00:01AM +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference: 17255
> Logged by: Alexander Lakhin
> Email address: exclusion(at)gmail(dot)com
> PostgreSQL version: 14.0
> Operating system: Ubuntu 20.04
> Description:
>
> with the following stack:
> Core was generated by `postgres: law regression [local] CREATE INDEX
> '.
> Program terminated with signal SIGABRT, Aborted.
> #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> 50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1 0x00007f8a7f97a859 in __GI_abort () at abort.c:79
> #2 0x0000562dabb49700 in index_delete_sort_cmp (deltid2=<synthetic
> pointer>, deltid1=<optimized out>) at heapam.c:7582
> #3 index_delete_sort (delstate=0x7fff6f609f10, delstate=0x7fff6f609f10) at
> heapam.c:7623
> #4 heap_index_delete_tuples (rel=0x7f8a76523e08, delstate=0x7fff6f609f10)
> at heapam.c:7296
> #5 0x0000562dabc5519a in table_index_delete_tuples
> (delstate=0x7fff6f609f10, rel=0x562dac23d6c2)
> at ../../../../src/include/access/tableam.h:1327
> #6 _bt_delitems_delete_check (rel=rel@entry=0x7f8a7652cc80,
> buf=buf@entry=191, heapRel=heapRel@entry=0x7f8a76523e08,
> delstate=delstate@entry=0x7fff6f609f10) at nbtpage.c:1541
> #7 0x0000562dabc4dbe1 in _bt_simpledel_pass (maxoff=<optimized out>,
> minoff=<optimized out>, newitem=<optimized out>,
> ndeletable=55, deletable=0x7fff6f609f30, heapRel=0x7f8a76523e08,
> buffer=191, rel=0x7f8a7652cc80)
> at nbtinsert.c:2899
> ...
>
> Discovered while hunting for another bug related to autovacuum (unfortunately
> I still can't produce a reliable reproducing script for that).

Thanks for reporting (in fact I'm impressed by how many issues you've
discovered; hopefully there are at least some t-shirts "I've found X
bugs in PostgreSQL" available as a reward) and for putting effort into the
reproduction steps. I believe I've managed to reproduce at least a
similar crash with the same trace.

In my case it crashed on pg_unreachable (which is an abort when asserts
are enabled) inside index_delete_sort_cmp. It seems that the two item
pointers being compared have the same block and offset number. In view of
the recent discussions I was thinking it could be somehow related to the
issues with duplicated TIDs, but delstate->deltids doesn't in fact have
any duplicated entries -- so I'm not sure about that, and am still
investigating the core dump.
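
To illustrate the failure mode described above, here is a minimal
standalone sketch of a comparator that orders TIDs by block and then by
offset and treats equality as impossible. It is not the heapam.c code:
SketchTid, sketch_tid_cmp and sketch_count_duplicates are hypothetical
names made up for this example, and abort() only stands in for what
pg_unreachable() does in an assert-enabled build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a heap TID (block + offset). */
typedef struct SketchTid
{
    uint32_t block;   /* heap block number */
    uint16_t offset;  /* line pointer offset within the block */
} SketchTid;

/*
 * Order by block number, then by offset number. Two equal TIDs are
 * assumed never to be compared, so that path aborts, mimicking
 * pg_unreachable() when asserts are enabled.
 */
static int
sketch_tid_cmp(const SketchTid *a, const SketchTid *b)
{
    if (a->block != b->block)
        return (a->block < b->block) ? -1 : 1;
    if (a->offset != b->offset)
        return (a->offset < b->offset) ? -1 : 1;

    abort();  /* same block and offset: "cannot happen" */
}

/*
 * Count pairs of identical TIDs in an array. A quadratic scan is fine
 * for a sketch; it deliberately avoids calling sketch_tid_cmp so that
 * it cannot hit the abort itself.
 */
static size_t
sketch_count_duplicates(const SketchTid *tids, size_t n)
{
    size_t dups = 0;

    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (tids[i].block == tids[j].block &&
                tids[i].offset == tids[j].offset)
                dups++;
    return dups;
}
```

With a comparator of this shape, any call that receives two entries with
the same block and offset ends in the abort, no matter how the caller
arrived at that pair. A duplicate scan like sketch_count_duplicates
returning zero is the kind of check that makes the duplicated-TID
explanation look doubtful here.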
