Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

From: Noah Misch <noah(at)leadboat(dot)com>
To: Semab Tariq <semab(dot)tariq(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, CM Team <cm(at)enterprisedb(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data
Date: 2021-11-09 14:40:21
Message-ID: 20211109144021.GD940092@rfd.leadboat.com
Lists: pgsql-bugs

On Tue, Nov 09, 2021 at 11:55:57AM +0500, Semab Tariq wrote:
> On Mon, Nov 8, 2021 at 7:46 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Mon, Nov 08, 2021 at 02:09:15PM +0500, Semab Tariq wrote:
> > > On Mon, Nov 8, 2021 at 12:09 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > > > This postgres binary apparently contains an explicit branch to
> > > > 0x3fffffffff3fdc30, which is not an address reasonably expected to contain
> > > > code.  (It's not a known heap, a known stack, or a CODE section from the
> > > > binary file.)  This probably confirms a toolchain bug.
> > > >
> > > > Would you do "git checkout 166f943" in the source directory you've been
> > > > testing, then rerun the test and post the compressed
> > > > tmp_check/log directory?

> PFA tmp_check.tar.bz2

Excellent. No crash, and the only difference in equalTupleDescs() code
generation is the branch destination addresses. At commit 70bef49, gharial's
toolchain generates the invalid branch destination:

$ diff -U0 <(cut -b47- disasm/70bef49/equalTupleDescs) <(cut -b47- disasm/166f943/equalTupleDescs)
--- /dev/fd/63 2021-11-09 06:11:20.927444437 -0800
+++ /dev/fd/62 2021-11-09 06:11:20.926444428 -0800
@@ -100 +100 @@
- br.call.sptk.many rp=0x40000000003cdc20
+ br.call.sptk.many rp=0x40000000003cde20
@@ -658 +658 @@
- br.call.sptk.many rp=0x40000000003cdc20
+ br.call.sptk.many rp=0x40000000003cde20
@@ -817 +817 @@
- br.call.sptk.many rp=0x3fffffffff3fdc30
+ br.call.sptk.many rp=0x4000000000400c30
@@ -949 +949 @@
- br.call.sptk.many rp=0x40000000003cdc20
+ br.call.sptk.many rp=0x40000000003cde20
@@ -970 +970 @@
- br.call.sptk.many rp=0x40000000003cdc20
+ br.call.sptk.many rp=0x40000000003cde20

Since "git diff 70bef49 166f943" contains nothing that correlates with such a
change, I'm concluding that this is a bug in gharial's toolchain.
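
For anyone who wants to repeat this comparison on another pair of builds, here is a rough sketch of the check in Python. It pairs the removed and added branch targets from a "diff -U0" listing like the one above, then flags any pair whose old target falls below an assumed code base address or whose shift differs from the most common shift. The names ASSUMED_TEXT_BASE, branch_pairs and suspicious are mine, and the base address is only inferred from the other targets in the diff, not read from the binary's section headers.

```python
import re
from collections import Counter

# Assumption: plausible code addresses start at this base. The value is
# inferred from the other branch targets in the diff, not from the binary.
ASSUMED_TEXT_BASE = 0x4000000000000000

# Matches diff lines such as "- br.call.sptk.many rp=0x40000000003cdc20".
TARGET_RE = re.compile(r"^([-+])\s*br\.call\.sptk\.many\s+rp=(0x[0-9a-fA-F]+)")


def branch_pairs(diff_text):
    """Pair each removed branch target with the added target that follows it."""
    pairs, pending = [], None
    for line in diff_text.splitlines():
        m = TARGET_RE.match(line.strip())
        if not m:
            continue  # hunk headers and file headers are skipped
        sign, addr = m.group(1), int(m.group(2), 16)
        if sign == "-":
            pending = addr
        elif pending is not None:
            pairs.append((pending, addr))
            pending = None
    return pairs


def suspicious(pairs):
    """Return pairs with an out-of-range old target or an unusual shift."""
    common_shift, _ = Counter(new - old for old, new in pairs).most_common(1)[0]
    return [
        (old, new)
        for old, new in pairs
        if old < ASSUMED_TEXT_BASE or new - old != common_shift
    ]


# The five hunks from the diff above.
DIFF = """\
@@ -100 +100 @@
- br.call.sptk.many rp=0x40000000003cdc20
+ br.call.sptk.many rp=0x40000000003cde20
@@ -658 +658 @@
- br.call.sptk.many rp=0x40000000003cdc20
+ br.call.sptk.many rp=0x40000000003cde20
@@ -817 +817 @@
- br.call.sptk.many rp=0x3fffffffff3fdc30
+ br.call.sptk.many rp=0x4000000000400c30
@@ -949 +949 @@
- br.call.sptk.many rp=0x40000000003cdc20
+ br.call.sptk.many rp=0x40000000003cde20
@@ -970 +970 @@
- br.call.sptk.many rp=0x40000000003cdc20
+ br.call.sptk.many rp=0x40000000003cde20
"""

if __name__ == "__main__":
    for old, new in suspicious(branch_pairs(DIFF)):
        print(f"{old:#x} -> {new:#x} (shift {new - old:#x})")
```

Run on the hunks above, this reports only the pair at line 817: the four other calls all move by 0x200, while that one moves by 0x1003000 and starts below the assumed base.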

It looks like gharial's automatic buildfarm runs have been paused for nine
days. Feel free to unpause it. Also, I recommend using the buildfarm client
setnotes.pl to add a note like 'Rare signal 11 from toolchain bug'. Months or
years pass between these events. Here are all gharial "signal 11" failures,
some of which likely have other causes:

sysname │ snapshot │ branch │ bfurl
─────────┼─────────────────────┼───────────────┼─────────────────────────────────────────────────────────────────────────────────────────────
gharial │ 2018-04-10 00:32:08 │ HEAD │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2018-04-10%2000%3A32%3A08
gharial │ 2019-03-08 01:30:45 │ HEAD │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-03-08%2001%3A30%3A45
gharial │ 2019-03-08 08:55:31 │ HEAD │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-03-08%2008%3A55%3A31
gharial │ 2019-03-08 19:55:38 │ HEAD │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-03-08%2019%3A55%3A38
gharial │ 2019-08-20 09:57:27 │ REL_12_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-08-20%2009%3A57%3A27
gharial │ 2019-08-21 08:04:58 │ REL_12_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-08-21%2008%3A04%3A58
gharial │ 2019-08-22 00:37:03 │ REL_12_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-08-22%2000%3A37%3A03
gharial │ 2019-08-22 12:42:02 │ REL_12_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-08-22%2012%3A42%3A02
gharial │ 2019-08-24 18:43:52 │ REL_12_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-08-24%2018%3A43%3A52
gharial │ 2019-08-25 11:14:36 │ REL_12_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-08-25%2011%3A14%3A36
gharial │ 2019-08-25 18:44:04 │ REL_12_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-08-25%2018%3A44%3A04
gharial │ 2019-08-26 08:47:19 │ REL_12_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-08-26%2008%3A47%3A19
gharial │ 2019-08-26 22:30:23 │ REL_12_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2019-08-26%2022%3A30%3A23
gharial │ 2021-04-08 03:21:42 │ HEAD │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2021-04-08%2003%3A21%3A42
gharial │ 2021-04-09 06:40:31 │ HEAD │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2021-04-09%2006%3A40%3A31
gharial │ 2021-10-24 16:19:05 │ HEAD │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2021-10-24%2016%3A19%3A05
gharial │ 2021-10-24 20:38:39 │ HEAD │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2021-10-24%2020%3A38%3A39
(17 rows)
