Re: PITR promote bug: Checkpointer writes to older timeline

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, masao(dot)fujii(at)oss(dot)nttdata(dot)com, soumyadeep2007(at)gmail(dot)com, hlinnaka(at)iki(dot)fi, pgsql-hackers(at)postgresql(dot)org, jyih(at)vmware(dot)com, kyeap(at)vmware(dot)com
Subject: Re: PITR promote bug: Checkpointer writes to older timeline
Date: 2021-06-27 19:13:20
Message-ID: 78633.1624821200@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Buildfarm member hornet just reported a failure in this test:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2021-06-27%2013%3A40%3A57
> It's not clear whether this is a problem with the test case or an
> actual server bug, but I'm leaning to the latter theory. My gut
> feel is it's some problem in the "snapshot scalability" work. It
> doesn't look the same as the known open issue, but maybe related?

Hmm, the plot thickens. I scraped the buildfarm logs for similar-looking
assertion failures back to last August, when the snapshot scalability
patches went in. The first such failure is not until 2021-03-24
(see attachment), and they all look to be triggered by
023_pitr_prepared_xact.pl. It sure looks like recovering a prepared
transaction creates a transient state in which a new backend will
compute a broken snapshot.

regards, tom lane

Attachment Content-Type Size
transaction-ordering-assertion-failures.txt text/plain 2.9 KB
