Re: 12.3 replicas falling over during WAL redo

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bench(at)silentmedia(dot)com
Cc: alvherre(at)2ndquadrant(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: 12.3 replicas falling over during WAL redo
Date: 2020-08-03 04:39:42
Message-ID: 20200803.133942.2170581823465092883.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

At Sat, 1 Aug 2020 09:58:05 -0700, Ben Chobot <bench(at)silentmedia(dot)com> wrote in
>
>
> Alvaro Herrera wrote on 8/1/20 9:35 AM:
> > On 2020-Aug-01, Ben Chobot wrote:
> >
> >> We have a few hundred postgres servers in AWS EC2, all of which do
> >> streaming
> >> replication to at least two replicas. As we've transitioned our fleet
> >> to
> >> from 9.5 to 12.3, we've noticed an alarming increase in the frequency
> >> of a
> >> streaming replica dying during replay. Postgres will log something
> >> like:
> >>
> >> |2020-07-31T16:55:22.602488+00:00 hostA postgres[31875]: [19137-1]
> >> |db=,user=
> >> LOG: restartpoint starting: time 2020-07-31T16:55:24.637150+00:00
> >> hostA
> >> postgres[24076]: [15754-1] db=,user= FATAL: incorrect index offsets
> >> supplied
> >> 2020-07-31T16:55:24.637261+00:00 hostA postgres[24076]: [15754-2]
> >> db=,user=
> >> CONTEXT: WAL redo at BCC/CB7AF8B0 for Btree/VACUUM: lastBlockVacuumed
> >> 1720
> >> 2020-07-31T16:55:24.642877+00:00 hostA postgres[24074]: [8-1]
> >> db=,user= LOG:
> >> startup process (PID 24076) exited with exit code 1|
> > I've never seen this one.
> >
> > Can you find out what the index is being modified by those LSNs -- is
> > it
> > always the same index? Can you have a look at nearby WAL records that
> > touch the same page of the same index in each case?
> >
> > One possibility is that the storage forgot a previous write.
>
> I'd be happy to, if you tell me how. :)
>
> We're using xfs for our postgres filesystem, on ubuntu bionic. Of
> course it's always possible there's something wrong in the filesystem
> or the EBS layer, but that is one thing we have not changed in the
> migration from 9.5 to 12.3.

All of the cited log lines seem suggesting relation with deleted btree
page items. As a possibility I can guess, that can happen if the pages
were flushed out during a vacuum after the last checkpoint and
full-page-writes didn't restored the page to the state before the
index-item deletion happened(that is, if full_page_writes were set to
off.). (If it found to be the cause, I'm not sure why that didn't
happen on 9.5.)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message alex m 2020-08-03 05:04:53 How to get the name of the current database in C function/extention inside a trigger?
Previous Message Michael Paquier 2020-08-03 01:58:08 Re: how reliable is pg_rewind?