From: | Justin King <kingpin867(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: walreceiver termination |
Date: | 2020-05-04 14:09:15 |
Message-ID: | CAE39h22d4-AcqpQaaN8NXA-zTDXM6i=joBjJtnaoB0706vFrNw@mail.gmail.com |
Lists: | pgsql-general |
Would anyone be able to help troubleshoot this issue -- or at least
suggest something that would be helpful to look for?
https://www.postgresql.org/message-id/flat/CAGH8ccdWLLGC7qag5pDUFbh96LbyzN_toORh2eY32-2P1%3Dtifg%40mail.gmail.com
https://www.postgresql.org/message-id/flat/CANQ55Tsoa6%3Dvk2YkeVUN7qO-2YdqJf_AMVQxqsVTYJm0qqQQuw%40mail.gmail.com
https://dba.stackexchange.com/questions/116569/postgresql-docker-incorrect-resource-manager-data-checksum-in-record-at-46f-6
I'm not the first one to report something similar, and all the
complaints have a less common filesystem in common -- particularly ZFS
(or btrfs, in the last link above). Is there anything more we can do
here to help narrow down this issue? I'm happy to help, but I honestly
wouldn't even know where to begin.
Thanks-
Justin King
flightaware.com
On Thu, Apr 23, 2020 at 4:40 PM Justin King <kingpin867(at)gmail(dot)com> wrote:
>
> On Thu, Apr 23, 2020 at 3:06 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > Justin King <kingpin867(at)gmail(dot)com> writes:
> > > I assume it would be related to the following:
> > > LOG: incorrect resource manager data checksum in record at 2D6/C259AB90
> > > since the walreceiver terminates just after this - but I'm unclear
> > > what precisely this means.
> >
> > What it indicates is corrupt data in the WAL stream. When reading WAL
> > after crash recovery, we assume that that indicates end of WAL. When
> > pulling live data from a source server, it suggests some actual problem
> > ... but killing the walreceiver and trying to re-fetch the data might
> > be a reasonable response to that. I'm not sure offhand what the startup
> > code thinks it's doing in this context. It might either be attempting
> > to retry, or concluding that it's come to the end of WAL and it ought
> > to promote to being a live server. If you don't see the walreceiver
> > auto-restarting then I'd suspect that the latter is happening.
> >
> > regards, tom lane
>
> walreceiver is definitely not restarting -- replication stops cold
> right at that segment. I'm a little unclear where to go from here --
> is there additional info that would be useful?
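
For anyone wanting to look at the segment where replication stops, a minimal sketch of how the location in the log line above (2D6/C259AB90) maps to a WAL segment file name and a byte offset inside it. This is an illustration only, not something from the thread: it assumes timeline 1 and the default 16 MB WAL segment size, and the helper names parse_lsn and wal_segment_for are made up for the example.

```python
# Sketch: map a WAL location (LSN) to its segment file name and byte offset.
# Assumptions (not from the thread): timeline 1 and the default 16 MB
# segment size. Helper names are hypothetical.

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default wal_segment_size


def parse_lsn(lsn: str) -> int:
    """Convert an LSN written as 'XXX/YYYYYYYY' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_segment_for(lsn: str, timeline: int = 1,
                    seg_size: int = WAL_SEGMENT_SIZE):
    """Return (segment_file_name, offset_in_segment) for the given LSN."""
    pos = parse_lsn(lsn)
    segno = pos // seg_size
    # Number of segments that fit in one 4 GB "log id" (256 for 16 MB segments).
    segs_per_logid = 0x100000000 // seg_size
    name = "%08X%08X%08X" % (
        timeline,
        segno // segs_per_logid,
        segno % segs_per_logid,
    )
    return name, pos % seg_size


if __name__ == "__main__":
    name, offset = wal_segment_for("2D6/C259AB90")
    print(name, hex(offset))
```

If those assumptions hold, the resulting file name identifies the segment that could be compared between the primary and the standby (for example by checksumming the file on both sides, or by inspecting it with pg_waldump) to see on which side the record is actually damaged.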