From: | Mike Broers <mbroers(at)gmail(dot)com> |
---|---|
To: | pgsql-admin(at)postgresql(dot)org |
Subject: | root cause of corruption in hot standby |
Date: | 2018-09-06 17:28:18 |
Message-ID: | CAB9893iR3-xCEoLTLrxuEmeeTbid7Omq0ZJHG=+MB9JBKb_esA@mail.gmail.com |
Lists: | pgsql-admin |
So I have discovered corruption in a postgres 9.5.12 read replica, yay
checksums!
2018-09-06 12:00:53 CDT [1563]: [4-1] user=postgres,db=production WARNING: page verification failed, calculated checksum 3482 but expected 32232
2018-09-06 12:00:53 CDT [1563]: [5-1] user=postgres,db=production ERROR: invalid page in block 15962 of relation base/16384/464832386
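To make the log entry concrete, here is a minimal sketch (not part of the original report) that turns the "invalid page" message into a file name and byte offset, so the damaged page could later be compared against the same page on the primary. It assumes the default 8 kB block size and 1 GB relation segment size; a build with non-default settings would need different constants. The function name `locate_block` is made up for illustration.

```python
import re

# Assumptions: PostgreSQL built with the default block and segment sizes.
BLOCK_SIZE = 8192                      # bytes per page
SEGMENT_SIZE = 1024 ** 3               # bytes per relation segment file
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE

INVALID_PAGE = re.compile(r"invalid page in block (\d+) of relation (\S+)")


def locate_block(log_text):
    """Return the segment file and byte offset of the block named in the log.

    Hypothetical helper: returns None if no "invalid page" message is found.
    """
    match = INVALID_PAGE.search(log_text)
    if match is None:
        return None
    block = int(match.group(1))
    relpath = match.group(2)
    # Segments after the first carry a numeric suffix (.1, .2, ...).
    segment, block_in_segment = divmod(block, BLOCKS_PER_SEGMENT)
    filename = relpath if segment == 0 else f"{relpath}.{segment}"
    return {
        "file": filename,
        "block": block,
        "offset": block_in_segment * BLOCK_SIZE,
    }


if __name__ == "__main__":
    line = ("ERROR: invalid page in block 15962 of relation "
            "base/16384/464832386")
    print(locate_block(line))
```

For the message above, this points at the first segment file of base/16384/464832386, relative to the data directory, at byte offset 15962 * 8192 = 130760704.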
The rest of the log is clean and has only the usual monitoring queries, as
this isn't a heavily used db.
This corruption isn't occurring on the primary or on a second replica, so
I'm not freaking out exactly, but I'm not sure how I can further diagnose
what the root cause of the corruption might be.
There were no power outages. This is a streaming hot standby replica that
looks like it was connected fine to its primary's xlog stream at the time,
and not falling back on rsync'ed WAL files or anything. We run off an SSD
SAN that is allocated using LVM, and I've noticed documentation stating
that this can be problematic, but I'm unclear on how to diagnose what might
have been the root cause, and now I'm somewhat uncomfortable with this
environment's reliability in general.
Does anyone have advice on what to check further to determine a possible
root cause? This is a CentOS 7 VM running on Hyper-V.
Thanks for any assistance, greatly appreciated!
Mike