Re: emergency outage requiring database restart

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: emergency outage requiring database restart
Date: 2016-10-25 17:57:27
Message-ID: 20161025175727.p25f3h5dq5bgggye@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Merlin Moncure wrote:

> After last night, I rebuilt the cluster, turning on checksums, turning
> on synchronous commit (it was off) and added a standby replica. This
> should help narrow the problem down should it re-occur; if storage is
> bad (note, other database on same machine is doing 10x write activity
> and is fine) or something is scribbling on shared memory (my guess
> here) then checksums should be popped, right?

Not really sure about that. As I recall we compute the CRC on the
buffer's way out, based on the then-current contents, so if something
scribbles on the buffer while it's waiting to be evicted, the CRC
computation would include the new (corrupted) bytes rather than the
original ones -- see FlushBuffer.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2016-10-25 19:13:39 Re: emergency outage requiring database restart
Previous Message Merlin Moncure 2016-10-25 17:52:29 Re: emergency outage requiring database restart