From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Sherrylyn Branchaw <sbranchaw(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: What to do when dynamic shared memory control segment is corrupt |
Date: | 2018-06-18 16:30:13 |
Message-ID: | 28565.1529339413@sss.pgh.pa.us |
Lists: | pgsql-general |
Sherrylyn Branchaw <sbranchaw(at)gmail(dot)com> writes:
> We are using Postgres 9.6.8 (planning to upgrade to 9.6.9 soon) on RHEL 6.9.
> We recently experienced two similar outages on two different prod
> databases. The error messages from the logs were as follows:
> LOG: server process (PID 138529) was terminated by signal 6: Aborted
Hm ... were these installations built with --enable-cassert? If not,
an abort trap seems pretty odd.
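
One way to check is to look at the configure options reported by `pg_config --configure`. The sketch below is illustrative only: the helper name `built_with_cassert` is made up for this example, and it assumes the output is the usual space-separated list of single-quoted options.

```python
import shlex


def built_with_cassert(configure_output: str) -> bool:
    """Return True if '--enable-cassert' appears among the configure options.

    configure_output is assumed to be the text printed by
    `pg_config --configure`, e.g. "'--prefix=/usr' '--enable-cassert'".
    """
    # shlex.split strips the shell-style quoting around each option.
    options = shlex.split(configure_output)
    return "--enable-cassert" in options
```

Feed it the text printed by the `pg_config` binary that belongs to the affected installation; a different `pg_config` on the PATH would describe a different build.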
> In one case, the logs recorded
> LOG: all server processes terminated; reinitializing
> LOG: incomplete data in "postmaster.pid": found only 1 newlines while
> trying to add line 7
> ...
> In the other case, the logs recorded
> LOG: all server processes terminated; reinitializing
> LOG: dynamic shared memory control segment is corrupt
> LOG: incomplete data in "postmaster.pid": found only 1 newlines while
> trying to add line 7
> ...
Those "incomplete data" messages are quite unexpected and disturbing.
I don't know of any mechanism within Postgres proper that would result
in corruption of the postmaster.pid file that way. (I wondered briefly
if trying to start a conflicting postmaster would result in such a
situation, but experimentation here says not.) I'm suspicious that
this may indicate a bug or unwarranted assumption in whatever scripts
you use to start/stop the postmaster. Whether that is at all related
to your crash issue is hard to say, but it bears looking into.
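
For reference, the wording of the message suggests what the server was doing: to fill in line 7 of postmaster.pid it has to skip past the six lines before it, and it complains when the file runs out of newlines first. A rough Python rendering of that check follows. It is a sketch of the logic as I understand it, not the server's actual code, and the function names are hypothetical.

```python
def newlines_found(contents: str, target_line: int) -> int:
    """Count the newlines seen while skipping to the start of target_line.

    target_line is 1-based, so reaching it requires target_line - 1 newlines.
    The count stops early if the file contents run out.
    """
    found = 0
    pos = 0
    while found < target_line - 1:
        eol = contents.find("\n", pos)
        if eol == -1:
            break  # file ends before the target line begins
        pos = eol + 1
        found += 1
    return found


def can_add_line(contents: str, target_line: int) -> bool:
    """True if the file has enough complete lines before target_line."""
    return newlines_found(contents, target_line) == target_line - 1
```

A file holding nothing but a PID and one newline gives `newlines_found(contents, 7) == 1`, which matches the "found only 1 newlines while trying to add line 7" report. That is what you would see if something outside the server had rewritten the file with just a PID in it.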
> My question is whether the corrupt shared memory control segment, and the
> failure of Postgres to automatically restart, mean the database should not
> be automatically started up, and if there's something we should be doing
> before restarting.
No, that looks like fairly typical crash recovery to me: corrupt shared
memory contents are expected and recovered from after a crash. However,
we don't expect postmaster.pid to get mucked with.
regards, tom lane