From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Christophe Pettus <xof(at)thebuild(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Subject: | Re: startup process stuck in recovery |
Date: | 2017-10-10 20:23:52 |
Message-ID: | 20435.1507667032@sss.pgh.pa.us |
Lists: | pgsql-general |
Christophe Pettus <xof(at)thebuild(dot)com> writes:
> I was able to reproduce this on 9.5.9 with the following:
Hmm ... so I still can't reproduce the specific symptoms Christophe
reports.
What I see is that, given this particular test case, the backend
process on the master never holds more than a few locks at a time.
Each time we abort a subtransaction, the AE lock it was holding
on the temp table it created gets dropped. However ... on the
standby server, pre-v10, the replay process attempts to take all
12000 of those AE locks at once. This is not a great plan.
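As a rough illustration of that asymmetry, here is a toy Python model (not PostgreSQL code). It assumes each subtransaction takes exactly one AE lock, that the master drops it at subtransaction abort, and that pre-v10 replay keeps every lock until the top-level transaction's commit record is replayed. The function names are made up for this sketch.

```python
def peak_locks_master(n_subxacts: int) -> int:
    """Toy model of the master: one AE lock per subtransaction, dropped on abort."""
    held = set()
    peak = 0
    for rel in range(n_subxacts):
        held.add(rel)                 # temp table creation takes the AE lock
        peak = max(peak, len(held))
        held.discard(rel)             # subtransaction abort releases it
    return peak


def peak_locks_standby_pre_v10(n_subxacts: int) -> int:
    """Toy model of pre-v10 replay: locks accumulate until the top-level commit."""
    held = set()
    peak = 0
    for rel in range(n_subxacts):
        held.add(rel)                 # replay of one Standby/LOCK record
        peak = max(peak, len(held))
    held.clear()                      # assumed release point: Transaction/COMMIT replay
    return peak


master_peak = peak_locks_master(12000)
standby_peak = peak_locks_standby_pre_v10(12000)
print(master_peak, standby_peak)
```

In this simplified model the master never holds more than one lock, while the standby's peak equals the number of subtransactions.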
On 9.5, for me, as soon as we're out of shared memory,
ResolveRecoveryConflictWithLock goes into an infinite loop.
And AFAICS it *is* infinite; it doesn't look to me like it's
making any progress. This is pretty easy to diagnose though
because it spews "out of shared memory" WARNING messages to the
postmaster log at an astonishing rate.
9.6 hits the OOM condition as well, but manages to get out of it
somehow. I'm not clear on how, and the log trace doesn't look
very clean: after a bunch of these
WARNING: out of shared memory
CONTEXT: xlog redo at 0/C1098AC0 for Standby/LOCK: xid 134024 db 423347 rel 524106
WARNING: out of shared memory
CONTEXT: xlog redo at 0/C10A97E0 for Standby/LOCK: xid 134024 db 423347 rel 524151
WARNING: out of shared memory
CONTEXT: xlog redo at 0/C10B36B0 for Standby/LOCK: xid 134024 db 423347 rel 524181
WARNING: out of shared memory
CONTEXT: xlog redo at 0/C10BD780 for Standby/LOCK: xid 134024 db 423347 rel 524211
you get a bunch of these
WARNING: you don't own a lock of type AccessExclusiveLock
CONTEXT: xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
LOG: RecoveryLockList contains entry for lock no longer recorded by lock manager: xid 134024 database 423347 relation 526185
CONTEXT: xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
WARNING: you don't own a lock of type AccessExclusiveLock
CONTEXT: xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
LOG: RecoveryLockList contains entry for lock no longer recorded by lock manager: xid 134024 database 423347 relation 526188
CONTEXT: xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
WARNING: you don't own a lock of type AccessExclusiveLock
CONTEXT: xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
LOG: RecoveryLockList contains entry for lock no longer recorded by lock manager: xid 134024 database 423347 relation 526191
CONTEXT: xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
The important point though is that "a bunch" is a finite number,
whereas 9.5 seems to be just stuck. I'm not sure how Christophe's
server managed to continue to make progress.
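To tell a finite burst of warnings from a stuck loop, one could tally the WARNING messages in the standby's log and check whether the counts stop growing between two runs. The following is a minimal sketch; `tally_warnings` is a hypothetical helper, and it assumes each warning sits on one line containing the literal text "WARNING:" (whatever log_line_prefix puts in front of it).

```python
from collections import Counter


def tally_warnings(log_text: str) -> Counter:
    """Count WARNING messages by their text, ignoring any line prefix."""
    counts = Counter()
    for line in log_text.splitlines():
        _, found, message = line.partition("WARNING:")
        if found:
            counts[message.strip()] += 1
    return counts


# Sample input copied from the log excerpts in this message.
sample = """\
WARNING: out of shared memory
CONTEXT: xlog redo at 0/C1098AC0 for Standby/LOCK: xid 134024 db 423347 rel 524106
WARNING: out of shared memory
CONTEXT: xlog redo at 0/C10A97E0 for Standby/LOCK: xid 134024 db 423347 rel 524151
WARNING: you don't own a lock of type AccessExclusiveLock
CONTEXT: xlog redo at 0/C13A79B0 for Transaction/COMMIT: 2017-10-10 15:05:56.615721-04
"""

counts = tally_warnings(sample)
for message, n in counts.most_common():
    print(n, message)
```

If the "out of shared memory" count is still climbing a minute later while the redo position in the CONTEXT lines is not advancing, that would match the stuck 9.5 behavior rather than the finite 9.6 one.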
It looks like the 9.6-era patch 37c54863c must have been responsible
for that behavioral change. There's no indication in the commit message
or the comments that anyone had specifically considered the OOM
scenario, so I think it's just accidental that it's better.
v10 and HEAD avoid the problem because the standby server doesn't
take locks (any at all, AFAICS). I suppose this must be a
consequence of commit 9b013dc238c, though I'm not sure exactly how.
Anyway, it's pretty scary that it's so easy to run the replay process
out of shared memory pre-v10. I wonder if we should consider
backpatching that fix. Any situation where the replay process takes
more locks concurrently than were ever held on the master is surely
very bad news.
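For a sense of scale, the documentation sizes the shared lock table at roughly max_locks_per_transaction * (max_connections + max_prepared_transactions) entries. The sketch below plugs in the stock defaults (64, 100, 0). The settings on the test servers are not stated in this message, so those numbers are an assumption, and the real table has some slack beyond the nominal figure.

```python
def lock_table_slots(max_locks_per_transaction: int = 64,
                     max_connections: int = 100,
                     max_prepared_transactions: int = 0) -> int:
    """Nominal lock table capacity per the documented formula (defaults assumed)."""
    return max_locks_per_transaction * (max_connections + max_prepared_transactions)


locks_wanted_by_replay = 12000          # figure from the test case in this message
nominal_slots = lock_table_slots()      # 64 * (100 + 0) = 6400 with default settings
shortfall = locks_wanted_by_replay - nominal_slots

print(nominal_slots, shortfall)
```

With default settings the nominal capacity is 6400 entries, so a replay process that wants 12000 locks at once overshoots it by 5600 even with nothing else running.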
A marginally lesser concern is that the replay process does need to have
robust behavior in the face of locktable OOM. AFAICS whatever it is doing
now is just accidental, and I'm not sure it's correct. "Doesn't get into
an infinite loop" is not a sufficiently high bar.
And I'm still wondering exactly what Christophe actually saw ...
regards, tom lane