From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndQuadrant(dot)com>, Boszormenyi Zoltan <zboszor(at)pr(dot)hu>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Hot Standby WAL reply uses heavyweight session locks, but doesn't have enough infrastructure set up |
Date: | 2015-01-26 21:24:58 |
Message-ID: | 20150126212458.GA29457@awork2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
dbase_redo does:
if (InHotStandby)
{
/*
* Lock database while we resolve conflicts to ensure that
* InitPostgres() cannot fully re-execute concurrently. This
* avoids backends re-connecting automatically to same database,
* which can happen in some cases.
*/
LockSharedObjectForSession(DatabaseRelationId, xlrec->db_id, 0, AccessExclusiveLock);
ResolveRecoveryConflictWithDatabase(xlrec->db_id);
}
Unfortunately that Assert()s when there's a lock conflict because
e.g. another backend is currently connecting. That's because ProcSleep()
does a enable_timeout_after(DEADLOCK_TIMEOUT, DeadlockTimeout) - and
there's no deadlock timeout (or lock timeout) handler registered.
I'm not sure if this is broken since 8bfd1a884 introducing those session
locks, or if it's caused by the new timeout infrastructure
(f34c68f09f34c68f09).
I guess the easiest way to fix this would be to make this a loop like
ResolveRecoveryConfictWithLock():
LOCKTAG tag;
SET_LOCKTAG_OBJECT(tag,
InvalidOid,
DatabaseRelationId,
xlrec->db_id,
objsubid);
while (!lock_acquired)
{
while (CountDBBackends(dbid) > 0)
{
CancelDBBackends(dbid, PROCSIG_RECOVERY_CONFLICT_DATABASE, true);
/*
* Wait awhile for them to die so that we avoid flooding an
* unresponsive backend when system is heavily loaded.
*/
pg_usleep(10000);
}
if (LockAcquireExtended(&locktag, AccessExclusiveLock, true, true, false)
!= LOCKACQUIRE_NOT_AVAIL)
lock_acquired = true;
}
afaics, that should work? Not pretty, but probably easier than starting
to reason about the deadlock detector in the startup process.
We probably should also add a Assert(!InRecovery || sessionLock) to
LockAcquireExtended() - these kind of problems are otherwise hard to
find in a developer setting.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2015-01-26 21:29:13 | Re: New CF app deployment |
Previous Message | Jim Nasby | 2015-01-26 21:23:15 | Re: pgaudit - an auditing extension for PostgreSQL |