Re: BUG #13559: WAL replay stuck after DROP VIEW

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: maciek(at)heroku(dot)com
Cc: PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13559: WAL replay stuck after DROP VIEW
Date: 2015-08-11 03:18:05
Message-ID: CAB7nPqQqX8sunfTXXsOCDkrVnrW4-i46wZb5Sn6buA-szq1y-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Aug 11, 2015 at 7:31 AM, <maciek(at)heroku(dot)com> wrote:
> We had some code in production that automatically dropped and recreated
> views periodically. This database also has a replica that serves some
> moderately intensive queries (read: on the order of several minutes). This
> generally this works fine, but we ran into an issue the other day where the
> startup process on the replica was holding a bunch of AccessExclusive locks
> on these views (presumably due to the DROP) and would not progress even
> though there were no conflicting queries (there may very well have been
> queries against these views at one point, but not not when I looked--all the
> locks held by the startup process showed up as granted in pg_locks). This
> resolved when we restarted the replica.

Yes, possible. Access exclusive lock is taken when dropping the
relation before removing it from heap.

> We're on 9.4.1, but skimming through the 9.4.2-through-9.4.4 release notes,
> I don't see anything relevant.
>
> Could this be an outstanding bug? For what it's worth, we've been running
> the view drop / recreate for about 9 months, totalling probably ~240 drops /
> creates and this is the first time we've run into an issue.

The closest fix post-9.4.4 fix in this area is this commit:
commit: bab959906911c97437f410a03b0346e6dd28d528
author: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
date: Sun, 2 Aug 2015 20:09:05 +0300
Fix race condition that lead to WALInsertLock deadlock with commit_delay.

But this fixed a problem on master where queries got stuck because of
the WAL insert lock patch that has been introduced in 9.4 because of a
race condition between two backends.

Do you have some remnants of the logs when this problem happened? Or
even better, a stack trace to understand where the startup process got
stucked? That's perhaps unrelated, but were you using commit_delay?

I have been playing with pgbench, creating and dropping views with a
couple of hundred connections on 9.4.1 but could not reproduce the
issue.
Regards,
--
Michael

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Maciek Sakrejda 2015-08-11 04:03:20 Re: BUG #13559: WAL replay stuck after DROP VIEW
Previous Message maciek 2015-08-10 22:31:14 BUG #13559: WAL replay stuck after DROP VIEW