From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Greg Stark <stark(at)mit(dot)edu> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Deadlocks in HS (on 9.0 :( ) |
Date: | 2014-07-16 04:25:55 |
Message-ID: | 20140716042555.GA2165511@tornado.leadboat.com |
Lists: | pgsql-hackers |
On Tue, Jul 15, 2014 at 04:54:05PM +0100, Greg Stark wrote:
> We've observed a 9.0 database have undetected deadlocks repeatedly in
> hot standby mode.
>
> I think what's happening is that autovacuum is kicking off a VACUUM of
> some system catalogs -- it seems to usually be pg_statistic's toast
> table actually. At the end of the vacuum it briefly gets the exclusive
> lock to truncate the table. On the standby it replays that and records
> the exclusive lock being taken. It then sees a cleanup record that
> pauses replay because a HS standby transaction is running that can see
> the xid being cleaned up. That transaction then blocks against the
> exclusive lock and deadlocks against recovery.
>
> We expect upgrading to 9.3 to fix the problem for us due to the xid
> feedback mechanism. But is this still a known problem when feedback is
> not enabled?

This is the first I've heard of the problem.

> And is it a problem we should try to find a backpatchable
> fix for?

Yes. Undetected deadlock entirely within the confines of the system is a
clear bug, so let's back-patch if the fix proves suitable for that.

> I'm pondering whether we really need to log the exclusive lock taken
> by vacuum when truncating. Worst case is a scan is in progress,
> perhaps we can make scans understand how to handle tables that have
> been truncated concurrently? We could always make the truncate replay
> command acquire the lock and release it itself right away.

Perhaps so. Heikki had a broader design in that area:
http://www.postgresql.org/message-id/flat/5193AB47.3070801@vmware.com

The lock VACUUM takes before truncating a relation is the main (only?) source
of spontaneous recovery conflicts not addressed by hot_standby_feedback, so
any of the above would constitute a nice step forward.

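
The cycle Greg describes can be sketched as a toy wait-for graph. This is an
illustration only, not PostgreSQL code: the node names, the two edge
dictionaries, `merge`, and `find_cycle` are all hypothetical. It also assumes
that the startup process's wait for the standby snapshot is not recorded as a
lock-manager wait, which would explain why the deadlock goes undetected.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[str, Set[str]]  # edge a -> b means "a is waiting for b"

# Illustrative node names, not PostgreSQL identifiers.
STARTUP = "startup process (WAL replay)"
BACKEND = "standby query backend"

# Wait known to the lock manager: the query is queued behind the exclusive
# lock that replay took on behalf of VACUUM's truncation.
lock_waits: Graph = {BACKEND: {STARTUP}}

# Wait ASSUMED to be invisible to the lock manager: replay is paused on the
# cleanup record until the query's snapshot goes away.
snapshot_waits: Graph = {STARTUP: {BACKEND}}


def merge(*graphs: Graph) -> Graph:
    """Union several wait-for graphs into one."""
    merged: Graph = {}
    for graph in graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def find_cycle(graph: Graph) -> Optional[List[str]]:
    """Depth-first search; return one cycle as a node list, or None."""
    path: List[str] = []   # nodes on the current DFS path
    done: Set[str] = set()  # nodes fully explored without finding a cycle

    def visit(node: str) -> Optional[List[str]]:
        if node in path:
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in sorted(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    # A detector that only sees lock waits finds nothing to break.
    print("lock waits only:", find_cycle(lock_waits))
    # Once the snapshot wait is included, the cycle appears.
    print("all waits:", find_cycle(merge(lock_waits, snapshot_waits)))
```

Under that assumption, a detector that inspects only `lock_waits` sees a
single edge and no cycle, while the combined graph contains the two-node
cycle between replay and the standby query.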
--
Noah Misch
EnterpriseDB http://www.enterprisedb.com