From: | "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com> |
---|---|
To: | Victor Yegorov <vyegorov(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Deadlock between backend and recovery may not be detected |
Date: | 2020-12-16 14:28:33 |
Message-ID: | fadffa0b-3a5c-8cc9-2555-a823cb69450f@amazon.com |
Lists: | pgsql-hackers |
Hi,
On 12/16/20 2:36 PM, Victor Yegorov wrote:
>
> On Wed, 16 Dec 2020 at 13:49, Fujii Masao
> <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
> After doing this procedure, you can see that the startup process and
> the backend each wait for a table lock held by the other, i.e., a
> deadlock. But this deadlock remains even after deadlock_timeout passes.
>
> This seems to be a bug to me.
>
+1
>
> > * Deadlocks involving the Startup process and an ordinary backend process
> > * will be detected by the deadlock detector within the ordinary backend.
>
> The cause of this issue seems to be that
> ResolveRecoveryConflictWithLock(), which the startup process calls
> when a recovery conflict on a lock happens, doesn't handle the
> deadlock case at all. You can see this by reading the above source
> code comment for ResolveRecoveryConflictWithLock().
>
> To fix this issue, I think that we should enable the
> STANDBY_DEADLOCK_TIMEOUT timer in ResolveRecoveryConflictWithLock()
> so that the startup process can send the
> PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
> Then, if the PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
> the backend should check whether a deadlock has actually happened.
> Attached is the POC patch implementing this.
>
Good catch!
I don't see any obvious reason why the STANDBY_DEADLOCK_TIMEOUT timer
shouldn't be set in ResolveRecoveryConflictWithLock() too (it is already
set in ResolveRecoveryConflictWithBufferPin()).
So +1 to considering this a bug and to the way the patch proposes to
fix it.
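To make the proposed flow concrete, here is a rough toy model of the decision logic. It is not the patch and not PostgreSQL code: the type and function names below (StartupAction, next_startup_action, backend_should_abort) and the millisecond parameters are made up for illustration, and the "standby limit" stands in for whatever delay limit the startup process already honors before cancelling conflicting backends.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model only: every name here is hypothetical and none of this is
 * PostgreSQL's actual code. It models what the startup process would
 * do next while it waits for a lock held by a backend.
 */
typedef enum
{
    WAIT_MORE,              /* keep waiting for the lock */
    ASK_BACKEND_TO_CHECK,   /* deadlock timer fired: signal the backend */
    CANCEL_CONFLICTING      /* standby delay limit reached: cancel holders */
} StartupAction;

static StartupAction
next_startup_action(int waited_ms, int deadlock_timeout_ms,
                    int standby_limit_ms, bool deadlock_check_requested)
{
    assert(waited_ms >= 0 && deadlock_timeout_ms >= 0 && standby_limit_ms >= 0);

    /* The existing behavior: once the delay limit passes, cancel. */
    if (waited_ms >= standby_limit_ms)
        return CANCEL_CONFLICTING;

    /* The proposed addition: after the deadlock timeout, ask the
     * backend (once) to run its own deadlock check. */
    if (!deadlock_check_requested && waited_ms >= deadlock_timeout_ms)
        return ASK_BACKEND_TO_CHECK;

    return WAIT_MORE;
}

/*
 * Backend side of the model: the signal alone is not enough to abort;
 * the backend aborts only if its own check confirms a deadlock.
 */
static bool
backend_should_abort(bool got_deadlock_check_signal, bool deadlock_found)
{
    return got_deadlock_check_signal && deadlock_found;
}
```

The point of the model is that the signal only asks the backend to check: a backend that is merely slow, and not actually deadlocked with the startup process, would keep running until the usual delay limit applies.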
Bertrand