From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com> |
Cc: | Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com> |
Subject: | Re: Failed recovery with new faster 2PC code |
Date: | 2017-04-19 02:09:00 |
Message-ID: | CAMkU=1y98=hMk=giv8LDszkZqGgTkk2yYWeHPiz+4SN6m7RL5g@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Apr 18, 2017 at 1:17 AM, Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
wrote:
> Hi,
>
> There was a bug in the redo 2PC remove code path. Because of it,
> autovac would think that the 2PC was gone and remove the
> corresponding clog entry earlier than needed.
>
> Please find attached, the bug fix: 2pc_redo_remove_bug.patch.
>
> I have been testing this on top of Michael's 2pc-restore-fix.patch and
> things seem to be ok for the past one+ hour. I will keep it running for longer.
>
> Jeff, thanks for these very useful scripts. I am going to make a habit of
> running these scripts on my side from now on. Do you have any other script that
> I could try against these patches? Please let me know.
>
This script is the only one I have that specifically targets 2PC. I wrote
it last year when the previous round of speed-up code (which avoided
writing the files upon "PREPARE" by delaying them until the next
checkpoint) was developed. I just decided to dust that test off to try
again here. I don't know how to change it to make it more targeted towards
this set of patches. Would this bug have been seen on a replica server in
the absence of crashes, or could it only be triggered during crash recovery
and not during streaming replication?
Cheers,
Jeff