From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Assertion failure when promoting node by deleting recovery.conf and restart node |
Date: | 2013-05-19 16:35:22 |
Message-ID: | CA+U5nM+O+30Y9=+e42dAB1Pef1WP8dSUsp_C41-sMVMe5F3NdQ@mail.gmail.com |
Lists: | pgsql-hackers |
On 25 March 2013 19:14, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
> On 15.03.2013 04:25, Michael Paquier wrote:
>>
>> Hi,
>>
>> When trying to *promote* a slave to master by removing recovery.conf and
>> restarting the node, I found an assertion failure on the master branch:
>> LOG: database system was shut down in recovery at 2013-03-15 10:22:27 JST
>> TRAP: FailedAssertion("!(ControlFile->minRecoveryPointTLI != 1)", File:
>> "xlog.c", Line: 4954)
>> (gdb) bt
>> #0 0x00007f95af03b2c5 in raise () from /usr/lib/libc.so.6
>> #1 0x00007f95af03c748 in abort () from /usr/lib/libc.so.6
>> #2 0x000000000086ce71 in ExceptionalCondition (conditionName=0x8f2af0
>> "!(ControlFile->minRecoveryPointTLI != 1)", errorType=0x8f0813
>> "FailedAssertion", fileName=0x8f076b "xlog.c",
>> lineNumber=4954) at assert.c:54
>> #3 0x00000000004fe499 in StartupXLOG () at xlog.c:4954
>> #4 0x00000000006f9d34 in StartupProcessMain () at startup.c:224
>> #5 0x000000000050ef92 in AuxiliaryProcessMain (argc=2,
>> argv=0x7fffa6fc3d20) at bootstrap.c:423
>> #6 0x00000000006f8816 in StartChildProcess (type=StartupProcess) at
>> postmaster.c:4956
>> #7 0x00000000006f39e9 in PostmasterMain (argc=6, argv=0x1c950a0) at
>> postmaster.c:1237
>> #8 0x000000000065d59b in main (argc=6, argv=0x1c950a0) at main.c:197
>>
>> OK, this is not the cleanest way to promote a node, as it doesn't do any
>> promotion-related safety checks, but 9.2 and previous versions allowed
>> doing that properly.
>>
>> The assertion was introduced by commit 3f0ab05 in order to properly
>> record minRecoveryPointTLI in the control file at the end of recovery in
>> the case of a crash.
>> However, in the case of a slave node that is properly shut down in
>> recovery and then restarted as a master, the code path of this assertion
>> is taken.
>> What do you think of the attached patch? It avoids updating
>> recoveryTargetTLI and recoveryTargetIsLatest if the node has been shut
>> down while in recovery.
>> Another possibility would be to add to the assertion some conditions
>> based on the state of ControlFile, but I think it is more consistent
>> simply not to update those fields.
>
>
> Simon, can you comment on this? ISTM we could just remove the assertion and
> update the comment to mention that this can happen. If there is a min
> recovery point, surely we always need to recover to the timeline containing
> that point, so setting recoveryTargetTLI to minRecoveryPointTLI seems
> sensible.
Fixed by using the latest available TLI and removing the assertion.
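
A minimal sketch of the idea, assuming that "latest TLI" means the larger of the last checkpoint's timeline and minRecoveryPointTLI. The struct and function names below are hypothetical stand-ins, and this is not the actual xlog.c code:

```c
#include <stdint.h>

/* Hypothetical stand-in for the control-file fields discussed in this
 * thread; this is NOT the real ControlFileData layout. */
typedef uint32_t TimeLineID;

typedef struct ControlFileSketch
{
    TimeLineID checkPointTLI;       /* timeline of the last checkpoint */
    TimeLineID minRecoveryPointTLI; /* timeline of the min recovery point,
                                     * 0 if none was recorded */
} ControlFileSketch;

/*
 * Choose the recovery target timeline as the newer of the two timelines.
 * No assertion is made about the value of minRecoveryPointTLI, so a node
 * that was cleanly shut down in recovery and restarted without
 * recovery.conf does not trap here.
 */
static TimeLineID
choose_recovery_target_tli(const ControlFileSketch *cf)
{
    if (cf->minRecoveryPointTLI > cf->checkPointTLI)
        return cf->minRecoveryPointTLI;
    return cf->checkPointTLI;
}
```

With this shape, a control file whose minRecoveryPointTLI equals 1 (the case from the report) simply yields timeline 1 instead of failing an assertion.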
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services