From: | Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it> |
---|---|
To: | francesco(dot)canovai(at)2ndquadrant(dot)it, pgsql-bugs(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: BUG #14230: Wrong timeline returned by pg_stop_backup on a standby |
Date: | 2016-07-06 15:57:34 |
Message-ID: | bdf05251-65d3-5847-671f-50a7cc3aa64b@2ndquadrant.it |
Lists: | pgsql-bugs pgsql-hackers |
On 06/07/16 17:41, Marco Nenciarini wrote:
> On 06/07/16 17:37, Marco Nenciarini wrote:
>> Hi,
>>
>> On 06/07/16 17:07, francesco(dot)canovai(at)2ndquadrant(dot)it wrote:
>>> The following bug has been logged on the website:
>>>
>>> Bug reference: 14230
>>> Logged by: Francesco Canovai
>>> Email address: francesco(dot)canovai(at)2ndquadrant(dot)it
>>> PostgreSQL version: 9.6beta2
>>> Operating system: Linux
>>> Description:
>>>
>>> I'm taking a concurrent backup from a standby in PostgreSQL beta2 and I get
>>> the wrong timeline from pg_stop_backup(false).
>>>
>>> This is what I'm doing:
>>>
>>> 1) I set up an environment with a primary server and a replica in streaming
>>> replication.
>>>
>>> 2) On the replica, I run
>>>
>>> postgres=# SELECT pg_start_backup('test_backup', true, false);
>>> pg_start_backup
>>> -----------------
>>> 0/3000A00
>>> (1 row)
>>>
>>> 3) When I run pg_stop_backup, it returns a start WAL location whose file
>>> name carries timeline 0.
>>>
>>> postgres=# SELECT pg_stop_backup(false);
>>> pg_stop_backup
>>>
>>> ---------------------------------------------------------------------------
>>> (0/3000AE0,"START WAL LOCATION: 0/3000A00 (file
>>> 000000000000000000000003)+
>>> CHECKPOINT LOCATION: 0/3000A38
>>> +
>>> BACKUP METHOD: streamed
>>> +
>>> BACKUP FROM: standby
>>> +
>>> START TIME: 2016-07-06 16:44:31 CEST
>>> +
>>> LABEL: test_backup
>>> +
>>> ","")
>>> (1 row)
>>>
>>> The timeline returned is correct (it is 1) when running the same commands
>>> on the master.
>>>
>>> An incorrect backup label doesn't prevent PostgreSQL from starting up, but
>>> it affects the tools using that information.
>>>
>>>
>>
>> The issue here is that the do_pg_stop_backup function uses the
>> ThisTimeLineID variable that is not valid on standbys.
>>
>> I think that it should read it from
>> ControlFile->checkPointCopy.ThisTimeLineID as we do in do_pg_start_backup.
>>
>
> No, that's not the solution.
>
> The backup_label is generated during the do_pg_start_backup call, so
> also the copy in ControlFile->checkPointCopy.ThisTimeLineID is
> uninitialized.
>
After further analysis, the issue is that we retrieve starttli from
the ControlFile structure, but the backup label is written using
ThisTimeLineID instead.
I've attached a very simple patch that fixes it.
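
To make the symptom easier to see, here is a minimal Python sketch (not the
patch itself) of how a WAL segment file name is derived from a timeline ID
and an LSN. It assumes the default 16 MB segment size, and the helper names
parse_lsn, wal_file_name and timeline_of are made up for illustration only.

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '0/3000A00' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_file_name(tli: int, lsn: str, seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Build the 24-hex-digit segment file name for a timeline and LSN."""
    segno = parse_lsn(lsn) // seg_size
    segs_per_xlogid = 0x100000000 // seg_size
    return "%08X%08X%08X" % (
        tli,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )


def timeline_of(file_name: str) -> int:
    """The first 8 hex digits of a segment file name are the timeline ID."""
    return int(file_name[:8], 16)


if __name__ == "__main__":
    # A timeline of 0 yields the file name seen in the bug report.
    print(wal_file_name(0, "0/3000A00"))
    # Timeline 1 yields the name one would expect on this standby.
    print(wal_file_name(1, "0/3000A00"))
```

With the start location 0/3000A00 from the report, a timeline of 0 gives
000000000000000000000003, which is the file name in the backup label above,
while timeline 1 gives 000000010000000000000003.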
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco(dot)nenciarini(at)2ndQuadrant(dot)it | www.2ndQuadrant.it
Attachment | Content-Type | Size |
---|---|---|
timeline.patch | text/x-patch | 652 bytes |