From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | alfonso(dot)vicente(at)logos(dot)com(dot)uy |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #8543: Standby recovery use incorrect timeline to determine WAL length |
Date: | 2013-10-22 00:42:48 |
Message-ID: | CAB7nPqQAB7knC9dXbCuqJjWa_vWepwM2Kk1gJMLzCJhPGwq-Sg@mail.gmail.com |
Lists: | pgsql-bugs |
On Mon, Oct 21, 2013 at 11:31 PM, <alfonso(dot)vicente(at)logos(dot)com(dot)uy> wrote:
> PostgreSQL version: 9.1.3
This version is outdated, and 9.1.9 even contains important security
fixes, so you should update to the latest minor release of 9.1, which
is 9.1.10.
> But... we don't stop the cluster in main-site. So, the cluster in main-site
> continues to archive logs in the /archlogs directory, and archives the WAL
> 00000001000000FE00000002 of 16MB.
This is fine: the old and new WAL files are not going to overlap
because their timeline IDs differ.
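
To illustrate why the two names cannot collide: a WAL segment file name
is 24 hexadecimal digits, and the first 8 are the timeline ID. Here is a
minimal Python sketch that splits the two names from this report. The
helper name parse_wal_name is hypothetical (it is not a PostgreSQL API),
and the labels for the second and third fields follow the 9.1 naming
scheme.

```python
import re

# 24 hex digits: timeline ID, log file number, segment number (8 digits each).
WAL_NAME = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$")


def parse_wal_name(name):
    """Hypothetical helper: split a WAL segment name into (timeline, log, seg)."""
    match = WAL_NAME.match(name)
    if match is None:
        raise ValueError("not a WAL segment file name: %r" % name)
    return tuple(int(group, 16) for group in match.groups())


old = parse_wal_name("00000001000000FE00000002")  # (1, 254, 2)
new = parse_wal_name("00000002000000FE00000002")  # (2, 254, 2)

# Same log/segment position, different timeline, hence different file names.
print(old[1:] == new[1:], old[0] != new[0])
```

Both files cover the same log/segment position, but the differing first
field keeps them apart in the archive directory.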
> Immediately, we perform a backup in the backup-site, to prepare for
> switchback. The pg_stop_backup() forces a WAL switch, and generates a WAL
> 00000002000000FE00000002 of 2MB. Then, the archive_command script performs
> the log shipping to the main-site and stores the 00000002000000FE00000002 in
> the same /archlogs directory
As far as I recall, the size of the WAL segment archived after
pg_stop_backup should be 16MB. Isn't this a problem?
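
One way to check this before attempting recovery is to scan the archive
for segments that are not the full size. This is only a rough sketch,
assuming the default 16MB segment size (a custom build may use another
value) and uncompressed files named with 24 hex digits; the function
names are hypothetical.

```python
import os
import re

# Assumption: default WAL segment size; this is a compile-time setting.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # 16777216 bytes

# Only plain segment names; .history, .backup and compressed files are skipped.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def is_full_segment(path, expected=WAL_SEGMENT_SIZE):
    """Return True if the file at path has exactly the expected size."""
    return os.path.getsize(path) == expected


def find_short_segments(archive_dir, expected=WAL_SEGMENT_SIZE):
    """Return sorted names of WAL segments in archive_dir with a wrong size."""
    short = []
    for name in os.listdir(archive_dir):
        path = os.path.join(archive_dir, name)
        if WAL_NAME.match(name) and not is_full_segment(path, expected):
            short.append(name)
    return sorted(short)
```

Calling find_short_segments() on the archive directory would list any
segment that would be rejected for having the wrong size.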
> When we execute the pg_ctl promote in the main-site, the postgresql.log shows
> the following error:
> sh: /home/postgres/9.1/backups/archlogs/00000002.history.gz: No such file or
> directory
> 2013-10-19 22:39:59 UYST LOG: starting archive recovery
> 2013-10-19 22:39:59 UYST FATAL: archive file "00000002000000FE00000002"
> has wrong size: 2293760 instead of 16777216
>
> If we drop all the timeline ID 1 archived WALs, the failback executes
> successfully
9.1.8 and 9.1.10 include a couple of fixes related to timeline handling
of WAL files and detection of a database consistency point. Hence I
would recommend trying once again with the latest version.
> The documentation says: "The default behavior of recovery is to recover along
> the same timeline that was current when the base backup was taken". The base
> backup in backup-site was timeline ID 2
True.
--
Michael