Re: Recovery from WAL archives not advancing timeline?

From: Ian Lawrence Barwick <barwick(at)gmail(dot)com>
To: Don Seiler <don(at)seiler(dot)us>, Pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: Re: Recovery from WAL archives not advancing timeline?
Date: 2020-08-08 15:33:54
Message-ID: CAB8KJ=jsEKz6GY0d+bi6Q2_U5_GDSJkaOO8Mh2V+rQ-JHM6_vg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

2020年8月9日(日) 0:12 Don Seiler <don(at)seiler(dot)us>:
>
> On Sat, Aug 8, 2020 at 9:14 AM Ian Lawrence Barwick <barwick(at)gmail(dot)com> wrote:
>>
>>
>> After the failure to fetch the expected WAL file, the standby
>> (replica) should be
>> attempting to fetch the 00000002.history file; do you see any wal-g
>> error messages
>> about that?
>
>
> No. I when I first start the new replica, it shows:
>
> 2020-08-08 06:16:47.536 GMT [30368] LOG: database system was shut down at 2020-08-08 05:29:12 GMT
> 2020-08-08 06:16:47.768 GMT [30379] [unknown](at)[unknown] - [unknown] LOG: connection received: host=[local]
> 2020-08-08 06:16:47.768 GMT [30379] postgres(at)postgres - [unknown] FATAL: the database system is starting up
> 2020-08-08 06:16:47.769 GMT [30380] [unknown](at)[unknown] - [unknown] LOG: connection received: host=[local]
> 2020-08-08 06:16:47.769 GMT [30380] postgres(at)banno_login - [unknown] FATAL: the database system is starting up
> 2020-08-08 06:16:47.849 GMT [30381] [unknown](at)[unknown] - [unknown] LOG: connection received: host=[local]
> 2020-08-08 06:16:47.849 GMT [30381] [unknown](at)[unknown] - [unknown] LOG: incomplete startup packet
> WARNING: 2020/08/08 06:16:47.857808 You are using deprecated functionality that uses an external gpg library.
> It will be removed in next major version.
> Please set GPG key using environment variables WALG_PGP_KEY or WALG_PGP_KEY_PATH.
> ERROR: 2020/08/08 06:16:48.041147 openpgp: invalid argument: no armored data found
> 2020-08-08 06:16:48.042 GMT [30368] LOG: entering standby mode
> ERROR: 2020/08/08 06:16:48.423220 Archive '000000010000142E0000005B' does not exist.
> 2020-08-08 06:16:48.428 GMT [30368] LOG: consistent recovery state reached at 142E/5B000098
> 2020-08-08 06:16:48.428 GMT [30368] LOG: invalid record length at 142E/5B000098: wanted 24, got 0
> 2020-08-08 06:16:48.429 GMT [30366] LOG: database system is ready to accept read only connections
>
> From there I see these messages repeated:
>
> ERROR: 2020/08/08 06:16:49.029836 Archive '000000010000142E0000005B' does not exist.
> WARNING: 2020/08/08 06:16:49.510460 You are using deprecated functionality that uses an external gpg library.
> It will be removed in next major version.
> Please set GPG key using environment variables WALG_PGP_KEY or WALG_PGP_KEY_PATH.
> ERROR: 2020/08/08 06:16:49.513288 openpgp: invalid argument: no armored data found
>
> There's no attempt to look for 00000002.history that I would normally expect when a replica doing WAL restore/recovery runs of out of files to restore.

In that case are you
a) absolutely sure recovery.conf contains "recovery_target_timeline = 'latest'"?
b) the standby was restarted *after* the change was made?

I ask because not attempting to fetch a history file would be a sign
that "recovery_target_timeline"
is not set (or set but not to "latest").

(...)

> There are WAL files in that new replica from when it was still the old primary, would those be causing confusion?:
>
> postgres:~/9.6/main/pg_xlog$ ls -lh
> total 273M
> -rw------- 1 postgres postgres 340 Aug 7 22:21 000000010000142C000000AD.00000028.backup
> -rw------- 1 postgres postgres 16M Aug 8 05:24 000000010000142E00000055
> -rw------- 1 postgres postgres 16M Aug 8 05:25 000000010000142E00000056
> -rw------- 1 postgres postgres 16M Aug 8 05:26 000000010000142E00000057
> -rw------- 1 postgres postgres 16M Aug 8 05:27 000000010000142E00000058
> -rw------- 1 postgres postgres 16M Aug 8 05:28 000000010000142E00000059
> -rw------- 1 postgres postgres 16M Aug 8 05:29 000000010000142E0000005A
> -rw------- 1 postgres postgres 16M Aug 8 05:29 000000010000142E0000005B
> -rw------- 1 postgres postgres 16M Aug 8 05:16 000000010000142E0000005C
> -rw------- 1 postgres postgres 16M Aug 8 05:15 000000010000142E0000005D
> -rw------- 1 postgres postgres 16M Aug 8 05:18 000000010000142E0000005E
> -rw------- 1 postgres postgres 16M Aug 8 05:14 000000010000142E0000005F
> -rw------- 1 postgres postgres 16M Aug 8 05:17 000000010000142E00000060
> -rw------- 1 postgres postgres 16M Aug 8 05:19 000000010000142E00000061
> -rw------- 1 postgres postgres 16M Aug 8 05:22 000000010000142E00000062
> -rw------- 1 postgres postgres 16M Aug 8 05:21 000000010000142E00000063
> -rw------- 1 postgres postgres 16M Aug 8 05:20 000000010000142E00000064
> -rw------- 1 postgres postgres 16M Aug 8 05:23 000000010000142E00000065

From the timestamps, 000000010000142E0000005C and later are files
which were "recycled"
for future use and as such are perfectly normal.

Regards

Ian Barwick

--
Ian Barwick https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Don Seiler 2020-08-08 15:59:10 Re: Recovery from WAL archives not advancing timeline?
Previous Message Hotmail 2020-08-08 14:57:20 Re: PostgreSQL JDBC Driver ERROR: invalid byte sequence for encoding "UTF8": 0x00 on Select stmt