Re: PgbackRest PointTIme Recovery : server unable to start back

From: Muhammad Ikram <mmikram(at)gmail(dot)com>
To: KK CHN <kkchn(dot)in(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: PgbackRest PointTIme Recovery : server unable to start back
Date: 2024-07-26 06:57:39
Message-ID: CAGeimVrKesB0QV6vR-o3x2OzcYjTEcAic90XiG_JqMkSRkCYVA@mail.gmail.com
Lists: pgsql-general

I don't think this has anything to do with the pg_hba.conf file. A WAL
archive segment seems to be missing. I would suggest taking a fresh full
backup, since you have mentioned that you are able to take full backups.
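
For example, something along these lines could be run on the repository
host (this assumes the stanza name and the postgres user shown later in
this thread; please adjust for your environment):

$ sudo -u postgres pgbackrest --stanza=Demo_Repo --type=full backup
$ sudo -u postgres pgbackrest --stanza=Demo_Repo info

The first command takes a new full backup; the second should show the new
backup set and its WAL start/stop range in the repository.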

Regards,
Ikram

On Fri, Jul 26, 2024 at 11:24 AM KK CHN <kkchn(dot)in(at)gmail(dot)com> wrote:

> Here are the log outputs.
>
> When I ran
>
> $sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
> --set=20240719-122703F_20240724-094727I --target-timeline=current
> --type=time "--target=2024-07-24 09:44:01.3255+05:30"
> --target-action=promote restore
>
>
>
> The EDB server log output shows:
> ...........................................................................
> ................................................................
> 2024-07-23 15:34:12 IST ERROR: syntax error at or near "selecct" at
> character 1
> 2024-07-23 15:34:12 IST STATEMENT: selecct current_timestamp
> 2024-07-23 15:34:26 IST FATAL: database "enterprisedb" does not exist
> 2024-07-24 09:38:20 IST FATAL: database "enterprisedb" does not exist
> 2024-07-24 09:42:17 IST FATAL: database "test1" does not exist
> 2024-07-24 09:43:56 IST FATAL: database "enterprisedb" does not exist
> 2024-07-24 09:45:07 IST FATAL: database "enterprisedb" does not exist
> 2024-07-24 09:45:16 IST ERROR: relation "important_table" does not exist
> at character 58
> 2024-07-24 09:45:16 IST STATEMENT: begin; drop table important_table;
> commit; select * from important_table;
> 2024-07-24 09:46:00 IST LOG: checkpoint starting: time
> 2024-07-24 09:46:02 IST LOG: checkpoint complete: wrote 16 buffers
> (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=1.506 s,
> sync=0.001 s, total=1.509 s; sync files=9, longest=0.001 s, average=0.001
> s; distance=93 kB, estimate=171 kB; lsn=0/3C045768, redo lsn=0/3C045730
> 2024-07-24 09:47:27 IST LOG: checkpoint starting: immediate force wait
> 2024-07-24 09:47:27 IST LOG: checkpoint complete: wrote 2 buffers (0.0%);
> 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s,
> total=0.003 s; sync files=1, longest=0.001 s, average=0.001 s;
> distance=16106 kB, estimate=16106 kB; lsn=0/3D000060, redo lsn=0/3D000028
> 2024-07-24 09:47:27.788 P00 INFO: archive-push command begin 2.52.1:
> [pg_wal/00000009000000000000003C] --exec-id=34036-c1d6de0b
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-24 09:47:28.130 P00 INFO: pushed WAL file
> '00000009000000000000003C' to the archive
> 2024-07-24 09:47:28.230 P00 INFO: archive-push command end: completed
> successfully (444ms)
>
>
>
> When I ran $ sudo systemctl start edb-as-16.service, the server log shows:
>
> 2024-07-26 11:32:56 IST LOG: starting PostgreSQL 16.3 (EnterpriseDB Advanced
> Server 16.3.0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1
> 20231218 (Red Hat 11.4.1-3), 64-bit
> 2024-07-26 11:32:56 IST LOG: listening on IPv4 address "0.0.0.0", port
> 5444
> 2024-07-26 11:32:56 IST LOG: listening on IPv6 address "::", port 5444
> 2024-07-26 11:32:56 IST LOG: listening on Unix socket "/tmp/.s.PGSQL.5444"
> 2024-07-26 11:32:56 IST LOG: database system was interrupted; last known
> up at 2024-07-24 09:47:27 IST
> 2024-07-26 11:32:56 IST LOG: starting point-in-time recovery to
> 2024-07-24 09:44:01.3255+05:30
> 2024-07-26 11:32:56 IST LOG: starting backup recovery with redo LSN
> 0/3D000028, checkpoint LSN 0/3D000060, on timeline ID 9
> 2024-07-26 11:32:56.475 P00 INFO: archive-get command begin 2.52.1:
> [00000009.history, pg_wal/RECOVERYHISTORY] --exec-id=43293-1d583a8e
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-26 11:32:56.709 P00 INFO: found 00000009.history in the repo1:
> 16-1 archive
> 2024-07-26 11:32:56.809 P00 INFO: archive-get command end: completed
> successfully (335ms)
> 2024-07-26 11:32:56 IST LOG: restored log file "00000009.history" from
> archive
> 2024-07-26 11:32:56.816 P00 INFO: archive-get command begin 2.52.1:
> [00000009000000000000003D, pg_wal/RECOVERYXLOG] --exec-id=43295-55c5d6f7
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-26 11:32:56.999 P00 INFO: found 00000009000000000000003D in the
> repo1: 16-1 archive
> 2024-07-26 11:32:57.099 P00 INFO: archive-get command end: completed
> successfully (285ms)
> 2024-07-26 11:32:57 IST LOG: restored log file "00000009000000000000003D"
> from archive
> 2024-07-26 11:32:57 IST LOG: redo starts at 0/3D000028
> 2024-07-26 11:32:57.114 P00 INFO: archive-get command begin 2.52.1:
> [00000009000000000000003E, pg_wal/RECOVERYXLOG] --exec-id=43297-34867c05
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-26 11:32:57.282 P00 INFO: found 00000009000000000000003E in the
> repo1: 16-1 archive
> 2024-07-26 11:32:57.382 P00 INFO: archive-get command end: completed
> successfully (270ms)
> 2024-07-26 11:32:57 IST LOG: restored log file "00000009000000000000003E"
> from archive
> 2024-07-26 11:32:57.400 P00 INFO: archive-get command begin 2.52.1:
> [00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43299-e2db2e1b
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-26 11:32:57.521 P00 INFO: unable to find
> 00000009000000000000003F in the archive
> 2024-07-26 11:32:57.621 P00 INFO: archive-get command end: completed
> successfully (222ms)
> 2024-07-26 11:32:57 IST LOG: completed backup recovery with redo LSN
> 0/3D000028 and end LSN 0/3D000100
> 2024-07-26 11:32:57 IST LOG: consistent recovery state reached at
> 0/3D000100
> 2024-07-26 11:32:57 IST LOG: database system is ready to accept read-only
> connections
> 2024-07-26 11:32:57.632 P00 INFO: archive-get command begin 2.52.1:
> [00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43301-f613dae9
> --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
> 2024-07-26 11:32:57.761 P00 INFO: unable to find
> 00000009000000000000003F in the archive
> 2024-07-26 11:32:57.861 P00 INFO: archive-get command end: completed
> successfully (231ms)
> 2024-07-26 11:32:57 IST LOG: redo done at 0/3E000060 system usage: CPU:
> user: 0.00 s, system: 0.00 s, elapsed: 0.75 s
> 2024-07-26 11:32:57 IST FATAL: recovery ended before configured recovery
> target was reached
> 2024-07-26 11:32:57 IST LOG: startup process (PID 43292) exited with exit
> code 1
>
>
> The only inference I can make is from this line:
>
>     INFO: unable to find 00000009000000000000003F in the archive
>
> Does this mean the EDB server (10.10.20.6) was unable to push that archive
> segment to the repo server (10.10.20.7)? Is that the reason the recovery
> fails and the EDB server does not start back up?
>
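> To check whether that segment ever reached the repository, I could
> presumably list the archive directory in the repo, something like this on
> the repo server (assuming pgbackrest 2.26+ for repo-ls and the default
> repo layout; the 16-1 archive id is taken from the log lines above):
>
> $ sudo -u postgres pgbackrest repo-ls archive/Demo_Repo/16-1/0000000900000000
>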
>
> The pg_hba.conf entries on the EDB server machine are as follows:
>
> host    all             all             127.0.0.1/32            ident
> host    all             all             10.10.20.7/32           scram-sha-256
> #host   all             all             10.10.20.7/32           trust
> # IPv6 local connections:
> host    all             all             ::1/128                 ident
> #host   all             all             10.10.20.7/24           trust
>
> # Allow replication connections from localhost, by a user with the
> # replication privilege.
> local   replication     all                                     peer
> host    replication     all             10.10.20.7/32           scram-sha-256
> host    replication     all             127.0.0.1/32            ident
> host    replication     all             ::1/128                 ident
>
>
> Do I have to change anything in pg_hba.conf?
>
>
> My EDB server configuration is as follows:
>
> archive_mode = on
> archive_command = 'pgbackrest --stanza=Demo_Repo archive-push %p'
> log_filename = 'postgresql.log'
> max_wal_senders = 5
> wal_level = replica
>
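> From what I understand, the archiving setup can be verified end-to-end
> with the check command, run as the enterprisedb user on the EDB server,
> something like:
>
> $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo check
>
> which forces a WAL switch and confirms the resulting segment can be pushed
> to and read back from the repository.
>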
>
> Any help?
>
> Krishane
>
>
>
>
> On Fri, Jul 26, 2024 at 10:45 AM Muhammad Ikram <mmikram(at)gmail(dot)com> wrote:
>
>> Hi KK CHN
>>
>> Could you check the server logs?
>> Your service trace suggests that the server started and then a failure
>> occurred:
>>
>> Jul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced Server
>> 16.
>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
>> exited, code=exited, status=1/FAILURE
>>
>>
>>
>> Regards,
>> Ikram
>>
>>
>> On Fri, Jul 26, 2024 at 10:04 AM KK CHN <kkchn(dot)in(at)gmail(dot)com> wrote:
>>
>>> List,
>>>
>>> Reference: https://pgbackrest.org/user-guide-rhel.html#pitr
>>> I am following the PITR procedure on RHEL 9 with EPAS 16.
>>> I am able to take backups (full, diff and incr), and restoring from a
>>> full backup and restarting EPAS 16 works fine.
>>>
>>> But when I take an incremental backup, follow the procedure described
>>> in the PITR section of the reference link above, restore the EDB
>>> database from that incremental backup, and then start up EPAS 16, the
>>> server always ends up in a dead state.
>>>
>>> My repo server is a separate machine. If I do a full restore on the
>>> DB server (sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
>>> restore), it works and the server starts without any issue.
>>> Restoring from the incremental backup shows success in the tty output,
>>> but starting the edb service then fails.
>>>
>>> Any help is much appreciated.
>>>
>>> Krishane.
>>>
>>>
>>>
>>>
>>> STEPS followed:
>>>
>>> After dropping the table (the "pg-primary ⇒ Drop the important table"
>>> section of the guide), I stopped the EDB server.
>>>
>>> $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
>>> --set=20240719-122703F_20240724-094727I --target-timeline=current
>>> --type=time "--target=2024-07-24 09:44:01.3255+05:30"
>>> --target-action=promote restore
>>> .....................................................
>>>
>>> 2024-07-26 09:48:06.343 P00 INFO: restore command end: completed
>>> successfully (1035ms)
>>>
>>>
>>> But
>>>
>>> [root(at)rservice01 ~]# sudo systemctl start edb-as-16.service
>>> [root(at)service01 ~]# sudo systemctl status edb-as-16.service
>>> × edb-as-16.service - EDB Postgres Advanced Server 16
>>> Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled;
>>> preset: disabled)
>>> Active: failed (Result: exit-code) since Fri 2024-07-26 09:48:50
>>> IST; 8s ago
>>> Duration: 242ms
>>> Process: 41903 ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir
>>> ${PGDATA} (code=exited, status=0/SUCCESS)
>>> Process: 41908 ExecStart=/usr/edb/as16/bin/edb-postgres -D ${PGDATA}
>>> (code=exited, status=1/FAILURE)
>>> Main PID: 41908 (code=exited, status=1/FAILURE)
>>> CPU: 331ms
>>>
>>> Jul 26 09:48:48 service01 systemd[1]: Starting EDB Postgres Advanced
>>> Server 16...
>>> Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
>>> LOG: redirecting log output to logging collector process
>>> Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
>>> HINT: Future log output will appear in directory "log".
>>> Jul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced
>>> Server 16.
>>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
>>> exited, code=exited, status=1/FAILURE
>>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Killing process
>>> 41909 (edb-postgres) with signal SIGKILL.
>>> Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Failed with
>>> result 'exit-code'.
>>> [root(at)service01 ~]#
>>>
>>> Why is it unable to perform a restore and recovery from an incremental
>>> backup?
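>>>
>>> For what it is worth, the detailed timestamps of the backup set I
>>> restored from can also be checked with the info command (I believe the
>>> --set option is available in this pgbackrest version):
>>>
>>> $ sudo -u postgres pgbackrest --stanza=Demo_Repo info --set=20240719-122703F_20240724-094727I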
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On the repo server:
>>> [root(at)service02 ~]# sudo -u postgres pgbackrest --stanza=Demo_Repo info
>>> stanza: Demo_Repo
>>> status: ok
>>> cipher: aes-256-cbc
>>>
>>> db (current)
>>> wal archive min/max (16):
>>> 000000020000000000000021/0000000B0000000000000041
>>>
>>> full backup: 20240719-122703F
>>> timestamp start/stop: 2024-07-19 12:27:03+05:30 / 2024-07-19
>>> 12:27:06+05:30
>>> wal start/stop: 00000002000000000000002A /
>>> 00000002000000000000002A
>>> database size: 61.7MB, database backup size: 61.7MB
>>> repo1: backup size: 9.6MB
>>>
>>> incr backup: 20240719-122703F_20240719-123353I
>>> timestamp start/stop: 2024-07-19 12:33:53+05:30 / 2024-07-19
>>> 12:33:56+05:30
>>> wal start/stop: 00000002000000000000002C /
>>> 00000002000000000000002C
>>> database size: 61.7MB, database backup size: 6.4MB
>>> repo1: backup size: 6.2KB
>>> backup reference list: 20240719-122703F
>>>
>>> diff backup: 20240719-122703F_20240719-123408D
>>> timestamp start/stop: 2024-07-19 12:34:08+05:30 / 2024-07-19
>>> 12:34:10+05:30
>>> wal start/stop: 00000002000000000000002E /
>>> 00000002000000000000002E
>>> database size: 61.7MB, database backup size: 6.4MB
>>> repo1: backup size: 6.4KB
>>> backup reference list: 20240719-122703F
>>>
>>> incr backup: 20240719-122703F_20240723-110212I
>>> timestamp start/stop: 2024-07-23 11:02:12+05:30 / 2024-07-23
>>> 11:02:15+05:30
>>> wal start/stop: 000000070000000000000038 /
>>> 000000070000000000000038
>>> database size: 48MB, database backup size: 6.4MB
>>> repo1: backup size: 9.8KB
>>> backup reference list: 20240719-122703F,
>>> 20240719-122703F_20240719-123408D
>>>
>>> incr backup: 20240719-122703F_20240723-141818I
>>> timestamp start/stop: 2024-07-23 14:18:18+05:30 / 2024-07-23
>>> 14:18:22+05:30
>>> wal start/stop: 00000008000000000000003C /
>>> 00000008000000000000003C
>>> database size: 75.4MB, database backup size: 33.8MB
>>> repo1: backup size: 4.7MB
>>> backup reference list: 20240719-122703F,
>>> 20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I
>>>
>>> incr backup: 20240719-122703F_20240724-094727I
>>> timestamp start/stop: 2024-07-24 09:47:27+05:30 / 2024-07-24
>>> 09:47:30+05:30
>>> wal start/stop: 00000009000000000000003D /
>>> 00000009000000000000003D
>>> database size: 62MB, database backup size: 11.5MB
>>> repo1: backup size: 72KB
>>> backup reference list: 20240719-122703F,
>>> 20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I,
>>> 20240719-122703F_20240723-141818I
>>>
>>>
>>>
>>>
>>
>> --
>> Muhammad Ikram
>>
>>

--
Muhammad Ikram
