Re: PgBackRest PTR recovery: After table drop to get dropped state

From: KK CHN <kkchn(dot)in(at)gmail(dot)com>
To: Kashif Zeeshan <kashi(dot)zeeshan(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: PgBackRest PTR recovery: After table drop to get dropped state
Date: 2024-08-01 10:48:04
Message-ID: CAKgGyB9SoBQUBRuDOpCQm2Hx07yphcTZxm0E=wk9v6n+Y7oWmA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

sorry ignore the previous paste , was in complete

Here the full one. https://pastecode.io/s/hya0fyvo

On Thu, Aug 1, 2024 at 4:02 PM KK CHN <kkchn(dot)in(at)gmail(dot)com> wrote:

> The logs are here.
>
> https://pastecode.io/s/s5dp8ur1
>
>
>
> On Thu, Aug 1, 2024 at 3:30 PM Kashif Zeeshan <kashi(dot)zeeshan(at)gmail(dot)com>
> wrote:
>
>> Hi
>>
>> On Thu, Aug 1, 2024 at 2:54 PM KK CHN <kkchn(dot)in(at)gmail(dot)com> wrote:
>>
>>> List,
>>>
>>> *Not working (start EPAS server always fails):*
>>>
>>> 1. Testing PTR using PgBackRest(2.52.1) on RHEL9 EPAS-16, and RHEL9 (
>>> Repo Server)
>>>
>>> When I do a PTR
>>>
>>> 1. After doing a table drop and then
>>> 2. Noting down the time stamp and then
>>> 3. Taking an incremental backup in hope that If I do a restore from this
>>> incr Backup, that won't contain the dropped table.
>>> 4. Correct me if I am conceptually wrong here.
>>> 5. I am *never *successful in restoring the EPAS server in this
>>> scenario.
>>>
>>>
>>> *I know the following will work for me, w*hy not the above one if I
>>> really want that state of cluster also ?
>>>
>>> *This is Working. *
>>> 1. Create table
>>> 2. Noting down the timestamp
>>> 3. Taking incremental backup on RepoServer.
>>> 4. drop the created table .
>>> 5. Then stop the EPAS server and do a PTR, by the --set=step 3 incr
>>> backup and target= step 2 time stamp .. It finished the pgaback restore
>>> and promote command
>>> 6. I am able to start back the EPAS server and see the dropped table
>>> recovered there.
>>>
>>> But If I want a PTR as in the first section it fails.. Why ?
>>>
>>> Thank you,
>>> Krishane
>>>
>>>
>>>
>>>
>>>
>>> *What I have done and results obtained: *
>>>
>>> Created a table important_table4 in my EPAS and note down the time after
>>> creation of this table it is ( t1 : "01-AUG-24 14:08:32.447796+05:30" )
>>>
>>> Then I performed an Incremental backup (incr backup:
>>> 20240729-160137F_20240801-141148I )
>>> timestamp start/stop: 2024-08-01 14:11:48+05:30 / 2024-08-01
>>> 14:11:52+05:30
>>>
>>>
>>> Now I dropped the table table4 from the EPAS and noted down the time
>>>
>>>
>>> I want to restore the table4,, so I stopped EPAS and executed
>>>
>>> $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
>>> --set=20240729-160137F_20240801-141148I --target-timeline=current
>>> --type=time --target="01-AUG-24 14:08:32.447796+05:30"
>>> --target-action=promote restore
>>>
>>> IT WORKS AS EXPECTED .. after restarting the EPAS I am able to get the
>>> important_table4 back.
>>>
>>> root(at)service01 ~]# sudo -u enterprisedb psql edb
>>> psql (16.3.0)
>>> Type "help" for help.
>>>
>>> edb=# \dt
>>> List of relations
>>> Schema | Name | Type | Owner
>>> --------+------------------+-------+--------------
>>> public | important_table | table | enterprisedb
>>> public | important_table2 | table | enterprisedb
>>> public | important_table3 | table | enterprisedb
>>> public | important_table4 | table | enterprisedb
>>> (4 rows)
>>>
>>> SO all works fine !!!! .
>>>
>>>
>>> *But Now the PROBLEM Statement. *
>>>
>>> *1. I am dropping the table table 4 again *
>>> edb=# \q
>>> [root(at)service01 ~]# sudo -u enterprisedb psql -c "begin; drop table
>>> important_table4; commit;" edb
>>> BEGIN
>>> DROP TABLE
>>> COMMIT
>>> *2 . [root(at)service01 ~]#* sudo -u enterprisedb psql -Atc "select
>>> current_timestamp" edb 01-AUG-24 14:23:22.085076 +05:30
>>> Noting the time as : (01-AUG-24 14:23:22.085076 +05:30 )
>>>
>>> 3. Now I am performing an incremental backup after step 2 on REPO
>>> SErver ( Hoping that this latest INCR Backup is without dropped
>>> important_table4, so that a recovery of the cluster shouldn't show the
>>> table4 again. )
>>>
>>> incr backup details. : 20240729-160137F_20240801-142433I
>>> timestamp start/stop*: 2024-08-01 14:24:33+05:30 /
>>> 2024-08-01 14:24:36+05:30*
>>>
>>> 4. Now I want to test the database recovery after dropping the table4
>>> in step1 to verify that my EPAS restores from the backup in step 3 and time
>>> stamp (01-AUG-24 14:23:22.085076 +05:30, so that the restored EPAS
>>> cluster doesn't contain the important_table4.
>>>
>>> 5. $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
>>> --set=20240729-160137F_20240801-142433I --target-timeline=current
>>> --type=time --target="01-AUG-24 14:23:22.085076+05:30"
>>> --target-action=promote restore
>>> ------------
>>> -------------
>>> INFO: restore command end: completed successfully (1035ms)
>>>
>>> *ISSUE: I am unable to get the EPAS Server* in running state after
>>> step 5
>>>
>>> *What am I doing wrong ? OR am I conceptually wrong ?*
>>>
>>>
>>>
>>>
>>> OUTPUT on executing step 5.
>>>
>>> [root(at)service01 ~]# sudo -u enterprisedb pgbackrest --stanza=Demo_Repo
>>> --delta --set=20240729-160137F_20240801-142433I --target-timeline=current
>>> --type=time --target="01-AUG-24 14:23:22.085076+05:30"
>>> --target-action=promote restore
>>>
>>> 2024-08-01 14:30:03.535 P00 INFO: restore command begin 2.52.1:
>>> --delta --exec-id=82738-b5fe7415 --log-level-console=info
>>> --log-level-file=debug --pg1-path=/var/lib/edb/as16/data
>>> --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres
>>> --set=20240729-160137F_20240801-142433I --stanza=Demo_Repo
>>> --target="01-AUG-24 14:23:22.085076+05:30" --target-action=promote
>>> --target-timeline=current --type=time
>>> 2024-08-01 14:30:03.880 P00 INFO: repo1: restore backup set
>>> 20240729-160137F_20240801-142433I, recovery will start at 2024-08-01
>>> 14:24:33
>>> 2024-08-01 14:30:03.881 P00 INFO: remove invalid files/links/paths
>>> from '/var/lib/edb/as16/data'
>>> 2024-08-01 14:30:04.567 P00 INFO: write updated
>>> /var/lib/edb/as16/data/postgresql.auto.conf
>>> 2024-08-01 14:30:04.569 P00 INFO: restore global/pg_control (performed
>>> last to ensure aborted restores cannot be started)
>>> 2024-08-01 14:30:04.569 P00 INFO: restore size = 75.9MB, file total =
>>> 2171
>>> 2024-08-01 14:30:04.569 P00 INFO: restore command end: completed
>>> successfully (1035ms)
>>> *[root(at)service01 ~]# systemctl start edb-as-16.service*
>>>
>>> *Now If I check the server status : Its dead *
>>>
>>> [root(at)service01 ~]# systemctl status edb-as-16.service
>>> × edb-as-16.service - EDB Postgres Advanced Server 16
>>> Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled;
>>> preset: disabled)
>>> *Active: failed* (Result: exit-code) since Thu 2024-08-01 14:30:58
>>> IST; 4s ago
>>> Duration: 228ms
>>> Process: 82752 ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir
>>> ${PGDATA} (code=exited, status=0/SUCCESS)
>>> Process: 82757 ExecStart=/usr/edb/as16/bin/edb-postgres -D ${PGDATA}
>>> (code=exited, status=1/FAILURE)
>>> Main PID: 82757 (code=exited, status=1/FAILURE)
>>> CPU: 325ms
>>>
>>> Aug 01 14:30:56 service01 systemd[1]: Starting EDB Postgres Advanced
>>> Server 16...
>>> Aug 01 14:30:56 service01 edb-postgres[82757]: 2024-08-01 14:30:56 IST
>>> LOG: redirecting log output to logging collector process
>>> Aug 01 14:30:56 rservice01 edb-postgres[82757]: 2024-08-01 14:30:56 IST
>>> HINT: Future log output will appear in directory "log".
>>> Aug 01 14:30:58 service01 systemd[1]: Started EDB Postgres Advanced
>>> Server 16.
>>> Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Main process
>>> exited, code=exited, status=1/FAILURE
>>> Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Killing process
>>> 82758 (edb-postgres) with signal SIGKILL.
>>> A*ug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Failed with
>>> result 'exit-code'.*
>>> [root(at)service01 ~]#
>>>
>>> Any hints/guidance most welcome.
>>>
>>> Can you please share the DB Server log as it contains the exact error
>> which is causing the server not to start.
>>
>> Thanks
>>
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Piotr Andreassen Blasiak 2024-08-01 14:18:54 Re: Logical replication slots on slaves/replicas?
Previous Message Mateusz Henicz 2024-08-01 10:34:12 Re: PgBackRest PTR recovery: After table drop to get dropped state