Re: PgBackRest PTR recovery: After table drop to get dropped state

From: KK CHN <kkchn(dot)in(at)gmail(dot)com>
To: Mateusz Henicz <mateuszhenicz(at)gmail(dot)com>
Cc: Kashif Zeeshan <kashi(dot)zeeshan(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: PgBackRest PTR recovery: After table drop to get dropped state
Date: 2024-08-02 08:37:38
Message-ID: CAKgGyB-oSj_L-8EgfnN7Gfi-SPZXY2ZypO2QBb353U=RoNT2ZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thank you for shedding light on this.

On Thu, Aug 1, 2024 at 4:04 PM Mateusz Henicz <mateuszhenicz(at)gmail(dot)com>
wrote:

> When you are performing PITR you need to configure a timestamp before your
> last committed transaction. In your case you provided timestamp after your
> last commit.
>
>

I am trying to understand this concept and correct me if I am wrong
If I do

STEPS:
1. Noting down the time stamp just before my table drop, then
2. Drop a table, then
3. Taking the incremental backup after this table drop on Repo server, Then
4. If I do a PITR, with --target = timestamp noted in step1 and --set
= the incr backup taken in step3

Then if perform a PITR restore by pgbackrest in step4 will it succeed or
fail theoretically ?

Thanks,
Krishane

When postgtes is restoring until a specified point, it restores a
> transaction from WAL, and checking if next transaction is before or after
> said timestamp. If it is before it will replay it and check next
> transaction. Until next transaction is after configured timestamp.
> If there is no transaction after your current timestamp in current WAL,
> postgres will try to restore next WAL from archive. And since there is no
> next WAL, and your timestamp is past latest committed transaction, it is
> unable to continue, because it does not know if there should be any other
> transaction replayed or not.
>
> Just perform some other actions after you note down the timestamp after
> drop table. Create another one, insert some data, do whatever to have
> another transaction in WALs.
>
> Cheers,
> Mateusz
>
> czw., 1 sie 2024 o 12:23 KK CHN <kkchn(dot)in(at)gmail(dot)com> napisał(a):
>
>> The logs are here.
>>
>> https://pastecode.io/s/s5dp8ur1
>>
>>
>>
>> On Thu, Aug 1, 2024 at 3:30 PM Kashif Zeeshan <kashi(dot)zeeshan(at)gmail(dot)com>
>> wrote:
>>
>>> Hi
>>>
>>> On Thu, Aug 1, 2024 at 2:54 PM KK CHN <kkchn(dot)in(at)gmail(dot)com> wrote:
>>>
>>>> List,
>>>>
>>>> *Not working (start EPAS server always fails):*
>>>>
>>>> 1. Testing PTR using PgBackRest(2.52.1) on RHEL9 EPAS-16, and RHEL9
>>>> ( Repo Server)
>>>>
>>>> When I do a PTR
>>>>
>>>> 1. After doing a table drop and then
>>>> 2. Noting down the time stamp and then
>>>> 3. Taking an incremental backup in hope that If I do a restore from
>>>> this incr Backup, that won't contain the dropped table.
>>>> 4. Correct me if I am conceptually wrong here.
>>>> 5. I am *never *successful in restoring the EPAS server in this
>>>> scenario.
>>>>
>>>>
>>>> *I know the following will work for me, w*hy not the above one if I
>>>> really want that state of cluster also ?
>>>>
>>>> *This is Working. *
>>>> 1. Create table
>>>> 2. Noting down the timestamp
>>>> 3. Taking incremental backup on RepoServer.
>>>> 4. drop the created table .
>>>> 5. Then stop the EPAS server and do a PTR, by the --set=step 3 incr
>>>> backup and target= step 2 time stamp .. It finished the pgaback restore
>>>> and promote command
>>>> 6. I am able to start back the EPAS server and see the dropped table
>>>> recovered there.
>>>>
>>>> But If I want a PTR as in the first section it fails.. Why ?
>>>>
>>>> Thank you,
>>>> Krishane
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *What I have done and results obtained: *
>>>>
>>>> Created a table important_table4 in my EPAS and note down the time
>>>> after creation of this table it is ( t1 : "01-AUG-24
>>>> 14:08:32.447796+05:30" )
>>>>
>>>> Then I performed an Incremental backup (incr backup:
>>>> 20240729-160137F_20240801-141148I )
>>>> timestamp start/stop: 2024-08-01 14:11:48+05:30 / 2024-08-01
>>>> 14:11:52+05:30
>>>>
>>>>
>>>> Now I dropped the table table4 from the EPAS and noted down the time
>>>>
>>>>
>>>> I want to restore the table4,, so I stopped EPAS and executed
>>>>
>>>> $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
>>>> --set=20240729-160137F_20240801-141148I --target-timeline=current
>>>> --type=time --target="01-AUG-24 14:08:32.447796+05:30"
>>>> --target-action=promote restore
>>>>
>>>> IT WORKS AS EXPECTED .. after restarting the EPAS I am able to get the
>>>> important_table4 back.
>>>>
>>>> root(at)service01 ~]# sudo -u enterprisedb psql edb
>>>> psql (16.3.0)
>>>> Type "help" for help.
>>>>
>>>> edb=# \dt
>>>> List of relations
>>>> Schema | Name | Type | Owner
>>>> --------+------------------+-------+--------------
>>>> public | important_table | table | enterprisedb
>>>> public | important_table2 | table | enterprisedb
>>>> public | important_table3 | table | enterprisedb
>>>> public | important_table4 | table | enterprisedb
>>>> (4 rows)
>>>>
>>>> SO all works fine !!!! .
>>>>
>>>>
>>>> *But Now the PROBLEM Statement. *
>>>>
>>>> *1. I am dropping the table table 4 again *
>>>> edb=# \q
>>>> [root(at)service01 ~]# sudo -u enterprisedb psql -c "begin; drop table
>>>> important_table4; commit;" edb
>>>> BEGIN
>>>> DROP TABLE
>>>> COMMIT
>>>> *2 . [root(at)service01 ~]#* sudo -u enterprisedb psql -Atc "select
>>>> current_timestamp" edb 01-AUG-24 14:23:22.085076 +05:30
>>>> Noting the time as : (01-AUG-24 14:23:22.085076 +05:30 )
>>>>
>>>> 3. Now I am performing an incremental backup after step 2 on REPO
>>>> SErver ( Hoping that this latest INCR Backup is without dropped
>>>> important_table4, so that a recovery of the cluster shouldn't show the
>>>> table4 again. )
>>>>
>>>> incr backup details. : 20240729-160137F_20240801-142433I
>>>> timestamp start/stop*: 2024-08-01 14:24:33+05:30 /
>>>> 2024-08-01 14:24:36+05:30*
>>>>
>>>> 4. Now I want to test the database recovery after dropping the table4
>>>> in step1 to verify that my EPAS restores from the backup in step 3 and time
>>>> stamp (01-AUG-24 14:23:22.085076 +05:30, so that the restored EPAS
>>>> cluster doesn't contain the important_table4.
>>>>
>>>> 5. $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
>>>> --set=20240729-160137F_20240801-142433I --target-timeline=current
>>>> --type=time --target="01-AUG-24 14:23:22.085076+05:30"
>>>> --target-action=promote restore
>>>> ------------
>>>> -------------
>>>> INFO: restore command end: completed successfully (1035ms)
>>>>
>>>> *ISSUE: I am unable to get the EPAS Server* in running state after
>>>> step 5
>>>>
>>>> *What am I doing wrong ? OR am I conceptually wrong ?*
>>>>
>>>>
>>>>
>>>>
>>>> OUTPUT on executing step 5.
>>>>
>>>> [root(at)service01 ~]# sudo -u enterprisedb pgbackrest --stanza=Demo_Repo
>>>> --delta --set=20240729-160137F_20240801-142433I --target-timeline=current
>>>> --type=time --target="01-AUG-24 14:23:22.085076+05:30"
>>>> --target-action=promote restore
>>>>
>>>> 2024-08-01 14:30:03.535 P00 INFO: restore command begin 2.52.1:
>>>> --delta --exec-id=82738-b5fe7415 --log-level-console=info
>>>> --log-level-file=debug --pg1-path=/var/lib/edb/as16/data
>>>> --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres
>>>> --set=20240729-160137F_20240801-142433I --stanza=Demo_Repo
>>>> --target="01-AUG-24 14:23:22.085076+05:30" --target-action=promote
>>>> --target-timeline=current --type=time
>>>> 2024-08-01 14:30:03.880 P00 INFO: repo1: restore backup set
>>>> 20240729-160137F_20240801-142433I, recovery will start at 2024-08-01
>>>> 14:24:33
>>>> 2024-08-01 14:30:03.881 P00 INFO: remove invalid files/links/paths
>>>> from '/var/lib/edb/as16/data'
>>>> 2024-08-01 14:30:04.567 P00 INFO: write updated
>>>> /var/lib/edb/as16/data/postgresql.auto.conf
>>>> 2024-08-01 14:30:04.569 P00 INFO: restore global/pg_control
>>>> (performed last to ensure aborted restores cannot be started)
>>>> 2024-08-01 14:30:04.569 P00 INFO: restore size = 75.9MB, file total =
>>>> 2171
>>>> 2024-08-01 14:30:04.569 P00 INFO: restore command end: completed
>>>> successfully (1035ms)
>>>> *[root(at)service01 ~]# systemctl start edb-as-16.service*
>>>>
>>>> *Now If I check the server status : Its dead *
>>>>
>>>> [root(at)service01 ~]# systemctl status edb-as-16.service
>>>> × edb-as-16.service - EDB Postgres Advanced Server 16
>>>> Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled;
>>>> preset: disabled)
>>>> *Active: failed* (Result: exit-code) since Thu 2024-08-01
>>>> 14:30:58 IST; 4s ago
>>>> Duration: 228ms
>>>> Process: 82752
>>>> ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir ${PGDATA}
>>>> (code=exited, status=0/SUCCESS)
>>>> Process: 82757 ExecStart=/usr/edb/as16/bin/edb-postgres -D
>>>> ${PGDATA} (code=exited, status=1/FAILURE)
>>>> Main PID: 82757 (code=exited, status=1/FAILURE)
>>>> CPU: 325ms
>>>>
>>>> Aug 01 14:30:56 service01 systemd[1]: Starting EDB Postgres Advanced
>>>> Server 16...
>>>> Aug 01 14:30:56 service01 edb-postgres[82757]: 2024-08-01 14:30:56 IST
>>>> LOG: redirecting log output to logging collector process
>>>> Aug 01 14:30:56 rservice01 edb-postgres[82757]: 2024-08-01 14:30:56 IST
>>>> HINT: Future log output will appear in directory "log".
>>>> Aug 01 14:30:58 service01 systemd[1]: Started EDB Postgres Advanced
>>>> Server 16.
>>>> Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Main process
>>>> exited, code=exited, status=1/FAILURE
>>>> Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Killing
>>>> process 82758 (edb-postgres) with signal SIGKILL.
>>>> A*ug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Failed with
>>>> result 'exit-code'.*
>>>> [root(at)service01 ~]#
>>>>
>>>> Any hints/guidance most welcome.
>>>>
>>>> Can you please share the DB Server log as it contains the exact error
>>> which is causing the server not to start.
>>>
>>> Thanks
>>>
>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kashif Zeeshan 2024-08-02 09:13:32 Re: PgBackRest PTR recovery: After table drop to get dropped state
Previous Message Vince McMahon 2024-08-02 07:56:39 Impact from removal of pgadmin4