| From: | Kashif Zeeshan <kashi(dot)zeeshan(at)gmail(dot)com> | 
|---|---|
| To: | KK CHN <kkchn(dot)in(at)gmail(dot)com> | 
| Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> | 
| Subject: | Re: PgBackRest PTR recovery: After table drop to get dropped state | 
| Date: | 2024-08-01 10:00:16 | 
| Message-ID: | CAAPsdhcBz9PSbrD4QY0taouACMBqSZcWZZFoo5tY1OUxp10bww@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general | 
Hi
On Thu, Aug 1, 2024 at 2:54 PM KK CHN <kkchn(dot)in(at)gmail(dot)com> wrote:
> List,
>
> *Not working (start EPAS server always fails):*
>
> 1. Testing PTR using  PgBackRest(2.52.1)  on RHEL9  EPAS-16, and RHEL9 (
> Repo       Server)
>
>   When I do a PTR
>
> 1.  After doing a table drop and then
> 2. Noting down the time stamp and then
> 3. Taking an incremental backup in hope that If I do a restore from this
> incr Backup, that won't  contain the  dropped table.
> 4. Correct me  if I am  conceptually wrong here.
> 5.  I am *never *successful in restoring the EPAS server in this scenario.
>
>
> *I know the following will work for me, w*hy not the above one if I
> really want that state of cluster also  ?
>
> *This is Working. *
>  1. Create table
> 2. Noting down the timestamp
> 3.  Taking incremental backup on RepoServer.
> 4. drop the created table .
> 5. Then stop the EPAS server and do a  PTR, by the  --set=step 3 incr
> backup  and target= step 2 time stamp .. It finished the pgaback restore
> and promote command
> 6. I am able to start back the  EPAS server and see the dropped table
> recovered there.
>
> But If I want a PTR as in the first section it fails.. Why ?
>
> Thank you,
> Krishane
>
>
>
>
>
> *What I have done and results obtained: *
>
> Created a table important_table4 in my EPAS and note down the time after
> creation of this table it is  ( t1 :  "01-AUG-24 14:08:32.447796+05:30" )
>
> Then I performed an Incremental backup   (incr backup:
> 20240729-160137F_20240801-141148I )
> timestamp start/stop: 2024-08-01 14:11:48+05:30 / 2024-08-01 14:11:52+05:30
>
>
> Now I dropped the table table4 from the EPAS and noted down the time
>
>
> I want to  restore the table4,, so I stopped EPAS and executed
>
> $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
> --set=20240729-160137F_20240801-141148I  --target-timeline=current
> --type=time  --target="01-AUG-24 14:08:32.447796+05:30"
> --target-action=promote restore
>
> IT WORKS AS EXPECTED .. after restarting the EPAS I am able to get the
> important_table4 back.
>
> root(at)service01 ~]# sudo -u enterprisedb psql edb
> psql (16.3.0)
> Type "help" for help.
>
> edb=# \dt
>                 List of relations
>  Schema |       Name       | Type  |    Owner
> --------+------------------+-------+--------------
>  public | important_table  | table | enterprisedb
>  public | important_table2 | table | enterprisedb
>  public | important_table3 | table | enterprisedb
>  public | important_table4 | table | enterprisedb
> (4 rows)
>
> SO all works fine !!!! .
>
>
> *But Now the PROBLEM Statement. *
>
> *1. I am dropping the table table 4 again *
> edb=# \q
> [root(at)service01 ~]# sudo -u enterprisedb psql -c "begin; drop table
> important_table4; commit;" edb
> BEGIN
> DROP TABLE
> COMMIT
> *2 .  [root(at)service01 ~]#* sudo -u enterprisedb psql -Atc "select
> current_timestamp" edb  01-AUG-24 14:23:22.085076 +05:30
> Noting the time as :   (01-AUG-24 14:23:22.085076 +05:30 )
>
> 3. Now  I am performing an incremental backup after step 2  on REPO SErver
> ( Hoping that this latest INCR Backup is without dropped important_table4,
> so that a recovery of the cluster  shouldn't show the table4 again. )
>
>         incr backup details. : 20240729-160137F_20240801-142433I
>             timestamp start/stop*: 2024-08-01 14:24:33+05:30 / 2024-08-01
> 14:24:36+05:30*
>
> 4. Now I want to test the database recovery  after dropping the table4 in
> step1 to verify that my EPAS restores from the backup in step 3 and time
> stamp (01-AUG-24 14:23:22.085076 +05:30,   so that  the restored EPAS
> cluster doesn't contain the important_table4.
>
> 5.  $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
>  --set=20240729-160137F_20240801-142433I  --target-timeline=current
> --type=time  --target="01-AUG-24 14:23:22.085076+05:30"
> --target-action=promote restore
>  ------------
> -------------
> INFO: restore command end: completed successfully (1035ms)
>
> *ISSUE:    I am unable to get the EPAS Server* in running state after
> step 5
>
>  *What am I doing wrong ?  OR am I conceptually wrong ?*
>
>
>
>
> OUTPUT on executing step 5.
>
> [root(at)service01 ~]# sudo -u enterprisedb pgbackrest --stanza=Demo_Repo
> --delta --set=20240729-160137F_20240801-142433I  --target-timeline=current
> --type=time  --target="01-AUG-24 14:23:22.085076+05:30"
> --target-action=promote restore
>
> 2024-08-01 14:30:03.535 P00   INFO: restore command begin 2.52.1: --delta
> --exec-id=82738-b5fe7415 --log-level-console=info --log-level-file=debug
> --pg1-path=/var/lib/edb/as16/data --pg-version-force=16
> --repo1-host=10.10.20.7 --repo1-host-user=postgres
> --set=20240729-160137F_20240801-142433I --stanza=Demo_Repo
> --target="01-AUG-24 14:23:22.085076+05:30" --target-action=promote
> --target-timeline=current --type=time
> 2024-08-01 14:30:03.880 P00   INFO: repo1: restore backup set
> 20240729-160137F_20240801-142433I, recovery will start at 2024-08-01
> 14:24:33
> 2024-08-01 14:30:03.881 P00   INFO: remove invalid files/links/paths from
> '/var/lib/edb/as16/data'
> 2024-08-01 14:30:04.567 P00   INFO: write updated
> /var/lib/edb/as16/data/postgresql.auto.conf
> 2024-08-01 14:30:04.569 P00   INFO: restore global/pg_control (performed
> last to ensure aborted restores cannot be started)
> 2024-08-01 14:30:04.569 P00   INFO: restore size = 75.9MB, file total =
> 2171
> 2024-08-01 14:30:04.569 P00   INFO: restore command end: completed
> successfully (1035ms)
> *[root(at)service01 ~]# systemctl  start edb-as-16.service*
>
> *Now  If I check the server  status  :   Its dead *
>
> [root(at)service01 ~]# systemctl  status edb-as-16.service
> × edb-as-16.service - EDB Postgres Advanced Server 16
>      Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled;
> preset: disabled)
>      *Active: failed* (Result: exit-code) since Thu 2024-08-01 14:30:58
> IST; 4s ago
>    Duration: 228ms
>     Process: 82752 ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir
> ${PGDATA} (code=exited, status=0/SUCCESS)
>     Process: 82757 ExecStart=/usr/edb/as16/bin/edb-postgres -D ${PGDATA}
> (code=exited, status=1/FAILURE)
>    Main PID: 82757 (code=exited, status=1/FAILURE)
>         CPU: 325ms
>
> Aug 01 14:30:56 service01 systemd[1]: Starting EDB Postgres Advanced
> Server 16...
> Aug 01 14:30:56 service01 edb-postgres[82757]: 2024-08-01 14:30:56 IST
> LOG:  redirecting log output to logging collector process
> Aug 01 14:30:56 rservice01 edb-postgres[82757]: 2024-08-01 14:30:56 IST
> HINT:  Future log output will appear in directory "log".
> Aug 01 14:30:58 service01 systemd[1]: Started EDB Postgres Advanced Server
> 16.
> Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Main process
> exited, code=exited, status=1/FAILURE
> Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Killing process
> 82758 (edb-postgres) with signal SIGKILL.
> A*ug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Failed with
> result 'exit-code'.*
> [root(at)service01 ~]#
>
> Any hints/guidance most welcome.
>
> Can you please share the DB Server log as it contains the exact error
which is causing the server not to start.
Thanks
| From | Date | Subject | |
|---|---|---|---|
| Next Message | KK CHN | 2024-08-01 10:02:46 | PgBackRest PTR recovery: After table drop to get dropped state | 
| Previous Message | shammat | 2024-08-01 09:05:09 | Re: Logical replication slots on slaves/replicas? |