| From: | hao harry <harry-hao(at)outlook(dot)com> | 
|---|---|
| To: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | Standby got invalid primary checkpoint after crashed right after promoted. | 
| Date: | 2022-03-16 07:16:16 | 
| Message-ID: | 9EB4CF63-1107-470E-B5A4-061FB9EF8CC8@outlook.com | 
| Lists: | pgsql-hackers | 
Hi, pgsql-hackers,
I think I found a case where the database becomes unrecoverable. Could you please take a look?
Here is how it happens:
- set up a primary and a standby
- run a lot of INSERTs on the primary
- create a checkpoint on the primary
- wait until the standby starts a restart point; it takes about 3 minutes to finish syncing buffers
- before the restart point updates ControlFile, promote the standby; this changes
  ControlFile->state to DB_IN_PRODUCTION, so the restart point skips its update of ControlFile,
  leaving ControlFile->checkPoint pointing to a file that has been removed
- before the promoted standby requests the post-recovery checkpoint (fast promotion),
  one backend crashes; this kills the other server processes, so the post-recovery checkpoint is skipped
- the database restarts the startup process, which reports: "could not locate a valid checkpoint record"
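The sequence above can be illustrated with a small toy model. This is only a sketch in Python, not PostgreSQL code: the class `ToyCluster`, its fields, and the use of integers as WAL segment numbers are all hypothetical names chosen for illustration. It also assumes, as the report implies, that the restart point removes old WAL whether or not it updated the control file.

```python
from dataclasses import dataclass, field

DB_IN_ARCHIVE_RECOVERY = "DB_IN_ARCHIVE_RECOVERY"
DB_IN_PRODUCTION = "DB_IN_PRODUCTION"


@dataclass
class ToyCluster:
    """Hypothetical, heavily simplified stand-in for a standby's on-disk state."""
    state: str = DB_IN_ARCHIVE_RECOVERY
    # Models ControlFile->checkPoint as the WAL segment holding the checkpoint.
    control_checkpoint: int = 1
    wal_segments: set = field(default_factory=lambda: {1, 2, 3})

    def promote(self):
        # Promotion changes the state before any post-recovery checkpoint runs.
        self.state = DB_IN_PRODUCTION

    def finish_restartpoint(self, new_checkpoint):
        # Assumption taken from the report: the control file is updated only
        # while the server is still in archive recovery ...
        if self.state == DB_IN_ARCHIVE_RECOVERY:
            self.control_checkpoint = new_checkpoint
        # ... but WAL older than the new restart point is removed either way.
        self.wal_segments = {s for s in self.wal_segments if s >= new_checkpoint}

    def crash_recovery_can_start(self):
        # Startup needs the segment that holds the recorded checkpoint.
        return self.control_checkpoint in self.wal_segments


def run(promote_during_restartpoint):
    cluster = ToyCluster()
    if promote_during_restartpoint:
        cluster.promote()  # promotion lands before the restart point finishes
    cluster.finish_restartpoint(new_checkpoint=3)
    # A crash here means no post-recovery checkpoint repairs the control file.
    return cluster.crash_recovery_can_start()


print(run(promote_during_restartpoint=False))  # True
print(run(promote_during_restartpoint=True))   # False
```

In the second run the recorded checkpoint still refers to segment 1, which no longer exists, which corresponds to the "could not locate a valid checkpoint record" failure.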
I attached a test to reproduce it. It does not fail every time; for me it fails about once in every 10 runs.
Patch 0001 is needed to increase the chance that CreateRestartPoint skips the ControlFile update
and to simulate a crash.
Best regards,
Harry Hao
| Attachment | Content-Type | Size | 
|---|---|---|
| 0001-Patched-CreateRestartPoint-to-reproduce-invalid-chec.patch | application/octet-stream | 2.6 KB | 
| reprod_crash_right_after_promoted.pl | text/x-perl-script | 2.2 KB | 