Re: BUG #18025: Probably we need to change behaviour of the checkpoint failures in PG

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: hargudekishor(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18025: Probably we need to change behaviour of the checkpoint failures in PG
Date: 2023-07-17 07:53:32
Message-ID: ca4cdccd0b085cbbf10577be4cb05eae4c35a141.camel@cybertec.at
Lists: pgsql-bugs

On Mon, 2023-07-17 at 05:03 +0000, PG Bug reporting form wrote:
> I have just witnessed the data loss scenario.
>
> Scenario is like, there was checkpoint operation failures going on the DB
> server since last 8 hours which means no successful checkpoint happened in
> the DB server since last 8 hours. Then DB server went into the crash mode
> due to the exhausted disk space and did not came up as part of crash
> recovery.

Mistake #1: you did not monitor disk space.

> Actually the victim had moved few WALs from the pg_wal to other
> location and reimporting those wal on original location also did not solved
> the problem.

Mistake #2: manually messing with the database directory.

> DB server was not able to find out the valid checkpoint record.
> The victim was not having the backup which he could use that backup to
> recover the data with the help of available archived WALs.

Mistake #0: no backup.

> So , the victim
> had only one option left in his hand that is pg_resetwal.  We have tried
> every possible solution but did not worked so we did not left with more
> choices other than pg_Resetwal

Mistake #3: running pg_resetwal.

"We have tried every possible solution" sounds a bit like "we tried all the
haphazard things that came to our mind".

Sorry, this is not a bug, this is pilot error.

If PostgreSQL crashes because "pg_wal" runs out of disk space, you increase
the disk space, start PostgreSQL and let it complete crash recovery. It is
as simple as that.
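
For the monitoring side (mistake #1), here is a minimal sketch of a free-space
check on the file system that holds "pg_wal". The function names, the 20%
threshold and the command-line handling are assumptions made for illustration,
not anything PostgreSQL ships; a real setup would hook such a check into
whatever monitoring system is already in use.

```python
import shutil
import sys


def free_fraction(path: str) -> float:
    """Return the fraction (0.0 to 1.0) of free space on the file system holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo file systems report a size of zero; treat them as full.
        return 0.0
    return usage.free / usage.total


def has_headroom(path: str, min_free_fraction: float = 0.2) -> bool:
    """True if at least `min_free_fraction` of the file system is still free.

    The 0.2 default is an arbitrary example threshold, not a recommendation.
    """
    return free_fraction(path) >= min_free_fraction


if __name__ == "__main__":
    # Hypothetical usage: pass the pg_wal directory as the first argument.
    wal_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    fraction = free_fraction(wal_dir)
    print(f"{wal_dir}: {fraction:.1%} free")
    # A non-zero exit status lets cron or a monitoring agent raise an alert.
    sys.exit(0 if has_headroom(wal_dir) else 1)
```

Run regularly against the "pg_wal" directory, a check like this raises the
alarm long before the disk is full, while there is still time to find out
why WAL is piling up.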

Yours,
Laurenz Albe
