Re: "PANIC: could not open critical system index 2662" - twice

From: Kirk Wolak <wolakk(at)gmail(dot)com>
To: Evgeny Morozov <postgresql3(at)realityexists(dot)net>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: "PANIC: could not open critical system index 2662" - twice
Date: 2023-05-11 23:00:03
Message-ID: CACLU5mQKBMme_cx2NzNmwFHPr8APjWfrNubM3=9wxA8wnsbjeQ@mail.gmail.com
Lists: pgsql-general

On Wed, May 10, 2023 at 9:32 AM Evgeny Morozov <
postgresql3(at)realityexists(dot)net> wrote:

> On 10/05/2023 6:39 am, Kirk Wolak wrote:
>
> It could be as simple as creating temp tables in the other database (since
> I believe pg_class was hit).
>
> We do indeed create temp tables, both in other databases and in the ones
> being tested. (We also create non-temp tables there.)
>
>
> Also, not sure if the OP has a set of things done after he creates the DB
> that may help?
>
> Basically we read rows from the source database, create some partitions of
> tables in the target database, insert into a temp table there using BULK
> COPY, then using a regular INSERT copy from the temp tables to the new
> partitions.
>
>
> Now that the problem has been reproduced and understood by the PG
> developers, could anyone explain why PG crashed entirely with the "PANIC"
> error back in April when only specific databases were corrupted, not any
> global objects necessary for PG to run? And why did it not crash with the
> "PANIC" on this occasion?
>
I understand the question as:
Why would it PANIC on non-system data corruption, but not on system data
corruption?

To which my guess is:
Because system data corruption at startup is probably an anticipated case,
and we want to report it and come up as far as possible.
Whereas the OTHER code did a PANIC simply because the error was BOTH
unexpected and NOT in a place where the code could move forward.
Meaning it had no idea if it read in bad data, or if it CREATED the bad
data.

As a programmer, you will find much more robust checking in startup code
than in code that is in the middle of doing something else.

But just a guess. Someone deeper into the code might explain it better.
And you COULD go dig through the source to compare where the two error
messages originate?

Kirk...
