Re: BUG #17268: Possible corruption in toast index after reindex index concurrently

From: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Alexey Ermakov <alexey(dot)ermakov(at)dataegret(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: BUG #17268: Possible corruption in toast index after reindex index concurrently
Date: 2021-11-04 19:47:20
Message-ID: CAK-MWwTFEPE+9VJVh_T4HQa5ZH28MgOfHiznk05-Zj+ZnZn2RQ@mail.gmail.com
Lists: pgsql-bugs

On Thu, Nov 4, 2021 at 8:18 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Thu, Nov 4, 2021 at 11:08 AM Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com> wrote:
> > select bt_index_check('pg_toast.pg_toast_2624976286_index', true);
> > DEBUG: verifying consistency of tree structure for index
> > "pg_toast_2624976286_index"
> > DEBUG: verifying level 3 (true root level)
> > DEBUG: verifying level 2
> > DEBUG: verifying level 1
> > DEBUG: verifying level 0 (leaf level)
> > DEBUG: leaf block 715360 of index "pg_toast_2624976286_index" has no
> > first data item
> > DEBUG: verifying that tuples from index "pg_toast_2624976286_index"
> > are present in "pg_toast_2624976286"
> > ERROR: heap tuple (59561917,1) from table "pg_toast_2624976286" lacks
> > matching index tuple within index "pg_toast_2624976286_index"
> > HINT: Retrying verification using the function
> > bt_index_parent_check() might provide a more specific error.
>
> That's an unusually large TOAST table. It's at least ~454.42GiB, based
> on this error. Is the block number 59561917 near the end of the table?

select pg_size_pretty(pg_relation_size('pg_toast.pg_toast_2624976286'));
pg_size_pretty
----------------
473 GB
That is the size now. And yes, at the time of the error, block 59561917
was very close to the end of the table.
It is very likely (though not 100% certain) that the corresponding main
table row was inserted while REINDEX INDEX CONCURRENTLY was running on
the toast index.
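
For reference, the ~454.42GiB lower bound quoted above follows directly
from the block number. A minimal sketch of that arithmetic, assuming the
default 8 kB block size and that the "GB" printed by pg_size_pretty is
1024-based (the helper names are mine, for illustration only):

```python
# Assumption: default PostgreSQL block size (BLCKSZ) of 8192 bytes.
BLOCK_SIZE = 8192
GIB = 2**30


def min_relation_size_gib(block_number, block_size=BLOCK_SIZE):
    """Smallest relation size (GiB) that can contain a 0-based block number."""
    return (block_number + 1) * block_size / GIB


def block_position(block_number, relation_size_gib, block_size=BLOCK_SIZE):
    """Fraction of the way through a relation of the given size (GiB)."""
    total_blocks = relation_size_gib * GIB / block_size
    return block_number / total_blocks


if __name__ == "__main__":
    # Block number taken from the bt_index_check error above.
    print(round(min_relation_size_gib(59561917), 2))
    # 473 is the rounded size reported now, so this is only approximate.
    print(round(block_position(59561917, 473), 3))
```

The second number is only a rough position, since 473 GB is the rounded
current size and the table has kept growing since the error.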

We have a base backup and a WAL archive, so in theory it is possible to
reconstruct the sequence of writes that led to the error.
But given the huge size of the relation in question (and the even bigger
size of the whole database, 10+ TB) and the large volume of writes, it's
a complicated task (especially since I'm not really sure what exactly to
look for in the pg_waldump output).

--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone RU: +7 985 433 0000
Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Doctor, you advised me not to do that, but why does it still hurt
when I do it again?"
