broken reading on standby (PostgreSQL 16.2)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: broken reading on standby (PostgreSQL 16.2)
Date: 2024-04-25 06:12:17
Message-ID: CAFj8pRBEFMxxFSCVOSi-4n0jHzSaxh6Ze_cZid5eG=tsnn49-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

yesterday, I had to fix strange issue on standby server

The query to freshly updated data fails

select * from seller_success_rate where create_time::date = '2024-04-23';
ERROR: 58P01: could not access status of transaction 1393466389
DETAIL: Could not open file "pg_xact/0530": No such file or directory.
LOCATION: SlruReportIOError, slru.c:947

amcheck

select * from verify_heapam('seller_success_rate');
blkno | offnum | attnum | msg

-------+--------+--------+-------------------------------------------------------------------
5763 | 111 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
5863 | 109 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
5863 | 110 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
5868 | 110 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
5868 | 111 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
5875 | 111 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
5895 | 109 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
5895 | 110 | | xmin 1439564642 precedes oldest valid
transaction ID 3:1523885078
6245 | 108 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
6245 | 109 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
6245 | 110 | | xmin 1439564642 precedes oldest valid
transaction ID 3:1523885078
6245 | 112 | | xmin 1424677216 precedes oldest valid
transaction ID 3:1523885078
6378 | 109 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
6378 | 110 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
6382 | 110 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
6590 | 110 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
6590 | 111 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
7578 | 112 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
7581 | 112 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
8390 | 112 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
10598 | 109 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078
10598 | 110 | | xmin 1393466389 precedes oldest valid
transaction ID 3:1523885078

I verified xmin against the primary server, and it was the same. There was
not any replication gap.

I checked the fields from pg_database table, and looks same too

These rows were valid (and visible) on primary.

On this server there was not any long session (when I was connected),
unfortunately I cannot test restart of this server. One wal sender is
executing on standby. Fortunately, there was a possibility to run VACUUM
FULL, and it fixed the issue.

The customer has archived wals.

My question - is it possible to do some diagnostics from SQL level? I
didn't find a way to get values that are used for comparison by amcheck
from SQL.

Regards

Pavel

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey M. Borodin 2024-04-25 06:52:41 Re: broken reading on standby (PostgreSQL 16.2)
Previous Message Alexander Lakhin 2024-04-25 06:00:00 Re: Avoid orphaned objects dependencies, take 3