
Re: Incorrect snapshots while promoting hot standby node when 2PC is used

From:	Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject:	Re: Incorrect snapshots while promoting hot standby node when 2PC is used
Date:	2021-05-04 06:58:36
Message-ID:	53E8323F-DC83-48D1-862B-72742F4BFC6C@yandex-team.ru
Lists:	pgsql-hackers

> On 3 May 2021, at 23:10, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2021-05-01 17:35:09 +0500, Andrey Borodin wrote:
>> I'm investigating a somewhat similar case.
>> We have an OLTP sharded installation where shards are almost always under rebalancing. Data movement is implemented with 2pc.
>> Switchover happens quite often due to datacenter drills. The installation is running on PostgreSQL 12.6.
>
> If you still have the data it would be useful if you could check if the
> LSNs of the corrupted pages are LSNs from shortly after standby
> promotion/switchover?
That's a neat idea, I'll check whether I can restore a backup that contains the corruption.
I have a test cluster with corruption, but it has been through dozens of switchovers...

>> Or, perhaps, it looks more like a hardware problem? Data_checksums are
>> on, but a few years ago we observed SSD firmware that was losing
>> updates while still passing checksums. I'm sure that we would benefit
>> from having a separate relation fork for checksums or LSNs.
>
> Right - checksums are "page local". They can only detect if a page is
> corrupted, not if e.g. an older version of the page (with correct
> checksum) has been restored. While there are schemes to have stronger
> error detection properties, they do come with substantial overhead (at
> least the ones I can think of right now).

We can have a PTRACK-like fork with page LSNs. It can be flushed at checkpoint and restored from WAL after a crash, so we can always detect a stale page version. Like LSN-track :) We would also get much faster rewind and delta backups for free.
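
To illustrate the idea, here is a rough sketch in Python rather than anything resembling real PostgreSQL code. Everything in it is an assumption made for illustration: the `Page` and `LsnMap` names are hypothetical, CRC32 stands in for the real page checksum, and the persistence of the map (flush at checkpoint, rebuild from WAL) is not modelled at all. It only shows why a page-local checksum accepts a lost update while a separate per-block LSN record would flag it.

```python
import zlib
from dataclasses import dataclass


def page_checksum(lsn: int, payload: bytes) -> int:
    # Stand-in for a page-local checksum; CRC32 is used only for illustration.
    return zlib.crc32(lsn.to_bytes(8, "little") + payload)


@dataclass
class Page:
    lsn: int        # LSN of the last change applied to this page
    payload: bytes  # page contents
    checksum: int   # covers only this page's own lsn and payload


def make_page(lsn: int, payload: bytes) -> Page:
    return Page(lsn, payload, page_checksum(lsn, payload))


def checksum_ok(page: Page) -> bool:
    # A stale page still passes: its checksum matches its own (old) contents.
    return page.checksum == page_checksum(page.lsn, page.payload)


class LsnMap:
    """Hypothetical side structure: block number -> newest LSN written."""

    def __init__(self) -> None:
        self._newest: dict[int, int] = {}

    def record_write(self, blkno: int, lsn: int) -> None:
        self._newest[blkno] = max(lsn, self._newest.get(blkno, 0))

    def is_stale(self, blkno: int, page: Page) -> bool:
        # The page is stale if the map has seen a newer LSN for this block.
        return page.lsn < self._newest.get(blkno, 0)


# Two versions of block 0 are written; the storage silently drops the second.
lsn_map = LsnMap()
old_version = make_page(100, b"v1")
new_version = make_page(200, b"v2")
lsn_map.record_write(0, old_version.lsn)
lsn_map.record_write(0, new_version.lsn)

page_read_back = old_version  # the lost update: the old page comes back
print(checksum_ok(page_read_back))          # checksum alone sees no problem
print(lsn_map.is_stale(0, page_read_back))  # the LSN map flags the page
```

The same map is what would let rewind or a delta backup skip blocks whose recorded LSN is older than some starting point, but that part is not shown here.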

Though I don't think it is worth the effort until we at least checksum CLOG.

Thanks!

Best regards, Andrey Borodin.

In response to

Re: Incorrect snapshots while promoting hot standby node when 2PC is used at 2021-05-03 18:10:50 from Andres Freund
