Re: Corruption during WAL replay

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, deniel1495(at)mail(dot)ru, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, tejeswarm(at)hotmail(dot)com, hlinnaka <hlinnaka(at)iki(dot)fi>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Daniel Wood <hexexpert(at)comcast(dot)net>
Subject:	Re: Corruption during WAL replay
Date:	2022-03-25 13:49:34
Message-ID:	3237467.1648216174@sss.pgh.pa.us
Lists:	pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2022-03-25 01:38:45 -0400, Tom Lane wrote:
>> AFAICS, this strategy of whacking a predetermined chunk of the page with
>> a predetermined value is going to fail 1-out-of-64K times.

> Yea. I suspect that, given the way the modifications and checksumming are
> done, the chance is actually higher than 1/64k. But even if it actually is
> 1/64k, it's not great to wait for (#animals * #catalog-changes) to approach
> a decent percentage of 1/64k.

Exactly.

> I was curious whether there have been similar issues in the past. Querying
> the buildfarm logs suggests not, at least not in the pg_checksums test.

That test has only been there since 2018 (b34e84f16). We've probably
accumulated a couple hundred initial-catalog-contents changes since
then, so maybe this failure arrived right on schedule :-(.

> We really ought to find a way to get to wider checksums :/

That'll just reduce the probability of failure, not eliminate it.

regards, tom lane
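
As a rough illustration of the arithmetic discussed above (not part of the original message), the sketch below estimates how likely at least one spurious checksum match becomes as independent attempts accumulate. It assumes an idealized checksum whose values are uniformly distributed, so that a deliberately corrupted page still passes with probability 2^-bits; as noted in the quoted text, the real rate for the 16-bit page checksum may well be higher. The function names and the trial counts are hypothetical and chosen only for illustration.

```python
import math


def false_pass_probability(checksum_bits: int) -> float:
    """Chance that one corrupted page still matches its checksum,
    assuming (hypothetically) uniformly distributed checksum values."""
    return 2.0 ** -checksum_bits


def prob_at_least_one_failure(trials: int, checksum_bits: int = 16) -> float:
    """Probability that at least one of `trials` independent attempts
    produces a false pass: 1 - (1 - p) ** trials.

    log1p/expm1 keep the result accurate when p is tiny (wide checksums).
    """
    p = false_pass_probability(checksum_bits)
    return -math.expm1(trials * math.log1p(-p))


if __name__ == "__main__":
    # Trial counts are illustrative stand-ins for
    # (#animals * #catalog-changes), not measured values.
    for trials in (200, 10_000, 65_536):
        for bits in (16, 32):
            prob = prob_at_least_one_failure(trials, bits)
            print(f"{trials:>6} trials, {bits}-bit checksum: {prob:.6g}")
```

Under these assumptions a wider checksum shrinks the per-attempt probability but never brings it to zero, which is the point made in the reply above.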

In response to

Re: Corruption during WAL replay at 2022-03-25 06:07:37 from Andres Freund

Responses

Re: Corruption during WAL replay at 2022-03-25 13:53:57 from Robert Haas