Re: Corruption during WAL replay

From: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, deniel1495(at)mail(dot)ru, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, tejeswarm(at)hotmail(dot)com, hlinnaka <hlinnaka(at)iki(dot)fi>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Daniel Wood <hexexpert(at)comcast(dot)net>
Subject: Re: Corruption during WAL replay
Date: 2022-03-25 16:26:52
Message-ID: 87fsn6q9pv.fsf@wibble.ilmari.org
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> ... It's not
>> like a 16-bit checksum was state-of-the-art even when we introduced
>> it. We just did it because we had 2 bytes that we could repurpose
>> relatively painlessly, and not any larger number. And that's still the
>> case today, so at least in the short term we will have to choose some
>> other solution to this problem.
>
> Indeed. I propose the attached, which also fixes the unsafe use
> of seek() alongside syswrite(), directly contrary to what "man perlfunc"
> says to do.

LGTM, but it would be good to include $! in the die messages.
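
For illustration, a minimal standalone sketch of what that could look like. It is untested against the actual patch: the temporary file and the $path variable are hypothetical stand-ins for the relation file, and the error check on open() is an addition that is not in the quoted diff.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the relation file: a single zeroed 8kB page.
my ($tmpfh, $path) = tempfile(UNLINK => 1);
binmode $tmpfh;
print $tmpfh "\0" x 8192;
close $tmpfh or die "close failed: $!";

# Same steps as the quoted hunk, with $! appended to each die message
# so that the OS-level reason for a failure shows up in the test log.
open my $file, '+<', $path or die "open($path) failed: $!";
binmode $file;
my $pageheader;
sysread($file, $pageheader, 24) or die "sysread failed: $!";
# This inverts the pd_checksum field (only); see struct PageHeaderData
$pageheader ^= "\0\0\0\0\0\0\0\0\xff\xff";
# sysseek returns "0 but true" for offset 0, so "or die" is safe here
sysseek($file, 0, 0) or die "sysseek failed: $!";
syswrite($file, $pageheader) or die "syswrite failed: $!";
close $file or die "close failed: $!";
```

With $! in the message, a failure reads like "sysread failed: Permission denied" rather than just "sysread failed", which is usually enough to tell an environment problem from a logic bug.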

- ilmari

> regards, tom lane
>
> diff --git a/src/bin/pg_checksums/t/002_actions.pl b/src/bin/pg_checksums/t/002_actions.pl
> index 62c608eaf6..8c70453a45 100644
> --- a/src/bin/pg_checksums/t/002_actions.pl
> +++ b/src/bin/pg_checksums/t/002_actions.pl
> @@ -24,6 +24,7 @@ sub check_relation_corruption
> my $tablespace = shift;
> my $pgdata = $node->data_dir;
>
> + # Create table and discover its filesystem location.
> $node->safe_psql(
> 'postgres',
> "CREATE TABLE $table AS SELECT a FROM generate_series(1,10000) AS a;
> @@ -37,9 +38,6 @@ sub check_relation_corruption
> my $relfilenode_corrupted = $node->safe_psql('postgres',
> "SELECT relfilenode FROM pg_class WHERE relname = '$table';");
>
> - # Set page header and block size
> - my $pageheader_size = 24;
> - my $block_size = $node->safe_psql('postgres', 'SHOW block_size;');
> $node->stop;
>
> # Checksums are correct for single relfilenode as the table is not
> @@ -55,8 +53,12 @@ sub check_relation_corruption
>
> # Time to create some corruption
> open my $file, '+<', "$pgdata/$file_corrupted";
> - seek($file, $pageheader_size, SEEK_SET);
> - syswrite($file, "\0\0\0\0\0\0\0\0\0");
> + my $pageheader;
> + sysread($file, $pageheader, 24) or die "sysread failed";
> + # This inverts the pd_checksum field (only); see struct PageHeaderData
> + $pageheader ^= "\0\0\0\0\0\0\0\0\xff\xff";
> + sysseek($file, 0, 0) or die "sysseek failed";
> + syswrite($file, $pageheader) or die "syswrite failed";
> close $file;
>
> # Checksum checks on single relfilenode fail
