Re: Does PostgreSQL check database integrity at startup?

From: Edson Carlos Ericksson Richter <richter(at)simkorp(dot)com(dot)br>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Does PostgreSQL check database integrity at startup?
Date: 2017-12-28 18:35:38
Message-ID: 2bdeab7c-dc99-eb60-84d7-2d47f3c9f180@simkorp.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Em 28/12/2017 16:26, Stephen Frost escreveu:
> Greetings Brent,
>
> * Brent Wood (pcreso(at)yahoo(dot)com) wrote:
>> A tool to calculate a checksum of sorts based on the table (file) content would provide a better surety of duplication than simply checking file size - like differently vacuumed tables in each copy could have the same content but be different file sizes.
> PG has support for checksums and there are tools out there to validate
> that the checksum is correct for all pages which have one, but that
> wouldn't help in this case because the file is zero'd out (and a zero'd
> out file is actually a valid file in a PG data directory).
>
> Also, the files on the primary and the replica actually can be different
> when looked at with a complete-file checksum due to hint bits being set
> differently (at least, possibly other ways too). That doesn't make them
> invalid or incorrect though.
>
> Rather than trying to compare a running primary to a running replica,
> such a check to verify that the files backed up during a backup have the
> same checksum as the files being restored from that backup can be done,
> and that *is* done in at least some of the PG backup tools already
> (pgBackRest has an independent manifest that it stores for each backup
> which contains the checksum of each file as-backed-up, and it verifies
> that checksum when performing a restore to make sure that the backed up
> file wasn't corrupted in place, other tools hopefully have similar).
>
>> I do wonder though - given the order of records in a table (file) is not necessarily identical (or is it?) event this may be problematic. Perhaps a checksum based on the result of a query output ordered by primary key could work?
> The order of records in a *file* should be the same in the heap on the
> primary as they are on the replica, but that doesn't mean the contents
> of those files will be exactly the same (as mentioned above, hint bits
> can differ). We used to have cases where the indexes could also be
> different, but I believe that was changed so they should match.
>
> I've used the approach of doing a checksum across the results of an
> ordered query to compare between systems and that generally does work,
> but it's a bit tricky if you're trying to compare a table that's heavily
> modified- you need to determine the point in the WAL stream that you're
> at on the primary when you run the query and then replay the replica to
> that point in the WAL and then run the query on the replica, otherwise
> you could end up with differences that are just because of the ongoing
> transactions being run to update the table.
>
> Thanks!
>
> Stephen

Stephen, thanks for you detailed reasoning on why this would not be so
simple.

I'll take all of that in consideration. I hope I find something that
gives a bit more confidence that my replicas are enought reliable. At
this moment, I just don't trust them.

Regards,

Edson

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Paul A Jungwirth 2017-12-28 18:59:11 Temporal Foreign Keys
Previous Message Edson Carlos Ericksson Richter 2017-12-28 18:33:10 Re: Does PostgreSQL check database integrity at startup?