Re: Online verification of checksums

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: Online verification of checksums
Date: 2018-09-26 14:54:45
Message-ID: 20180926145445.GZ4184@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Michael Banck (michael(dot)banck(at)credativ(dot)de) wrote:
> On Wednesday, 2018-09-26 at 13:23 +0200, Fabien COELHO wrote:
> > There are debatable changes of behavior:
> >
> > if (errno == ENOENT) return / continue...
> >
> > For instance, a file disappearing is ok online, but not so if offline. On
> > the other hand, the probability that a file suddenly disappears while the
> > server is offline looks remote, so reporting such issues does not seem
> > useful.
> >
> > However, I'm more wary of the other continues/skips added. ISTM that skipping
> > a block because of a read error, or because it is new, or some other
> > reasons, is not the same thing, so should be counted & reported
> > differently?
>
> I think that would complicate things further without a lot of benefit.
>
> After all, we are interested in checksum failures, not necessarily read
> failures etc., so exiting on them (and thus skipping possibly large parts
> of PGDATA) looks undesirable to me.
>
> So I have made no changes to this part so far; what do others think
> about this?

I certainly don't see a lot of point in doing much more than what was
discussed previously for 'new' blocks (counting them as skipped and
moving on).

An actual read() error (that is, a failure on a read() call such as
getting back EIO), on the other hand, is something which I'd probably
report back to the user immediately and then move on, and perhaps
report again at the end.

Note that a short read isn't an error and falls under the 'new' blocks
discussion above.
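
To make the three cases concrete, here is a rough sketch in C of how a scan loop might tell them apart. It is illustrative only and is not the code from the patch: the names classify_read, scan_file, report_summary and ScanStats are hypothetical, BLCKSZ is assumed to be the default 8192, and pread() is used in place of read() so that moving on to the next block after an error does not depend on the file position. The checksum verification itself is left out.

```c
#define _POSIX_C_SOURCE 200809L  /* for pread() and fileno() */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192  /* assumption: the default PostgreSQL block size */

typedef enum { BLOCK_OK, BLOCK_SKIPPED, BLOCK_READ_ERROR } BlockStatus;

/* Hypothetical counters, kept separately so each case can be reported. */
typedef struct
{
    long blocks_read;     /* full blocks, eligible for checksum verification */
    long blocks_skipped;  /* short reads, treated like 'new' blocks */
    long read_errors;     /* calls that actually failed, e.g. with EIO */
} ScanStats;

/* Classify the return value of a single block-sized read. */
BlockStatus
classify_read(ssize_t nread)
{
    if (nread < 0)
        return BLOCK_READ_ERROR;  /* real failure, errno is set */
    if (nread != BLCKSZ)
        return BLOCK_SKIPPED;     /* short read: not an error */
    return BLOCK_OK;
}

/* Scan one file block by block, never aborting on a single bad block. */
void
scan_file(int fd, const char *path, ScanStats *stats)
{
    char        buf[BLCKSZ];
    struct stat st;
    long        nblocks;
    long        blkno;

    assert(stats != NULL);

    if (fstat(fd, &st) < 0)
    {
        fprintf(stderr, "could not stat \"%s\": %s\n", path, strerror(errno));
        stats->read_errors++;
        return;
    }

    /* Round up so that a trailing partial block is visited as well. */
    nblocks = (long) ((st.st_size + BLCKSZ - 1) / BLCKSZ);

    for (blkno = 0; blkno < nblocks; blkno++)
    {
        ssize_t nread = pread(fd, buf, BLCKSZ, (off_t) blkno * BLCKSZ);

        switch (classify_read(nread))
        {
            case BLOCK_READ_ERROR:
                /* Report immediately, count it, and move on. */
                fprintf(stderr, "could not read block %ld of \"%s\": %s\n",
                        blkno, path, strerror(errno));
                stats->read_errors++;
                break;
            case BLOCK_SKIPPED:
                stats->blocks_skipped++;
                break;
            case BLOCK_OK:
                /* Checksum verification of buf would go here. */
                stats->blocks_read++;
                break;
        }
    }
}

/* Report the totals again once the whole scan is done. */
void
report_summary(const ScanStats *stats)
{
    fprintf(stderr, "blocks read: %ld, skipped: %ld, read errors: %ld\n",
            stats->blocks_read, stats->blocks_skipped, stats->read_errors);
}
```

The point of the separate counters is only that a failed read and a short read end up in different buckets; how the final report is worded is a detail.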

Thanks!

Stephen
