Re: new heapcheck contrib module

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new heapcheck contrib module
Date: 2020-05-13 23:32:54
Message-ID: 20200513233254.GA32158@alvherre.pgsql
Lists: pgsql-hackers

On 2020-May-13, Peter Geoghegan wrote:

> On Wed, May 13, 2020 at 3:10 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> > Hmm. I think we should (try to?) write code that avoids all crashes
> > with production builds, but not extend that to assertion failures.
>
> Assertions are only a problem at all because Mark would like to write
> tests that involve a selection of truly corrupt data. That's a new
> requirement, and one that I have my doubts about.

I agree that this (a test tool that exercises our code against
arbitrarily corrupted data pages) is not going to work as a test that
all buildfarm members run -- it seems something for specialized
buildfarm members to run, or even something that's run outside of the
buildfarm, like sqlsmith. Obviously such a tool would not be able to
run against an assertion-enabled build, and we shouldn't even try.

> I would be willing to make a larger effort to avoid crashing a
> backend, since that affects production. I might go to some effort to
> not crash with downright adversarial inputs, for example. But it seems
> inappropriate to take extreme measures just to avoid a crash with
> extremely contrived inputs that will probably never occur. My sense is
> that this is subject to sharply diminishing returns. Completely
> nailing down hard crashes from corrupt data seems like the wrong
> priority, at the very least. Pursuing that objective over other
> objectives sounds like zero-risk bias.

I think my initial approach for this would be to use a fuzzing tool that
generates data blocks semi-randomly, then uses them as Postgres data
pages somehow, and see what happens -- examine any resulting crashes and
make individual judgement calls about the fix(es) necessary to prevent
each of them. I expect that many such pages would be rejected as
corrupt by page header checks.
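To illustrate the expectation in that last sentence, here is a rough sketch of the kind of fuzzing loop described, with a simplified analogue of PostgreSQL's page header sanity checks (the real logic lives in PageHeaderIsValid() in src/backend/storage/page/bufpage.c; the constants and the helper below are illustrative assumptions, not the server's actual code):

```python
import random
import struct

BLCKSZ = 8192                 # PostgreSQL's default block size
SIZE_OF_PAGE_HEADER = 24      # offsetof(PageHeaderData, pd_linp) on typical builds

def header_looks_sane(page: bytes) -> bool:
    """Rough analogue of PageHeaderIsValid(): pointer-ordering sanity only.

    An illustration, not PostgreSQL's actual code; it omits the version
    and flag checks the server also performs.
    """
    # pd_lower, pd_upper, pd_special are 16-bit fields at byte offsets 12, 14, 16
    pd_lower, pd_upper, pd_special = struct.unpack_from("<HHH", page, 12)
    return (SIZE_OF_PAGE_HEADER <= pd_lower <= pd_upper <= pd_special <= BLCKSZ
            and pd_special % 8 == 0)   # MAXALIGN on common platforms

def random_page(rng: random.Random) -> bytes:
    """Generate one semi-random data block, as a fuzzer might."""
    return rng.randbytes(BLCKSZ)

rng = random.Random(42)
rejected = sum(not header_looks_sane(random_page(rng)) for _ in range(1000))
print(f"{rejected}/1000 random pages rejected by header checks")
```

As the message predicts, almost every uniformly random block fails these cheap checks before any tuple-level code would run; the interesting fuzzing cases are blocks crafted to pass the header checks while being corrupt further in.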

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
