Re: pg_amcheck contrib application

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, Amul Sul <sulamul(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_amcheck contrib application
Date: 2021-03-16 06:09:13
Message-ID: 20210316060913.GA3323238@rfd.leadboat.com
Lists: pgsql-hackers

On Mon, Mar 15, 2021 at 02:57:20PM -0400, Tom Lane wrote:
> Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> writes:
> > On Mar 15, 2021, at 10:04 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> These animals have somewhat weird alignment properties: MAXALIGN is 8
> >> but ALIGNOF_DOUBLE is only 4. I speculate that that is affecting their
> >> choices about whether an out-of-line TOAST value is needed, breaking
> >> this test case.

That machine also has awful performance for filesystem metadata operations,
like open(O_CREAT). Its CPU and read()/write() performance are normal.

> > The logic in verify_heapam only looks for a value in the toast table if
> > the tuple it gets from the main table (in this case, from pg_statistic)
> > has an attribute that claims to be toasted. The error message we're
> > seeing that you quoted above simply means that no entry exists in the
> > toast table.
>
> Yeah, that could be phrased better. Do we have a strong enough lock on
> the table under examination to be sure that autovacuum couldn't remove
> a dead toast entry before we reach it? But this would only be an
> issue if we are trying to check validity of toasted fields within
> known-dead tuples, which I would argue we shouldn't, since lock or
> no lock there's no guarantee the toast entry is still there.
>
> Not sure that I believe the theory that this is from bad luck of
> concurrent autovacuum timing, though.

With autovacuum_naptime=1s, on hornet, the failure reproduced twice in twelve
runs. With v6-0001-Turning-off-autovacuum-during-corruption-tests.patch
applied, 196 runs all succeeded.
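
As a rough sanity check on how strongly those counts separate the two
configurations, here is a minimal sketch. It assumes the runs are independent
and that the 2-in-12 rate would have applied unchanged to the 196 patched
runs; both are assumptions for illustration, not measurements, and twelve
runs give only a coarse estimate of the rate. The helper name prob_all_pass
is hypothetical.

```python
from fractions import Fraction

# Observed with autovacuum_naptime=1s on hornet: 2 failures in 12 runs.
failures, runs = 2, 12
p_fail = Fraction(failures, runs)

# Runs that all succeeded with the autovacuum-off patch applied.
clean_runs = 196


def prob_all_pass(p_fail, n):
    """Probability that n independent runs all pass when each run
    fails with probability p_fail (independence is an assumption)."""
    return float((1 - p_fail) ** n)


p = prob_all_pass(p_fail, clean_runs)
print(f"P(196 clean runs at an unchanged failure rate) = {p:.1e}")
```

Under those assumptions the probability is far below one in a trillion, so
196 clean runs would be very unlikely if the patch had no effect on the
failure rate. This says nothing about whether architecture also plays a part.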

> The fact that we're seeing
> this on just those two animals suggests strongly to me that it's
> architecture-correlated, instead.

That is possible.
