Re: amcheck (B-Tree integrity checking tool)

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Subject: Re: amcheck (B-Tree integrity checking tool)
Date: 2016-03-11 23:50:10
Message-ID: 56E359B2.1090105@BlueTreble.com
Lists: pgsql-hackers

On 3/11/16 3:31 PM, Peter Geoghegan wrote:
>> Can we come up with names that more clearly identify the difference
>> between those two functions? I mean, _parent_ does not make it
>> particularly obvious that the second function acquires exclusive lock
>> and performs more thorough checks.
> Dunno about that. Its defining characteristic is that it checks child
> pages against their parent IMV. Things are not often defined in terms
> of their locking requirements.

First, thanks for your work on this. I've wanted it in the past.

I agree the name isn't very clear. Perhaps _recurse?

I also agree that the module name isn't very clear. If this is meant to
be the start of a generic consistency checker, let's call it that.
Otherwise, it should be marked as being specific to btrees, because
presumably we might eventually want similar tools for GIN, etc. (FWIW
I'd vote for a general consistency checker).

I know the vacuum race condition would be very rare, but I don't think
it can be ignored. Last thing you want out of a consistency checker is
false negatives/positives. I do think it would be reasonable to just
wholesale block against concurrent vacuums, but I don't think there's
any reasonable way to do that.

I would prefer the ability to do something other than raising an error
when corruption is found, so that you could find all corruption in an
index. Obviously it could log at a different level instead. Another option would be
SRFs that return info about all the corruption found, but that's
probably overkill.
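
To make the difference between the two reporting modes concrete, here is a minimal sketch in Python rather than the C the module would actually be written in. Everything in it is hypothetical: the names check_pages, Finding and CorruptionError are made up for illustration, and the "pages" are plain lists of integer keys, not real B-Tree pages. It only shows the control flow: one mode stops at the first problem, the other keeps going and returns every problem found.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    page: int      # index of the page where the problem was seen
    message: str   # human-readable description of the problem


class CorruptionError(Exception):
    """Raised in stop-on-first mode (stands in for raising an ERROR)."""


def check_pages(pages: List[List[int]], stop_on_first: bool = True) -> List[Finding]:
    """Toy invariant check over a leaf level modelled as lists of keys.

    Invariants checked (assumed for illustration only):
      1. keys within a page are in non-decreasing order;
      2. the last key of a page does not exceed the first key of the next.
    """
    findings: List[Finding] = []

    def report(page: int, message: str) -> None:
        # Either abort immediately or remember the problem and continue.
        if stop_on_first:
            raise CorruptionError(f"page {page}: {message}")
        findings.append(Finding(page, message))

    for i, keys in enumerate(pages):
        # Invariant 1: adjacent keys on the same page.
        for a, b in zip(keys, keys[1:]):
            if a > b:
                report(i, f"keys out of order: {a} > {b}")
        # Invariant 2: boundary between this page and its right sibling.
        if i + 1 < len(pages) and keys and pages[i + 1]:
            nxt = pages[i + 1][0]
            if keys[-1] > nxt:
                report(i, f"last key {keys[-1]} > first key {nxt} of next page")
    return findings
```

Calling check_pages(pages, stop_on_first=False) returns the full list of findings, which is roughly what an SRF would hand back one row at a time; the default mode mirrors the current raise-an-error behavior.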

It'd be nice if you had the option to obey vacuum_cost_delay when
running this, but that's clearly just a nice-to-have (or maybe just obey
it all the time, since it defaults to 0).
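
For what I mean by obeying the cost delay, a rough sketch of cost-based throttling follows, again in Python and again purely illustrative. The class name CostThrottle and the per-page cost passed to charge() are assumptions of this sketch, not anything from the patch or from the backend. The default values mirror vacuum_cost_delay = 0 and vacuum_cost_limit = 200; with a delay of 0 the throttle never sleeps, which is why obeying it unconditionally would be harmless by default.

```python
import time
from typing import Callable


class CostThrottle:
    """Accumulate a cost balance and nap once it reaches the limit."""

    def __init__(self, cost_delay_ms: float = 0.0, cost_limit: int = 200,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.cost_delay_ms = cost_delay_ms  # 0 disables throttling entirely
        self.cost_limit = cost_limit        # balance that triggers a nap
        self.sleep = sleep                  # injectable so tests need not wait
        self.balance = 0                    # cost accumulated since last nap
        self.naps = 0                       # number of naps taken so far

    def charge(self, cost: int) -> None:
        """Record the cost of one unit of work, napping if over the limit."""
        if self.cost_delay_ms <= 0:
            return  # throttling disabled: do not even track the balance
        self.balance += cost
        if self.balance >= self.cost_limit:
            self.sleep(self.cost_delay_ms / 1000.0)
            self.naps += 1
            self.balance = 0
```

The checker would call charge() once per page it reads, with whatever cost is deemed appropriate for a page access.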
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
