pgsql: Don't rely on estimates for amcheck Bloom filters.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Don't rely on estimates for amcheck Bloom filters.
Date: 2019-07-20 18:12:05
Message-ID: E1hotq1-0003A1-I7@gemulon.postgresql.org
Lists: pgsql-committers

Don't rely on estimates for amcheck Bloom filters.

Solely relying on a relation's reltuples/relpages estimate to size the
Bloom filters used by amcheck verification makes verification less
effective when the estimates are very stale. In extreme cases,
verification options that use Bloom filters internally could be totally
ineffective, without users receiving any clear indication that certain
types of corruption might easily be missed.

To fix, use RelationGetNumberOfBlocks() instead of relpages to size the
downlink block Bloom filter. Use the same RelationGetNumberOfBlocks()
value to derive a minimum size for the heapallindexed Bloom filter,
rather than completely trusting reltuples. Verification will still be
reasonably effective when the projected/estimated number of Bloom filter
elements is at least 1/5 of the final number of elements, which is
assured by the new sizing logic.
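
The sizing rule described above might be sketched roughly as follows. This is a standalone illustration, not the code from verify_nbtree.c: the function names, the SKETCH_MAX_TUPLES_PER_PAGE constant, and its value of 400 are all hypothetical stand-ins, and the real block count is passed in as a plain argument instead of being fetched with RelationGetNumberOfBlocks(). It assumes that the 1/5 guarantee comes from treating every block as holding at least one fifth of the largest number of tuples a page can contain.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the largest number of index tuples that can fit
 * on one page.  The value is an assumption chosen for illustration only.
 */
#define SKETCH_MAX_TUPLES_PER_PAGE 400

/*
 * Downlink block filter: one element per block, sized from the actual
 * number of blocks rather than from the relpages estimate.
 */
static int64_t
sketch_downlink_filter_elems(int64_t actual_blocks)
{
    assert(actual_blocks >= 0);
    return actual_blocks;
}

/*
 * heapallindexed filter: start from the reltuples estimate, but never go
 * below a floor derived from the actual number of blocks.  The floor assumes
 * each block holds at least 1/5 of the per-page maximum, so the projected
 * element count cannot fall below 1/5 of the true count.
 */
static int64_t
sketch_heapallindexed_filter_elems(int64_t actual_blocks,
                                   double reltuples_estimate)
{
    int64_t floor_elems;
    int64_t estimated_elems;

    assert(actual_blocks >= 0);

    floor_elems = actual_blocks * (SKETCH_MAX_TUPLES_PER_PAGE / 5);

    /* Treat a missing or negative estimate as zero. */
    estimated_elems = reltuples_estimate > 0 ? (int64_t) reltuples_estimate : 0;

    return estimated_elems > floor_elems ? estimated_elems : floor_elems;
}
```

With these assumed numbers, a 1000-block index whose reltuples estimate is a stale zero still gets a filter sized for 80000 elements, while an estimate that is already above the floor is used unchanged.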

Reported-By: Alexander Korotkov
Discussion: https://postgr.es/m/CAH2-Wzk0ke2J42KrNYBKu0Xovjy-sU5ub7PWjgpbsKdAQcL4OA@mail.gmail.com
Backpatch: 11-, where downlink/heapallindexed verification were added.

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/894af78f185afee221a6762a1a49057043b7bbf5

Modified Files
--------------
contrib/amcheck/verify_nbtree.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
