Re: [HACKERS] A design for amcheck heapam verification

From:	Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To:	Peter Geoghegan <pg(at)bowt(dot)ie>
Cc:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] A design for amcheck heapam verification
Date:	2018-01-11 10:14:06
Message-ID:	049AE496-791B-4C0E-8ACB-43832F9FA2B8@yandex-team.ru
Lists:	pgsql-hackers

Hello!

I like the heapam verification functionality and am already using it, so I plan to review this patch, probably this week.

Based on my use so far, I have some thoughts on the interface. Here is what I get:

# select bt_index_check('messagefiltervalue_group_id_59490523e6ee451f',true);
ERROR: XX001: heap tuple (45,21) from table "messagefiltervalue" lacks matching index tuple within index "messagefiltervalue_group_id_59490523e6ee451f"
HINT: Retrying verification using the function bt_index_parent_check() might provide a more specific error.
LOCATION: bt_tuple_present_callback, verify_nbtree.c:1316
Time: 45.668 ms

# select bt_index_check('messagefiltervalue_group_id_59490523e6ee451f');
bt_index_check
----------------

(1 row)
Time: 32.873 ms

# select bt_index_parent_check('messagefiltervalue_group_id_59490523e6ee451f');
ERROR: XX002: down-link lower bound invariant violated for index "messagefiltervalue_group_id_59490523e6ee451f"
DETAIL: Parent block=6259 child index tid=(1747,2) parent page lsn=4A0/728F5DA8.
LOCATION: bt_downlink_check, verify_nbtree.c:1188
Time: 391194.113 ms

It seems the new check runs about four orders of magnitude faster than bt_index_parent_check() (45.668 ms versus 391194.113 ms) and still finds the corruption that plain bt_index_check() missed.
From this output I can see that there is corruption, but I cannot tell:
1. What the scale of the corruption is
2. Whether these corruptions are related

I think an interface that lists all errors, or the top N errors, could be useful.
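
To make the request concrete, here is a rough sketch of the reporting behavior I have in mind. It is a toy model in Python, not amcheck code: the function name `find_missing_heap_tuples`, the `limit` parameter, and the use of plain TID sets are all assumptions made for illustration. The point is only that the check keeps going after the first mismatch and returns up to N findings instead of raising on the first one.

```python
from typing import Iterable, List, Optional, Tuple

Tid = Tuple[int, int]  # (block number, offset), as in "heap tuple (45,21)"


def find_missing_heap_tuples(
    heap_tids: Iterable[Tid],
    indexed_tids: Iterable[Tid],
    limit: Optional[int] = None,
) -> List[Tid]:
    """Return heap TIDs that have no index entry, stopping after `limit` hits.

    Hypothetical stand-in for a verification pass: the set below plays the
    role of the index, and nothing here reflects how amcheck stores or
    probes index tuples.
    """
    indexed = set(indexed_tids)
    missing: List[Tid] = []
    for tid in heap_tids:
        if tid not in indexed:
            # Record the problem and continue instead of aborting.
            missing.append(tid)
            if limit is not None and len(missing) >= limit:
                break
    return missing


# Made-up data: two heap tuples are absent from the index.
heap = [(45, 20), (45, 21), (45, 22), (46, 1)]
index = [(45, 20), (46, 1)]

all_missing = find_missing_heap_tuples(heap, index)
first_only = find_missing_heap_tuples(heap, index, limit=1)
```

With a report like `all_missing`, the number of entries gives a sense of the scale, and neighboring TIDs hint at whether the problems are related; `limit=1` corresponds to today's stop-at-first-error behavior.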

> On 14 Dec 2017, at 0:02, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>>
>> This could also test the reproducibility of the tests with a fixed
>> seed number and at least two rounds, a low number of elements could be
>> more appropriate to limit the run time.
>
> The runtime is already dominated by pg_regress overhead. As it says in
> the README, using a fixed seed in the test harness is pointless,
> because it won't behave in a fixed way across platforms. As long as we
> cannot ensure deterministic behavior, we may as well fully embrace
> non-determinism.
I think that determinism across platforms is not as important as determinism across runs.
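
What I mean by determinism across runs can be sketched as follows. This is an illustration in Python, not the patch's test harness; `make_test_elements` and the way the seed is chosen and printed are assumptions. The seed is picked freshly each time but reported, so a failing run can be repeated on the same machine with the same inputs.

```python
import random
from typing import List


def make_test_elements(seed: int, count: int) -> List[int]:
    """Generate `count` pseudo-random 64-bit test elements from `seed`."""
    rng = random.Random(seed)  # private generator, unaffected by global state
    return [rng.getrandbits(64) for _ in range(count)]


# Choose a fresh seed for this run, but report it so the run can be replayed.
seed = random.SystemRandom().getrandbits(32)
print(f"seed = {seed}")

first_run = make_test_elements(seed, 1000)
replay = make_test_elements(seed, 1000)  # same seed, same elements
```

This keeps the non-determinism between independent runs while still letting a specific failure be reproduced by passing the logged seed back in.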

Thanks for the amcheck! It is very useful.

Best regards, Andrey Borodin.

In response to

Re: [HACKERS] A design for amcheck heapam verification at 2017-12-13 19:02:44 from Peter Geoghegan

Responses

Re: [HACKERS] A design for amcheck heapam verification at 2018-01-12 09:41:11 from Andrey Borodin
Re: [HACKERS] A design for amcheck heapam verification at 2018-01-22 22:01:15 from Peter Geoghegan
