From: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
---|---|
To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: new heapcheck contrib module |
Date: | 2020-06-12 21:06:18 |
Message-ID: | 885D2EC9-7D06-4903-A17D-72C184A1E1C5@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> On Jun 11, 2020, at 11:35 PM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Fri, Jun 12, 2020 at 12:40 AM Mark Dilger
> <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>>
>>
>>
>>> On Jun 11, 2020, at 9:14 AM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>>>
>>> I have just browsed through the patch and the idea is quite
>>> interesting. I think we can expand it to check that whether the flags
>>> set in the infomask are sane or not w.r.t other flags and xid status.
>>> Some examples are
>>>
>>> - If HEAP_XMAX_LOCK_ONLY is set in infomask then HEAP_KEYS_UPDATED
>>> should not be set in new_infomask2.
>>> - If HEAP_XMIN(XMAX)_COMMITTED is set in the infomask then can we
>>> actually cross verify the transaction status from the CLOG and check
>>> whether is matching the hint bit or not.
>>>
>>> While browsing through the code I could not find that we are doing
>>> this kind of check, ignore if we are already checking this.
>>
>> Thanks for taking a look!
>>
>> Having both of those bits set simultaneously appears to fall into a different category than what I wrote verify_heapam.c to detect.
>
> Ok
>
>
>> It doesn't violate any assertion in the backend, nor does it cause
>> the code to crash. (At least, I don't immediately see how it does
>> either of those things.) At first glance it appears invalid to have
>> those bits both set simultaneously, but I'm hesitant to enforce that
>> without good reason. If it is a good thing to enforce, should we also
>> change the backend code to Assert?
>
> Yeah, it may not hit assert or crash but it could lead to a wrong
> result. But I agree that it could be an assertion in the backend
> code.
For v7, I've added an assertion for this. Per heap/README.tuplock, "We currently never set the HEAP_XMAX_COMMITTED when the HEAP_XMAX_IS_MULTI bit is set." I added an assertion for that, too. Both new assertions are in RelationPutHeapTuple(). I'm not sure if that is the best place to put the assertion, but I am confident that the assertion needs to only check tuples destined for disk, as in memory tuples can and do violate the assertion.
Also for v7, I've updated contrib/amcheck to report these two conditions as corruption.
> What about the other check, like hint bit is saying the
> transaction is committed but actually as per the clog the status is
> something else. I think in general processing it is hard to check
> such things in backend no? because if the hint bit is set saying that
> the transaction is committed then we will directly check its
> visibility with the snapshot. I think a corruption checker may be a
> good tool for catching such anomalies.
I already made some design changes to this patch to avoid taking the CLogTruncationLock too often. I'm happy to incorporate this idea, but perhaps you could provide a design on how to do it without all the extra locking? If not, I can try to get this into v8 as an optional check, so users can turn it on at their discretion. Having the check enabled by default is probably a non-starter.
Attachment | Content-Type | Size |
---|---|---|
v7-0001-Adding-verify_heapam-and-pg_amcheck.patch | application/octet-stream | 155.6 KB |
v7-0002-Adding-checks-of-invalid-combinations-of-hint-bit.patch | application/octet-stream | 5.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Fabien COELHO | 2020-06-12 21:26:42 | Re: Internal key management system |
Previous Message | Fabien COELHO | 2020-06-12 20:59:37 | Re: Internal key management system |