Quick Links

Re: Write Ahead Logging for Hash Indexes

From:	Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Write Ahead Logging for Hash Indexes
Date:	2016-09-08 04:32:02
Message-ID:	844a643b-fe3d-7d9a-5040-570674fe9321@catalyst.net.nz
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 07/09/16 21:58, Amit Kapila wrote:

> On Wed, Aug 24, 2016 at 10:32 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> On Tue, Aug 23, 2016 at 10:05 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>> wrote:
>>> On Wed, Aug 24, 2016 at 2:37 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>>
>>>> After an intentionally created crash, I get an Assert triggering:
>>>>
>>>> TRAP: FailedAssertion("!(((freep)[(bitmapbit)/32] &
>>>> (1<<((bitmapbit)%32))))", File: "hashovfl.c", Line: 553)
>>>>
>>>> freep[0] is zero and bitmapbit is 16.
>>>>
>>> Here what is happening is that when it tries to clear the bitmapbit,
>>> it expects it to be set. Now, I think the reason for why it didn't
>>> find the bit as set could be that after the new overflow page is added
>>> and the bit corresponding to it is set, you might have crashed the
>>> system and the replay would not have set the bit. Then while freeing
>>> the overflow page it can hit the Assert as mentioned by you. I think
>>> the problem here could be that I am using REGBUF_STANDARD to log the
>>> bitmap page updates which seems to be causing the issue. As bitmap
>>> page doesn't follow the standard page layout, it would have omitted
>>> the actual contents while taking full page image and then during
>>> replay, it would not have set the bit, because page doesn't need REDO.
>>> I think here the fix is to use REGBUF_NO_IMAGE as we use for vm
>>> buffers.
>>>
>>> If you can send me the detailed steps for how you have produced the
>>> problem, then I can verify after fixing whether you are seeing the
>>> same problem or something else.
>>
>>
>> The test is rather awkward, it might be easier to just have me test it.
>>
> Okay, I have fixed this issue as explained above. Apart from that, I
> have fixed another issue reported by Mark Kirkwood upthread and few
> other issues found during internal testing by Ashutosh Sharma.
>
> The locking issue reported by Mark and Ashutosh is that the patch
> didn't maintain the locking order while adding overflow page as it
> maintains in other write operations (lock the bucket pages first and
> then metapage to perform the write operation). I have added the
> comments in _hash_addovflpage() to explain the locking order used in
> modified patch.
>
> During stress testing with pgbench using master-standby setup, we
> found an issue which indicates that WAL replay machinery doesn't
> expect completely zeroed pages (See explanation of RBM_NORMAL mode
> atop XLogReadBufferExtended). Previously before freeing the overflow
> page we were zeroing it, now I have changed it to just initialize the
> page such that the page will be empty.
>
> Apart from above, I have added support for old snapshot threshold in
> the hash index code.
>
> Thanks to Ashutosh Sharma for doing the testing of the patch and
> helping me in analyzing some of the above issues.
>
> I forgot to mention in my initial mail that Robert and I had some
> off-list discussions about the design of this patch, many thanks to
> him for providing inputs.
>
>

Repeating my tests with these new patches applied points to the hang
issue being solved. I tested several 10 minute runs (any of which was
enough to elicit the hang previously). I'll do some longer ones, but
looks good!

regards

Mark

In response to

Re: Write Ahead Logging for Hash Indexes at 2016-09-07 09:58:34 from Amit Kapila

Responses

Re: Write Ahead Logging for Hash Indexes at 2016-09-08 04:37:44 from Amit Kapila

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Kapila	2016-09-08 04:32:44	Re: Hash Indexes
Previous Message	Michael Paquier	2016-09-08 01:23:35	Useless dependency assumption libxml2 -> libxslt in MSVC scripts