Re: Hash Indexes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash Indexes
Date: 2016-12-14 09:27:40
Message-ID: CAA4eK1LPC68kEaeBLKx-4w5Lid0a=5ddyifY=c1y+N5WWfYR9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 13, 2016 at 11:30 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Dec 12, 2016 at 9:21 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> The reason is to make the operations consistent in master and standby.
>> In WAL patch, for clearing the SPLIT_CLEANUP flag, we write a WAL and
>> if we don't release the lock after writing a WAL, the operation can
>> appear on standby even before on master. Currently, without WAL,
>> there is no benefit of doing so and we can fix by using
>> MarkBufferDirty, however, I thought it might be simpler to keep it
>> this way as this is required for enabling WAL. OTOH, we can leave
>> that for WAL patch. Let me know which way you prefer?
>
> It's not required for enabling WAL. You don't *have* to release and
> reacquire the buffer lock; in fact, that just adds overhead.
>

If we don't release the lock, then it will break the general coding
pattern of writing WAL (Acquire pin and an exclusive lock,
Markbufferdirty, Write WAL, Release Lock). Basically, I think it is
to ensure that we don't hold it for multiple atomic operations or in
this case avoid calling MarkBufferDirty multiple times.

> You *do*
> have to be aware that the standby will perhaps see the intermediate
> state, because it won't hold the lock throughout. But that does not
> mean that the master must release the lock.
>

Okay, but I think that will be avoided to a great extent because we do
follow the practice of releasing the lock immediately after writing
the WAL.

>>> I don't think we should be afraid to call MarkBufferDirty() instead of
>>> going through these (fairly stupid) hasham-specific APIs.
>>
>> Right and anyway we need to use it at many more call sites when we
>> enable WAL for hash index.
>
> I propose the attached patch, which removes _hash_wrtbuf() and causes
> those functions which previously called it to do MarkBufferDirty()
> directly.
>

It is possible that we can MarkBufferDirty multiple times (twice in
hashbucketcleanup and then in _hash_squeezebucket) while holding the
lock. I don't think that is a big problem as of now but wanted to
avoid it as I thought we need it for WAL patch.

> Aside from hopefully fixing the bug we're talking about
> here, this makes the logic in several places noticeably less
> contorted.
>

Yeah, it will fix the problem in hashbucketcleanup, but there are two
other problems that need to be fixed for which I can send a separate
patch. By the way, as mentioned to you earlier that WAL patch already
takes care of removing _hash_wrtbuf and simplified the logic wherever
possible without introducing the logic of MarkBufferDirty multiple
times under one lock. However, if you want to proceed with the
current patch, then I can send you separate patches for the remaining
problems as addressed in bug fix patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rahila Syed 2016-12-14 10:02:34 Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.
Previous Message Heikki Linnakangas 2016-12-14 08:51:55 pg_authid.rolpassword format (was Re: Password identifiers, protocol aging and SCRAM protocol)