Quick Links

Re: [HACKERS] ginInsertCleanup called from vacuum could still miss tuples to be deleted

From:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To:	Peter Geoghegan <pg(at)bowt(dot)ie>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ginInsertCleanup called from vacuum could still miss tuples to be deleted
Date:	2017-11-14 02:56:00
Message-ID:	CAD21AoBwRsOJF+na0NPCXrO_5MGD2_B0SrggJ11T5iXd3qaB3g@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Nov 14, 2017 at 10:48 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Mon, Nov 13, 2017 at 5:07 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> I've been suspicious of that commit (and related commits) for a while
>>> now [1]. I think that it explains a couple of different problem
>>> reports that we have seen.
>>
>> Yeah, the problem here is that vacuum and analyze don't acquire a
>> heavy weight lock for meta page using properly function. it seems not
>> relevant with that problem.
>
> One thing that really bothers me about commit e2c79e14 is that
> LockPage() is called, not LockBuffer(). GIN had no LockPage() calls
> before that commit, and is now the only code in the entire system that
> calls LockPage()/ConditionalLockPage() (the hash am no longer uses
> page heavyweight locks following recent work there).
>
> Historically, the only reason that an AM would ever call LockPage()
> instead of LockBuffer() (at least after LWLocks were introduced many
> years ago) was that there was a small though acceptable risk of
> deadlock for concurrent inserters. It would hardly ever happen, but
> the possibility could not be ruled out, so deadlock detection was
> required. This was definitely true of hash, which is one reason why it
> was a second class index am for so long. I think this was even true of
> nbtree, up until about 15 years ago.
>
> The high level question that I have to ask about e2c79e14 is: can it
> deadlock? If not, why was LockPage() used at all? It seems like a bad
> sign that none of this is explained in the code.

I'm not sure the exact reason but I guess that because we keep holding
the lock for meta page during cleaning up pending list we would
eliminate the possibility of concurrent insertion into pending list if
we acquire the lock of meta page using LockBuffer() instead.This is
explained in ginInsertCleanup().

/*
* We would like to prevent concurrent cleanup process. For that we will
* lock metapage in exclusive mode using LockPage() call. Nobody other
* will use that lock for metapage, so we keep possibility of concurrent
* insertion into pending list
*/

So I conjecture that this has been introduced for not the reason why
we need to detect deadlock but the reason why we need to a different
lock from the lock used by insertion into pending list.

>
> My guess is that bugs in this area have caused data corruption (not
> just undetectable deadlock issues), which was reported and commenting
> on elsewhere [1]; maybe this proposed fix of yours could have
> prevented that. But we clearly still need to do a careful analysis of
> e2c79e14, because it seems like it probably has fundamental design
> problems (it's not *just* buggy).
>

Yeah, I agree. I'll manage to get the time to look carefully at the
GIN code related to fast insertion and cleanup pending list.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Re: [HACKERS] ginInsertCleanup called from vacuum could still miss tuples to be deleted at 2017-11-14 01:48:52 from Peter Geoghegan

Responses

Re: [HACKERS] ginInsertCleanup called from vacuum could still miss tuples to be deleted at 2017-11-14 04:08:42 from Peter Geoghegan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2017-11-14 03:03:41	Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Previous Message	Michael Paquier	2017-11-14 01:57:27	Re: [HACKERS] ginInsertCleanup called from vacuum could still miss tuples to be deleted