Re: Suggest note in index documentation about long running transactions

From: Chris Travers <chris(dot)travers(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Suggest note in index documentation about long running transactions
Date: 2016-02-16 17:24:16
Message-ID: CAKt_Zfu5P5UJ6Bm1ytsoz5bLy4Tq50iFUJMEMvy20VVk+dUiFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Feb 16, 2016 at 6:04 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I wrote:
> > Chris Travers <chris(dot)travers(at)gmail(dot)com> writes:
> >> Unless there is a sense that this is a bad idea I will submit a doc
> patch.
>
> > I was already working on it ... I think what we want is something along
> > this line in the "Building Indexes Concurrently" section of the CREATE
> > INDEX ref page:
>
> After further perusal of the code I propose replacing that para with this
> wording:
>
> <para>
> In a concurrent index build, the index is actually entered into
> the system catalogs in one transaction, then two table scans occur in
> two more transactions. Before each table scan, the index build must
> wait for existing transactions that have modified the table to
> terminate.
> After the second scan, the index build must wait for any transactions
> that have a snapshot (see <xref linkend="mvcc">) predating the second
> scan to terminate. Then finally the index can be marked ready for use,
> and the <command>CREATE INDEX</> command terminates.
> Even then, however, the index may not be immediately usable for
> queries:
> in the worst case, it cannot be used as long as transactions exist that
> predate the start of the index build.
> </para>
>
> This is a good deal clearer, IMO, about the conditions under which
> transactions block CREATE INDEX CONCURRENTLY; we need that because
> otherwise it's not going to make much sense to talk about old
> transactions still existing after the command completes.
>

Agreed.

>
> The reason for the "worst case" weasel-wording is that the problem you
> saw doesn't actually occur unless the index build detected some broken
> HOT chains. I do not want to get into explaining what those are here,
> so it seemed best to just be vague about whether there's a delay in
> index usability or not.
>

I think that is also a good optimization, documentation wise.

>
> (BTW, I wondered whether this wasn't just a bug and we should make things
> less confusing by having CREATE INDEX CONCURRENTLY not complete until
> the index is fully usable. However, it appears the reason we don't do
> that is it would create a risk of two CREATE INDEX CONCURRENTLY commands
> deadlocking, ie they'd each think they have to wait for the other one.)
>

I think even without the deadlocking that would be far worse than the
current behavior. The current behavior is a bit opaque when it happens
(and in this case I could certainly see HOT chains being a problem case on
this db as it is large, but also that this specific table has tremendous
turnover, and a few transactions which read from the table can be extremely
long running -- have shortened the longest running case from about 4 days
to about 18 hours -- don't ask).

If you do anything, raising a NOTICE that the index is deferred for
usability might be a good thing, but the problems with delaying exit go
well beyond deadlocks.

>
> regards, tom lane
>

--
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2016-02-16 18:18:28 Re: Transaction Rollback Error: DeadLock Detected
Previous Message Tom Lane 2016-02-16 17:04:23 Re: Suggest note in index documentation about long running transactions