From: | James Coleman <jtc331(at)gmail(dot)com> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [DOC] Document concurrent index builds waiting on each other |
Date: | 2019-09-29 02:08:21 |
Message-ID: | CAAaqYe9P95wxtoOaF0KozJsVMZ4Dxv8982x-uKAur8Umb1GkEg@mail.gmail.com |
Lists: | pgsql-hackers |
On Sat, Sep 28, 2019 at 9:56 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> On Sat, Sep 28, 2019 at 09:54:48PM -0400, James Coleman wrote:
> > On Sat, Sep 28, 2019 at 9:22 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> > >
> > > On 2019-Sep-28, Bruce Momjian wrote:
> > >
> > > > The CREATE INDEX docs already say:
> > > >
> > > > In a concurrent index build, the index is actually entered into
> > > > the system catalogs in one transaction, then two table scans occur in
> > > > two more transactions. Before each table scan, the index build must
> > > > wait for existing transactions that have modified the table to terminate.
> > > > After the second scan, the index build must wait for any transactions
> > > > --> that have a snapshot (see <xref linkend="mvcc"/>) predating the second
> > > > --> scan to terminate. Then finally the index can be marked ready for use,
> > > >
> > > > So, having multiple concurrent index scans is just a special case of
> > > > having to "wait for any transactions that have a snapshot", no? I am
> > > > not sure adding a doc mention of other index builds really is helpful.

While that may be technically true, as a co-worker of mine likes to
point out, being "technically correct" is the worst kind of correct.

Here's what I mean:

First, I believe the docs should aim to be as useful as possible even
to those with a more entry-level understanding of PostgreSQL. The fact
that the paragraph you cite links to the entire chapter on concurrency
control in Postgres demonstrates that there's some not-so-immediate
stuff here to consider. For one: is it obvious to all users that the
transaction held by CIC (or even that all transactions) has an open
snapshot?

Second, this is a difference from a regular CREATE INDEX, and we
already call out differences between CREATE INDEX CONCURRENTLY and
regular CREATE INDEX as caveats, as I point out below re: Alvaro's
comment.

Third, related to the above point, many DDL commands only block DDL
against the table being operated on. The fact that CIC is different
here is, in my opinion, a fairly surprising break from that pattern,
and as such likely to catch users off guard. I can attest that this
surprised at least one entire database team a while back :) including
many people who've been operating Postgres at a large scale for a long
time.

I believe caveats like this are worth calling out rather than
expecting users to understand the implementation details and work out
the implications on their own.

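To make the cross-table case concrete, here is a minimal sketch of the
scenario discussed in this thread. The tables x and y, the column col,
and the index name are hypothetical, and this is an illustration of the
behavior described above rather than a captured session:

```sql
-- Assumption: x and y are two unrelated tables in the same database.

-- Session 1: open a transaction that holds a snapshot and never touches y.
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM x;   -- the first query takes the transaction's snapshot
-- ... leave this transaction open ...

-- Session 2: build an index concurrently on the unrelated table y.
CREATE INDEX CONCURRENTLY y_col_idx ON y (col);
-- Per the docs quoted above, after its second scan the build waits for
-- transactions whose snapshot predates that scan, so it is expected to
-- sit here until session 1 commits or rolls back.

-- Session 3: check what session 2 is waiting on.
SELECT pid, wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE query LIKE 'CREATE INDEX CONCURRENTLY%';
```

Replacing session 1 with a second CREATE INDEX CONCURRENTLY on x gives
the case the patch is concerned with: a CIC on one table holding up a
CIC on another.
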
> > > I always thought that create index concurrently was prevented from
> > > running concurrently in a table by the ShareUpdateExclusive lock that's
> > > held during the operation.
> >
> > You mean multiple CICs on a single table at the same time? Yes, that
> > (unfortunately) isn't possible, but I'm concerned in the patch with
> > the fact that CIC on table X blocks CIC on table Y.
>
> I think any open transaction will block CIC, which is my point.

I read Alvaro as referring to the fact that the docs already call out
the following:

> Regular index builds permit other regular index builds on the same table to occur simultaneously, but only one concurrent index build can occur on a table at a time.

James