Re: INVALID index while concurrent indexing in progress?

From: Rajakavitha Kodhandapani <krajakavitha(at)gmail(dot)com>
To: Lauren Fliksteen <dancernerd32(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: INVALID index while concurrent indexing in progress?
Date: 2022-05-19 16:31:18
Message-ID: CABx-2BAHnpxPBzig7ucjYebjFeUYsiogSvT8CuRDgCFVBdGSKg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

Thank you, Laurenz. The changes that you suggested make a lot more sense.
I will make the updates and submit the changes.

Regards,
Rajie

On Thu, May 19, 2022 at 9:45 PM Lauren Fliksteen <dancernerd32(at)gmail(dot)com>
wrote:

> Thank you both! I think Laurenz’s changes make perfect sense!
>
> Sent from my iPhone
>
> > On May 19, 2022, at 8:37 AM, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
> wrote:
> >
> > On Thu, 2022-05-19 at 19:02 +0530, Rajakavitha Kodhandapani wrote:
> >>> I think the INVALID index can use further explanation, in particular,
> from
> >>> my experience it seems like when building an index concurrently, the
> index
> >>> gets inserted and labeled invalid while the index is being built, and
> then
> >>> the label gets removed if it finishes successfully or gets left on the
> index
> >>> if there is a failure while building the index. It is my current
> >>> understanding, after experimenting, that INVALID means 'incomplete',
> whether
> >>> that's because it's in progress or because it was unable to be
> completed,
> >>> but prior to my experiment my understanding was that INVALID indicated
> >>> failure.
> >>>
> >>> This was especially confusing when we were adding an index to a very
> large
> >>> table because we assumed the INVALID index indicated failure when we
> >>> couldn't find any other sign of progress or failure.
> >>
> >> This is my first attempt at contributing to the documentation of
> PostgreSQL.
> >> Here's the patch. Please let me know if any other changes need to be
> made.
> >
> > Thank you! Please send patches as plain text and use bottom posting.
> >
> >> \cf3 @@ -665,11 +665,14 @@\cf5 Indexes:\
> >> <para>\
> >> Another caveat when building a unique index concurrently is that
> the\
> >> uniqueness constraint is already being enforced against other
> transactions\
> >> \cf6 - when the second table scan begins. This means that
> constraint violations\cf5 \
> >> \cf4 + when the second table scan begins. This means that constraint
> violations\cf5 \
> >> could be reported in other queries prior to the index becoming
> available\
> >> \cf6 - for use, or even in cases where the index build eventually
> fails. Also,\cf5 \
> >> \cf6 - if a failure does occur in the second scan, the
> <quote>invalid</quote> index\cf5 \
> >> \cf6 - continues to enforce its uniqueness constraint
> afterwards.\cf5 \
> >> \cf4 + for use, or even in cases where the index build eventually
> fails. The index\cf5 \
> >> \cf4 + is inserted and labeled <quote>invalid</quote> while the
> index is being built,\cf5 \
> >> \cf4 + and then the label is removed if the index builds
> successfully. If the index does\cf5 \
> >> \cf4 + not build successfully, then the label <quote>invalid</quote>
> remains. Also, if a\cf5 \
> >> }
> >
> > I don't think that this information should be added to a paragraph that
> > focuses on uniqueness checks in concurrent index builds.
> >
> > Actually, most of the information is already there. To quote from the
> page:
> >
> > If a problem arises while scanning the table, such as a deadlock or a
> uniqueness violation
> > in a unique index, the CREATE INDEX command will fail but leave behind
> an “invalid” index.
> >
> > How about the following patch to emphasize the role of "invalid":
> >
> > diff --git a/doc/src/sgml/ref/create_index.sgml
> b/doc/src/sgml/ref/create_index.sgml
> > index d3102a87d9..fee2c61e5e 100644
> > --- a/doc/src/sgml/ref/create_index.sgml
> > +++ b/doc/src/sgml/ref/create_index.sgml
> > @@ -622,7 +622,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT
> EXISTS ] <replaceable class=
> > </para>
> >
> > <para>
> > - In a concurrent index build, the index is actually entered into
> > + In a concurrent index build, the index is actually entered as
> <quote>invalid</quote> index into
> > the system catalogs in one transaction, then two table scans occur in
> > two more transactions. Before each table scan, the index build must
> > wait for existing transactions that have modified the table to
> terminate.
> > @@ -631,7 +631,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT
> EXISTS ] <replaceable class=
> > scan to terminate, including transactions used by any phase of
> concurrent
> > index builds on other tables, if the indexes involved are partial or
> have
> > columns that are not simple column references.
> > - Then finally the index can be marked ready for use,
> > + Then finally the index can be marked <quote>valid</quote> and ready
> for use,
> > and the <command>CREATE INDEX</command> command terminates.
> > Even then, however, the index may not be immediately usable for
> queries:
> > in the worst case, it cannot be used as long as transactions exist
> that
> >
> > Yours,
> > Laurenz Albe
>

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Alvaro Herrera 2022-05-19 16:36:30 Re: INVALID index while concurrent indexing in progress?
Previous Message Lauren Fliksteen 2022-05-19 16:15:45 Re: INVALID index while concurrent indexing in progress?