
Re: Question: pg_class attributes and race conditions ?

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)enterprisedb(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Question: pg_class attributes and race conditions ?
Date:	2007-03-16 16:26:56
Message-ID:	45FAC550.6020000@enterprisedb.com
Lists:	pgsql-hackers

Tom Lane wrote:
>
> In what context are you proposing to do that, and won't this
> high-strength lock in itself lead to deadlocks?
>
> The whole thing sounds exceedingly ugly anyway --- for example
> what happens if the backend doing the CREATE INDEX fails and
> is therefore unable to clear the flag again?
>

Let me state the problem and a vague solution I am thinking of.
I would appreciate comments and suggestions.

The major known issue left with HOT is support for
CREATE INDEX and CREATE INDEX CONCURRENTLY. The
problem is with HEAP_ONLY tuples in the heap which do not have index
entries in the existing indexes. When we build a new index, some or all
of the HEAP_ONLY tuples may need index entries in the new index.
It would be very ugly to keep the existing indexes
without index entries for those tuples. A clean solution
would be to add index entries for the HEAP_ONLY tuples in
the existing indexes and break all the HOT-chains.

I will leave out the details and instead explain what I have in
mind at a high level. Any help filling in the details, or any
suggestions for doing things differently, would help immensely.

This is what I have in mind:

In the context of CREATE INDEX [CONCURRENTLY],

We first disable HOT-updates on the table. This would ensure
that no new HOT tuples are added while we CHILL the heap.
(How do we do this ?)

We then start scanning the heap and start building the new
index. If a HEAP_ONLY tuple is found which needs to be
indexed, we mark the tuple with a CHILL_IN_PROGRESS flag
and insert index entries into all the existing indexes.
(The buffer is exclusively locked and the operation is WAL
logged).

We do this until the entire heap is scanned. At this point, we
would have inserted missing index entries for the HEAP_ONLY
tuples. Till this point, we don't use the direct index
entries to fetch the HEAP_ONLY tuples to avoid duplicate
fetches of the same tuple.

We now wait for all the concurrent index scans to end and
then disable HOT-chain following logic to fetch tuples.
(How do we do this ?)

At this point, all index scans would ONLY use the direct
path from the index to fetch tuples. The HOT-chains are
not followed to avoid duplicate fetches of the same tuple.

A second pass over the heap is now required to clear the
CHILL_IN_PROGRESS, HEAP_ONLY and HEAP_HOT_UPDATED flags.

At the end of this step, all the indexes and the table are
in sync. Once again we need to ensure that there are no
concurrent index scans in progress and then enable HOT-fetch.
Also, HOT-updates can be turned on.
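
As a rough illustration of the steps above, here is a toy Python model of the two passes. It is only a sketch and not PostgreSQL code: the `Table` and `HeapTuple` classes, the flag values, and the functions `index_scan`, `chill_pass_one` and `chill_pass_two` are all made up for the example, visibility rules and WAL logging are ignored, and the waits for concurrent index scans are reduced to comments.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Flag bits for this toy model only; the values are arbitrary.
HEAP_ONLY = 0x1
HEAP_HOT_UPDATED = 0x2
CHILL_IN_PROGRESS = 0x4  # hypothetical flag from the proposal


@dataclass
class HeapTuple:
    tid: int
    flags: int = 0
    next_tid: Optional[int] = None  # next member of the HOT chain, if any


@dataclass
class Table:
    heap: Dict[int, HeapTuple]
    indexes: Dict[str, Set[int]] = field(default_factory=dict)
    hot_updates_enabled: bool = True
    hot_fetch_enabled: bool = True


def index_scan(table, index_name):
    """Return the tids fetched through one index, in index order."""
    fetched = []
    for tid in sorted(table.indexes[index_name]):
        tup = table.heap[tid]
        if not table.hot_fetch_enabled:
            fetched.append(tid)  # direct path only, chains are not followed
            continue
        if tup.flags & HEAP_ONLY:
            continue  # ignore the direct entry; the chain reaches this tuple
        fetched.append(tup.tid)
        while tup.flags & HEAP_HOT_UPDATED:
            tup = table.heap[tup.next_tid]
            fetched.append(tup.tid)
    return fetched


def chill_pass_one(table, new_index):
    """Build the new index and add the missing direct index entries."""
    table.hot_updates_enabled = False  # no new HOT tuples while chilling
    table.indexes[new_index] = set()
    for tup in table.heap.values():
        table.indexes[new_index].add(tup.tid)
        if tup.flags & HEAP_ONLY:
            tup.flags |= CHILL_IN_PROGRESS
            for entries in table.indexes.values():
                entries.add(tup.tid)


def chill_pass_two(table):
    """Switch to direct fetches, clear the flags, then re-enable HOT."""
    # (A real implementation would first wait for concurrent index scans.)
    table.hot_fetch_enabled = False
    for tup in table.heap.values():
        tup.flags &= ~(CHILL_IN_PROGRESS | HEAP_ONLY | HEAP_HOT_UPDATED)
    # (Wait for concurrent index scans again before re-enabling HOT-fetch.)
    table.hot_fetch_enabled = True
    table.hot_updates_enabled = True
```

In this model an index scan returns every tuple exactly once in each phase: before the chill, after pass one with HOT-fetch still enabled, with HOT-fetch disabled, and after pass two. Avoiding duplicate fetches is what the ordering of the steps is meant to guarantee.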

If CREATE INDEX crashes, VACUUM is required to clear the
CHILL_IN_PROGRESS flags and remove the corresponding index
entries. Since VACUUM and CREATE INDEX are mutually exclusive,
we don't need any special mechanism to handle race
conditions between them.
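
The cleanup after a crash could look roughly like the following toy Python sketch. The function name `vacuum_undo_chill`, the dictionary layout and the flag values are assumptions for illustration only. It assumes the crash happened before any flags were cleared in the second pass, so a tuple marked CHILL_IN_PROGRESS is still HEAP_ONLY and still reachable through its HOT-chain.

```python
# Flag bits for this toy model only; the values are arbitrary.
HEAP_ONLY = 0x1
CHILL_IN_PROGRESS = 0x4  # hypothetical flag from the proposal


def vacuum_undo_chill(heap_flags, indexes):
    """Undo a half-finished chill.

    heap_flags maps tid -> flag bits, indexes maps index name -> set of tids.
    Returns the tids whose chill was rolled back.
    """
    undone = []
    for tid, flags in list(heap_flags.items()):
        if flags & CHILL_IN_PROGRESS:
            # Remove the direct index entries added by the failed chill.
            for entries in indexes.values():
                entries.discard(tid)
            # The tuple stays HEAP_ONLY; only the in-progress mark is cleared.
            heap_flags[tid] = flags & ~CHILL_IN_PROGRESS
            undone.append(tid)
    return undone
```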

There are some other details, such as running multiple CREATE
INDEX commands in parallel while still being able to CHILL the
table safely. Maybe one of them needs to act as the chiller
while the others wait for it to finish successfully.
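
One possible shape for the "single chiller" idea, sketched with Python threads standing in for backends. The `ChillCoordinator` class and its `ensure_chilled` method are hypothetical names for this example. A real implementation would need shared state that survives a backend crash, which this sketch does not attempt.

```python
import threading


class ChillCoordinator:
    """Let exactly one caller chill the table while the others wait."""

    def __init__(self):
        self._cond = threading.Condition()
        self._state = "idle"  # idle -> chilling -> done
        self.chill_runs = 0

    def ensure_chilled(self, chill_fn):
        """Return True if this caller acted as the chiller."""
        with self._cond:
            while self._state == "chilling":
                self._cond.wait()  # someone else is chilling; wait for it
            if self._state == "done":
                return False
            self._state = "chilling"  # this caller becomes the chiller
        try:
            chill_fn()
        except BaseException:
            # The chiller failed: let one of the waiters take over.
            with self._cond:
                self._state = "idle"
                self._cond.notify_all()
            raise
        with self._cond:
            self._state = "done"
            self.chill_runs += 1
            self._cond.notify_all()
        return True
```

If the chiller fails, the state goes back to idle so that a waiting caller can retry the chill instead of blocking forever.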

Any thoughts on the overall approach ? Any suggestions to
simplify things or any alternate designs ? Could something
as simple as CHILLing the table while holding a VACUUM FULL
strength lock be acceptable ?

Thanks,
Pavan

EnterpriseDB http://www.enterprisedb.com

In response to

Re: Question: pg_class attributes and race conditions ? at 2007-03-16 15:09:27 from Tom Lane

Responses

Re: Question: pg_class attributes and race conditions ? at 2007-03-16 16:40:13 from Tom Lane
Re: Question: pg_class attributes and race conditions ? at 2007-03-16 17:44:45 from Simon Riggs
