Re: GLOBAL vs LOCAL temp tables

From: Kevin Brown <kevin(at)sysexperts(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: GLOBAL vs LOCAL temp tables
Date: 2003-04-17 06:53:22
Message-ID: 20030417065322.GH1833@filer
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera wrote:
> On Wed, Apr 16, 2003 at 06:42:36PM -0700, Kevin Brown wrote:
>
> > Hmm...could VACUUM FULL then take an additional option, REINDEX (not
> > sure if calling it that will be possible if it'll cause ambiguities in
> > the parser), so that you don't have to do REINDEX separately?
>
> Actually, the need to REINDEX should be pretty minimal. The command was
> originally invented to recover from corrupted indexes, but was used as
> a regular maintenance because of the ever-growing indexes problem. The
> problem is mostly gone now, and REINDEX is again only meant to the
> corrupted indexes scenario.

But as Tom pointed out, if you delete a bunch of data from a table
then insert a fresh set of data, but don't end up inserting much data
with roughly the same keys that were in the original batch of data,
you'll get a lot of empty areas in your index that are unused. VACUUM
marks them as being available for reuse, of course, but that doesn't
help you unless you insert data containing values that are appropriate
to the unused areas.

An example would be an index on a serial column, where a periodic
delete of old data happens. The values in the serial column are never
reused, so their locations in the index never get reused, either,
right? So the index on that serial column grows without bound. I
imagine that situation is actually rather common.

Unless that behavior has changed, you will need to reindex on a
periodic basis to keep your index from growing indefinitely, won't
you?

> VACUUM generally won't free disk space, but it will mark pages free so
> subsequent index growth will use them. Periodic VACUUM usage should be
> enough to keep indexes in control.

Even in situations as I describe above?

> > On a slightly different note, from what I've seen of the source,
> > REINDEX TABLE acquires an access exclusive lock on the table, thus
> > preventing concurrent SELECTs against the table while REINDEX TABLE is
> > running.
>
> Yeah, it's difficult to recreate an index keeping concurrency.

I don't quite understand this. I'm certainly not advocating allowing
writes while reindexing! The data in the table isn't changing at all
because you've acquired an exclusive lock on the table (which permits
reads but not writes) -- but not a shared exclusive lock (which would
prevent reads).

Why would you need to prevent reads of the data in the table or the
old index, aside from performance or corruption considerations (which
is an evaluation that belongs in the hands the DBA)? I can certainly
see the need to prevent reads (and thus acquire a shared exclusive
lock) once the new index is built and needs to be swapped in for the
old one, but not before then.

> I had originally meant to implement this (concurrent index rebuild),
> but got stuck in the freelist thing, and now concurrent index
> rebuild is not needed as much. Because of this I selected a
> different project that is probably more useful.

I can certainly understand that. :-)

I'm just wondering what would happen if you changed the code to
acquire a standard exclusive lock on the table instead of a shared
exclusive one, and added another line to acquire a shared exclusive
lock once the new index is built...

--
Kevin Brown kevin(at)sysexperts(dot)com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Oliver Elphick 2003-04-17 06:54:36 New flex
Previous Message Tom Lane 2003-04-17 03:39:31 Re: pg_clog woes with 7.3.2 - Episode 2