estimating the need for VACUUM FULL and REINDEX

From: Guillaume Cottenceau <gc(at)mnc(dot)ch>
To: pgsql-performance(at)postgresql(dot)org
Subject: estimating the need for VACUUM FULL and REINDEX
Date: 2007-05-08 12:08:24
Message-ID: 87odkvmsav.fsf@meuh.mnc.lan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I'm trying to come up with a way to estimate the need for a
VACUUM FULL and/or a REINDEX on some tables.

According to documentation[1], VACUUM FULL's only benefit is
returning unused disk space to the operating system; am I correct
in assuming there's also the benefit of optimizing the
performance of scans, because rows are physically compacted on
the disk?

With that in mind, I've tried to estimate how much benefit would
be brought by running VACUUM FULL, with the output of VACUUM
VERBOSE. However, it seems that for example the "removable rows"
reported by each VACUUM VERBOSE run is actually reused by VACUUM,
so is not what I'm looking for.

Then according to documentation[2], REINDEX has some benefit when
all but a few index keys on a page have been deleted, because the
page remains allocated (thus, I assume it improves index scan
performance, am I correct?). However, again I'm unable to
estimate the expected benefit. With a slightly modified version
of a query found in documentation[3] to see the pages used by a
relation[4], I'm able to see that the index data from a given
table...

relname | relpages | reltuples
------------------------+----------+-----------
idx_sessions_owner_key | 38 | 2166
pk_sessions | 25 | 2166

...is duly optimized after a REINDEX:

relname | relpages | reltuples
------------------------+----------+-----------
idx_sessions_owner_key | 13 | 2166
pk_sessions | 7 | 2166

but what I'd need is really these 38-13 and 25-7 figures (or
estimates) prior to running REINDEX.

Thanks for any insight.

Ref:
[1] http://www.postgresql.org/docs/8.2/interactive/routine-vacuuming.html

[2] http://www.postgresql.org/docs/8.2/interactive/routine-reindex.html

[3] http://www.postgresql.org/docs/8.2/interactive/disk-usage.html

[4] SELECT c2.relname, c2.relpages, c2.reltuples
FROM pg_class c, pg_class c2, pg_index i
WHERE c.relname = 'sessions'
AND c.oid = i.indrelid
AND c2.oid = i.indexrelid
ORDER BY c2.relname;

--
Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Adam Tauno Williams 2007-05-08 12:25:25 Re: [OT] Best OS for Postgres 8.2
Previous Message Luke Lonergan 2007-05-08 12:01:25 Re: Best OS for Postgres 8.2