CLUSTER, REINDEX and VACUUM on batch ops

From: François Beausoleil <francois(at)teksol(dot)info>
To: Forums postgresql <pgsql-general(at)postgresql(dot)org>
Subject: CLUSTER, REINDEX and VACUUM on batch ops
Date: 2013-04-24 15:49:58
Message-ID: 95EE6F4F-28FE-42A2-B388-6F27829192D4@teksol.info
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi all!

I have a partitioned table with millions of rows per weekly partition. I am adding new fields, with null values and no default values to ensure I had a reasonable migration time. All downstream code knows how to work with null fields.

Presently, I'm migrating each partition individually to add NOT NULL, set a default value and update the table to have correct values. Essentially, I'm doing this:

ALTER TABLE parent ADD COLUMN new_field int; -- adds the field to all child tables - runs quickly

-- the bulk of the data transfer
for each partition in partitions:
BEGIN;
UPDATE partition SET new_field = 0;
ALTER TABLE partition
ALTER COLUMN new_field SET NOT NULL
, ALTER COLUMN new_field SET DEFAULT 0;
COMMIT;

CLUSTER partition USING partition_pkey;
REINDEX TABLE partition;
VACUUM ANALYZE partition;
done

After I've clustered the table, must I reindex and vacuum as well? It is unclear to me if clustering a table reindexes or not: the docs at http://www.postgresql.org/docs/current/static/sql-cluster.html are silent on the matter, but do mention that an ANALYZE is in order.

Thanks!
François Beausoleil

Responses

Browse pgsql-general by date

  From Date Subject
Next Message François Beausoleil 2013-04-24 16:04:38 Re: Most efficient way to insert without duplicates
Previous Message Kirk Wythers 2013-04-24 15:14:42 Re: run COPY as user other than postgres