Quick Links

CLUSTER, REINDEX and VACUUM on batch ops

From:	François Beausoleil <francois(at)teksol(dot)info>
To:	Forums postgresql <pgsql-general(at)postgresql(dot)org>
Subject:	CLUSTER, REINDEX and VACUUM on batch ops
Date:	2013-04-24 15:49:58
Message-ID:	95EE6F4F-28FE-42A2-B388-6F27829192D4@teksol.info
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Hi all!

I have a partitioned table with millions of rows per weekly partition. I am adding new fields, with null values and no default values to ensure I had a reasonable migration time. All downstream code knows how to work with null fields.

Presently, I'm migrating each partition individually to add NOT NULL, set a default value and update the table to have correct values. Essentially, I'm doing this:

ALTER TABLE parent ADD COLUMN new_field int; -- adds the field to all child tables - runs quickly

-- the bulk of the data transfer
for each partition in partitions:
BEGIN;
UPDATE partition SET new_field = 0;
ALTER TABLE partition
ALTER COLUMN new_field SET NOT NULL
, ALTER COLUMN new_field SET DEFAULT 0;
COMMIT;

CLUSTER partition USING partition_pkey;
REINDEX TABLE partition;
VACUUM ANALYZE partition;
done

After I've clustered the table, must I reindex and vacuum as well? It is unclear to me if clustering a table reindexes or not: the docs at http://www.postgresql.org/docs/current/static/sql-cluster.html are silent on the matter, but do mention that an ANALYZE is in order.

Thanks!
François Beausoleil

Responses

Re: CLUSTER, REINDEX and VACUUM on batch ops at 2013-04-25 03:47:37 from Sergey Konoplev

Browse pgsql-general by date

	From	Date	Subject
Next Message	François Beausoleil	2013-04-24 16:04:38	Re: Most efficient way to insert without duplicates
Previous Message	Kirk Wythers	2013-04-24 15:14:42	Re: run COPY as user other than postgres