Re: CLUSTER, REINDEX and VACUUM on batch ops

From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: François Beausoleil <francois(at)teksol(dot)info>
Cc: Forums postgresql <pgsql-general(at)postgresql(dot)org>
Subject: Re: CLUSTER, REINDEX and VACUUM on batch ops
Date: 2013-04-25 03:47:37
Message-ID: CAL_0b1sgfRhhJNP7HJBNK301+Cs=A4G+gZY8-tz=_bu3_+cGCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Apr 24, 2013 at 8:49 AM, François Beausoleil
<francois(at)teksol(dot)info> wrote:
> Presently, I'm migrating each partition individually to add NOT NULL, set a default value and update the table to have correct values. Essentially, I'm doing this:
>
> ALTER TABLE parent ADD COLUMN new_field int; -- adds the field to all child tables - runs quickly
>
> -- the bulk of the data transfer
> for each partition in partitions:
> BEGIN;
> UPDATE partition SET new_field = 0;
> ALTER TABLE partition
> ALTER COLUMN new_field SET NOT NULL
> , ALTER COLUMN new_field SET DEFAULT 0;
> COMMIT;
>
> CLUSTER partition USING partition_pkey;
> REINDEX TABLE partition;
> VACUUM ANALYZE partition;
> done
>
> After I've clustered the table, must I reindex and vacuum as well? It is unclear to me if clustering a table reindexes or not: the docs at http://www.postgresql.org/docs/current/static/sql-cluster.html are silent on the matter, but do mention that an ANALYZE is in order.

CLUSTER does full table rewrite including all its indexes so REINDEX
is not required after it. It is mentioned in the docs implicitly:

<<
When an index scan is used, a temporary copy of the table is created
that contains the table data in the index order. Temporary copies of
each index on the table are created as well. Therefore, you need free
space on disk at least equal to the sum of the table size and the
index sizes.

When a sequential scan and sort is used, a temporary sort file is also
created, so that the peak temporary space requirement is as much as
double the table size, plus the index sizes.
>>

BTW, you do not need to do the ALTERs for each partition. What you
need to do is:

1. add the column to the parent,
2. set the default constraint on the column of the parent,
3. update the column in partitions to the value,
4. set the not null constraint on the parent.

It will be better from the point of view of inheritance as the new
column will be fully inherited from the parent rather then partially
overloaded in partitions.

--
Kind regards,
Sergey Konoplev
Database and Software Consultant

Profile: http://www.linkedin.com/in/grayhemp
Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
Skype: gray-hemp
Jabber: gray(dot)ru(at)gmail(dot)com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Sergey Konoplev 2013-04-25 03:55:53 Re: Confusing error message.
Previous Message Sergey Konoplev 2013-04-25 02:44:25 Re: Replication terminated due to PANIC