Re: reclaiming diskspace bloat w/near-zero downtime

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: reclaiming diskspace bloat w/near-zero downtime
Date: 2004-12-03 14:40:20
Message-ID: 41B07AD4.1000903@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Ed L. wrote:

>I need to reclaim the diskspace from a heavily updated 7.3.4 table which has
>grown 95% bloated to 20gb of disk (indices and toast included), and I need
>to reclaim it while the table continues to get updates, and without
>interrupting/delaying access more than a few seconds. This is an
>explanation of what I've tried so far along with some observations and a
>request for other ideas to try.
>
>Some constraints on my efforts: Stopping the cluster for any reason is
>pretty undesirable, as is vacuum full. Upgrading to 7.4.6 via slony is an
>option, but I'm curious if there is an easier way.
>
>

You need to run a vacuum. If you are lucky a normal vacuum should
suffice and you will not take down the machine.

Sincerely,

Joshua D. Drake

>I assume this bloat is basically due to the garbage generation outpacing the
>garbage collection. The bloat does not appear to be in indices, so the
>index bloat issue does not seem to be in play much here. The table gets
>about 5% INSERTs, 95% UPDATEs, and 0% DELETEs, and UPDATEs are not changing
>indexed values, so maybe that makes sense that indices are not bloated.
>The UPDATEs to the table consist of appending text to a text colum (UPDATE
>foo SET value = 'new text' || value). I had max_fsm_pages at 4M (for upto
>32GB of disk?) and 8K (8000 tables, etc), both far more than I needed
>initially, but it still bloated. [SIDEBAR: Vacuum + fsm is not working as
>I expected; it is clearly not reclaiming space resulting from the UPDATEs.
>If I UPDATE 10000 rows and then run vacuum, I was expecting/hoping that a
>subsequent UPDATE of 9000 rows would largely reuse the space reclaimed from
>the 10000-row UPDATE.]
>
>Anyway, I need to reclaim the space without stopping the cluster and without
>blocking access to the table for more than a few seconds. I seem to have
>found a way to do that for any table ('foo') that doesn't have fkey/func
>dependencies:
>
> create table foo_slim as exact schema of foo
> begin
> alter table foo rename foo_bloated
> create view foo as select from foo_slim union all foo_bloated
> create rules:
> on insert to foo: insert into foo_slim
> on delete to foo: delete from foo_bloated or foo_slim
> on update to foo: if in foo_bloated,
> insert new values into foo_slim
> delete from foo_bloated
> commit;
>
>Once that's done, then the data begins to trickle into the new, slim table,
>and the xfer can be speeded up by doing no-op updates on the foo view or a
>select-for-update function on the foo_bloated table. Once all the data has
>migrated over to foo_slim, the initial state can be restored with:
>
> begin
> drop view foo
> alter table foo_slim rename to foo
> commit;
>
>and then foo_bloated can be dropped. That seems to work well enough.
>
>But how to do it on a table with foreign keys and triggers, etc? I was
>wondering if I could use the same basic approach but manually reconstitute
>the oid linkages so that the triggers and functions stayed intact even
>while renaming/dropping/etc, but sounds a little dicey. Any other ideas?
>
>Thanks,
>Ed
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 9: the planner will ignore your desire to choose an index scan if your
> joining column's datatypes do not match
>
>

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd(at)commandprompt(dot)com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL

Attachment Content-Type Size
jd.vcf text/x-vcard 285 bytes

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2004-12-03 14:41:04 Re: table inheritance and DB design
Previous Message Ian Harding 2004-12-03 14:34:06 Re: relation does not exist error