From: | Gregory Stark <stark(at)enterprisedb(dot)com> |
---|---|
To: | "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com> |
Cc: | "Bill Moran" <wmoran(at)collaborativefusion(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Gauthier\, Dave" <dave(dot)gauthier(at)intel(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: "Healing" a table after massive updates |
Date: | 2008-09-13 14:09:27 |
Message-ID: | 87sks4qgrs.fsf@oxford.xeocode.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
"Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com> writes:
> On Thu, Sep 11, 2008 at 8:56 AM, Bill Moran
> <wmoran(at)collaborativefusion(dot)com> wrote:
>> In response to Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
>>
>>> Bill Moran wrote:
>>> > In response to "Gauthier, Dave" <dave(dot)gauthier(at)intel(dot)com>:
>>> >
>>> > > I might be able to answer my own question...
>>> > >
>>> > > vacuum FULL (analyze is optional)
>>> >
>>> > CLUSTER _may_ be a better choice, but carefully read the docs regarding
>>> > it's drawbacks first. You may want to do some benchmarks to see if it's
>>> > really needed before you commit to it as a scheduled operation.
>>>
>>> What drawbacks?
>>
>> There's the whole "there will be two copies of the table on-disk" thing
>> that could be an issue if it's a large table.
>
> I've also found cluster to be pretty slow, even on 8.3. On a server
> that hits 30-40Megs a second write speed for random access during
> pgbench, it's writing out at 1 to 2 megabytes a second when it runs,
> and takes the better part of a day on our biggest table. vacuumdb -fz
> + reindexdb ran in about 6 hours which means we could fit it into our
> maintenance window. vacuum moves a lot more data per second than
> cluster.
Alternative you can do
ALTER TABLE tab ALTER col TYPE <sametype> USING col;
which will rewrite the table without using an index. This is much faster but
has the same other disadvantages.
Incidentally the visibility bugs are indeed entirely fixed in 8.3. In 8.2 and
before cluster and alter table rewrites can both cause tuples to not appear
for transactions which were started before the cluster or alter table such as
a long-running pg_dump.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
From | Date | Subject | |
---|---|---|---|
Next Message | Ben Chobot | 2008-09-13 15:44:46 | detecting recurring appointment conflicts |
Previous Message | Martin Gainty | 2008-09-13 13:44:31 | Re: Getting rows in a very specific order |