Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From: Aleksey Tsalolikhin <atsaloli(dot)tech(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?
Date: 2012-03-12 06:32:53
Message-ID: CA+jMWofbA5FfaRDf-ga7n6ihJGzPKp7zyPORSXFuwUiVZRCJ3g@mail.gmail.com
Lists: pgsql-general

Dear Scott,

When I pg_dump -t bigtablename on the Slony slave, the dump file is
212 GB in size.

I am unable to perform the same test on the master until I get a
maintenance window, which may not be for a few weeks, as it does
impact our production system when we dump from the master (the web app
gets noticeably slower).

I compare this 212 GB dump size (logical size) with the 550 GB
physical size reported on disk, which matches the "du -sh" output.

I remember now that I had a similar issue about a year and a half ago,
when we had a jump in database size, also with replication involved,
and one site became bigger than the other. I talked to Bruce M. about
it at a conference and he suggested looking at his site, where he has
an explanation of how Postgres stores data at a low level, such as the
structure of a page. Unfortunately I was unable to carve out the time
to drill into it then and just continued running with the larger
database size... so now this issue is coming back to haunt me, even
bigger than before.

The size had doubled earlier, and now it has nearly tripled.
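
If I can find a window, I would also like to see how much of that
space is live rows versus dead rows and free space. As far as I
understand, the contrib pgstattuple module can report that; a sketch,
assuming the module's SQL script from contrib is loaded on the 8.4
slave (and bearing in mind it scans the whole table, so it will be
slow at this size):

  -- pgstattuple walks the whole relation and reports live vs. dead
  -- tuple bytes and free space (table_len, tuple_len, dead_tuple_len,
  -- free_space, free_percent, ...).
  SELECT * FROM pgstattuple('bigtablename');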

I'm afraid the easiest (quickest) solution will be for me to destroy
the RAID 1E array and rebuild it as a RAID 5 array, which would give
me a bigger filesystem, buying me time to study up on what Bruce
suggested, or else to hire a professional Postgres consultant (if
$WORK ever coughs up the money).

Our resident Oracle DBA expert (since we don't have a Postgres one)
suggested I try truncating the table on the slave (with replication
down) and then restoring it from this pg_dump, just to see if the size
of the new table will be 200 GB or 500 GB. If 200, we're home free;
if 500, we need to continue to investigate.
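
If we go that route, I imagine the test on the slave would look
roughly like this (a sketch only; bigtablename is still the
placeholder, and the database name and dump file name below are made
up):

  -- With Slony replication stopped on the slave:
  TRUNCATE TABLE bigtablename;

  -- Reload the table from the 212 GB dump taken earlier, e.g. from
  -- the shell:  psql -d ourdb -f bigtablename.sql
  -- ("ourdb" and the file name are placeholders.)

  -- Then compare the rebuilt physical size against the ~550 GB we see now:
  SELECT pg_size_pretty(pg_total_relation_size('bigtablename'));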

In the meantime, I owe you the size of the bigtable from
production... if anybody has any other suggestions, I am all ears.
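
One more thing I plan to check, following Stuart's point (quoted
below) about long-running transactions keeping vacuum from reusing
space: whether our 3-hourly pg_dump of the whole database counts as
such a transaction on the slave. Something along these lines against
pg_stat_activity (8.4 column names) should show any open transactions
while a dump is running:

  -- Sessions with an open transaction, oldest first.
  SELECT procpid, usename, xact_start, current_query
    FROM pg_stat_activity
   WHERE xact_start IS NOT NULL
   ORDER BY xact_start;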

Yours very truly,
Aleksey

On 3/8/12, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> wrote:
> On Thu, Mar 8, 2012 at 1:10 PM, Aleksey Tsalolikhin
> <atsaloli(dot)tech(at)gmail(dot)com> wrote:
>>> On Thu, Mar 8, 2012 at 12:10 PM, Aleksey Tsalolikhin
>>> <atsaloli(dot)tech(at)gmail(dot)com> wrote:
>>>> On Wed, Mar 7, 2012 at 10:05 AM, Aleksey Tsalolikhin
>>>> <atsaloli(dot)tech(at)gmail(dot)com> wrote:
>>>>> We're replicating a PostgreSQL 8.4.x database using Slony1-1.2.x.
>>>>>
>>>>>
>>>>> My biggest table measures 154 GB on the origin, and 533 GB on
>>>>> the slave.
>>>>>
>>>>> Why is my slave bigger than my master? How can I compact it, please?
>>>>
>>>>
>>>> On Wed, Mar 7, 2012 at 11:54 PM, Stuart Bishop
>>>> <stuart(at)stuartbishop(dot)net> wrote back:
>>>>>
>>>>> Do you have a long running transaction on the slave? vacuum will not
>>>>> reuse space that was freed after the longest running transaction.
>>>>>
>>>>> You need to use the CLUSTER command to compact it, or VACUUM FULL
>>>>> followed by a REINDEX if you don't have enough disk space to run
>>>>> CLUSTER. And neither of these will do anything if the space is still
>>>>> live because some old transaction might still need to access the old
>>>>> tuples.
>>>>
>>>> Dear Stuart,
>>>>
>>>> We do not run any transactions on the slave, other than pg_dumping
>>>> the entire database every 3 hours. I don't have enough disk space
>>>> to CLUSTER the table; I ran VACUUM FULL yesterday, and I just fired
>>>> up a REINDEX TABLE.
>>>>
>>>> I'd love to get some insight into how much logical data I have
>>>> versus how much physical space it is taking up. Is there some admin
>>>> tool or command or query that will report that? For each table (and
>>>> index), I'd like to know how much data is in that object (logical
>>>> data size) and how much space it is taking up on disk (physical
>>>> data size).
>>
>>
>> On Thu, Mar 8, 2012 at 11:58 AM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
>> wrote:
>>> Do you do things like truncate on the master? Cause truncates don't
>>> get replicated in slony.
>>
>>
>> Dear Scott,
>>
>> No, we do not truncate this table on the master. We only add to it.
>>
>> The REINDEX completed and the table is still swollen.
>
> If you pg_dump -t tablename from each machine, are the backups about
> the same size?
>
