Re: database is bigger after dump/restore - why? (60 GB to 109 GB)

From: Aleksey Tsalolikhin <atsaloli(dot)tech(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: database is bigger after dump/restore - why? (60 GB to 109 GB)
Date: 2011-03-07 22:45:00
Message-ID: AANLkTikmz=jex07jSMLToj861zHOOsJ4wAtWFTiSTJ=H@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Mar 4, 2011 at 7:19 PM, Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com> wrote:
> On Friday, March 04, 2011 5:11:04 pm Aleksey Tsalolikhin wrote:
>> On Fri, Mar 4, 2011 at 4:45 PM, Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com> wrote:
>> > On Friday, March 04, 2011 2:03:23 pm Aleksey Tsalolikhin wrote:
>> >> On Fri, Mar 4, 2011 at 7:53 AM, Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com>
> wrote:
>> >> >What is the data being stored in the table?
>> >>
>> >> For the main part, it's an XML file, we store it in the third field.
>> >> Our XML files are 13KB - 48 KB in length.
>> >
>> > And you are positive that field has not had its storage changed to
>> > something other than EXTENDED?
>>
>> Quite.  I just double checked on both servers at it is EXTENDED for
>> everything but
>> the timestamp (5th field) which is PLAIN (on both servers).
>>
>> >> From my observation, when the table was under 30 GB in size,
>> >> TOAST compression worked fine; when it got above 35 GB in size,
>> >> now TOAST compression is not working.
>> >
>> > More questions:)
>> > How do you know that?
>> > I thought the size problem only came to light when you tried to set up
>> > another server.
>> > Was there some sort of monitoring going on previous to setting up the new
>> > server?
>> > Anything else happen around that time?
>>
>> OK, I have to withdraw my "observation".  Let me refine it (which I am
>> afraid makes
>> it less useful):  I've done pg_dump/restore of the database earlier, when
>> it was smaller (< 30 GB) and did not notice such a doubling in size; I
>> don't think it occurred.  Certainly the last time we moved the database
>> from the DR back to primary site, it did not double in size from 1 GB (it
>> was around 1 GB then).
>>
>> Here is what I did see:  we've had Slony replication running for a
>> while (over half
>> a year) from Primary to DR; and file system utilization on Primary and
>> DR was about
>> the same.  (around 75%).
>>
>> Also, I had done some pg_dump's / pg_restore's from DR to Dev and Stage,
>> and the database size was about 1:1.
>>
>> But most recently, I shut down replication, and modified the cluster config
>> (added several tables and sequences; plus some tables were modified so
>> I wanted to get a clean start on replicating them).  I removed the slony
>> schemas and re-created a Slony replication set -- my filesystem was 75%
>> full on the master,
>> and it hit 100% on the slave!  So I lost my slave!
>>
>> Then I tried pg_dump/pg_restore and noticed the same thing, that one table
>> doubles in size.
>>
>> Last time I did a full Slony re-sync like this was around 30 GB.  Now
>> we're up to
>> 40-50 GB and hit the 1:2 factor.
>>
>> I can't think of anything else happening around this time...
>>
>> I'm going to try splitting the pg_dump file (40 GB) in half, and load the
>> 20 GB file, and see how much space it takes up in the database, I'm
>> curious if I can replicate the 1:2 swelling with this smaller table.
>>
>> Aleksey
>
>
> Going over the saga to date.
>
> 1)Slony replication running between production server and DR server for half a
> year or so.
>        Where the Postgres versions the same between servers?

Yes, 8.4.4

>
> 2) Replication shut down, cluster configuration modified
>        Assuming that the problem table was not one of the ones added correct?

Correct.

> 3)In your first email you mentioned upgrading the production server.
>        Was this a version upgrade?
>                Major or minor upgrade?
>                From what to what?
>        Otherwise what was the upgrade?

Hardware upgrade only. Posgres still 8.4.4.

>
> 4)Dump/restore to Dev and Stage seem to be alright.

No, the data doubles in size in the course of the restore.
To any/all of my environments.

> 5)The restore to DR server is showing size growth of 2x.
>        Is this a fresh instance of Postgres or is it the instance that was under
> Slony replication previously?

This is the instance that was under Slony replication previously.

Dev had been under Slony replication previously. Stage had not.

Experiment 1:
hypothesis: something about how large my table has grown is causing
the TOAST compression to fail on COPY.
test: pg_dump the big table, cut the dump file in half using
"/bin/split", add "\." at the end of the file, and load the top half.
result: database is 50 GB in size. hypothesis proven false.

Experiment 2:
hypothesis: something about Slony is causing the TOAST compression to
be disabled on COPY.
test: load the 50% dump file from experiment 1 above into our Stage
database, which was never touched by Slony.
result: database is 50 GB in size. hypothesis proven false.

Best,
Aleksey

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Merlin Moncure 2011-03-07 22:58:12 Re: Why count(*) doest use index?
Previous Message Glenn Maynard 2011-03-07 22:26:56 Re: Why count(*) doest use index?