Re: database is bigger after dump/restore - why? (60 GB to 109 GB)

From: tv(at)fuzzy(dot)cz
To: "Aleksey Tsalolikhin" <atsaloli(dot)tech(at)gmail(dot)com>
Cc: "pgsql-general" <pgsql-general(at)postgresql(dot)org>
Subject: Re: database is bigger after dump/restore - why? (60 GB to 109 GB)
Date: 2011-02-24 09:40:46
Message-ID: c4eb28f12793c69d2e9d19f42b9b632b.squirrel@sq.gransy.com
Lists: pgsql-general

Hi Aleksey,

I've read your previous post, and although I'm not quite sure what the
root cause is, I have some questions and (maybe wild) guesses ...

1) Are those two machines (primary and DR) exactly the same? I mean CPU,
filesystem (including blocksize etc.)?

2) What about the database encoding? I think that ASCII on the primary and
UTF-8 on the DR might have resulted in such a difference in size (but maybe
I'm wrong).
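
You can check this quickly on both clusters - it's a plain catalog query,
nothing needs to be installed:

  SELECT datname, pg_encoding_to_char(encoding) AS encoding
    FROM pg_database;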

3) How exactly did you measure the size of the database? At the filesystem
level (using 'du'), or from the pg_* views and functions?
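
Comparing 'du' with what PostgreSQL itself reports on both machines would
show whether the extra space is really in the table data or somewhere else
(WAL, temp files, ...). Something like this, for example:

  -- size of the current database as PostgreSQL sees it
  SELECT pg_size_pretty(pg_database_size(current_database()));

  -- the ten largest tables, including their indexes and TOAST
  SELECT relname,
         pg_size_pretty(pg_total_relation_size(oid)) AS total_size
    FROM pg_class
   WHERE relkind = 'r'
   ORDER BY pg_total_relation_size(oid) DESC
   LIMIT 10;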

4) The fact that the dump file is significantly smaller than the database
is not a big surprise - the reason is that many values are 'shorter' in the
plain-text dump than in their on-disk binary form (e.g. the 32-bit integer
0 is dumped as the single character '0', i.e. 1 B instead of 4 B).
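
You can see the on-disk width of a value with pg_column_size, e.g.:

  SELECT pg_column_size(0::int4);   -- returns 4 (bytes on disk)
  -- in a plain-text dump the same value is just the character '0', i.e. 1 byte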

5) Have you tried the pageinspect contrib module? It lets you see exactly
how the data items are laid out within a page, so I'd recommend this:

a) compare table sizes using the system catalogs and pick the one with
the greatest difference

b) use pageinspect to display details about one block of that table in
both databases and compare the results (example queries below)
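
A rough sketch of what I mean - 'yourtable' is just a placeholder for the
big table, and pageinspect has to be installed from contrib first:

  -- (a) per-table sizes from the catalogs (heap alone vs. heap + indexes + TOAST)
  SELECT relname,
         pg_size_pretty(pg_relation_size(oid))       AS heap_size,
         pg_size_pretty(pg_total_relation_size(oid)) AS total_size
    FROM pg_class
   WHERE relkind = 'r'
   ORDER BY pg_relation_size(oid) DESC;

  -- (b) layout of one heap page (block 0) of the suspect table
  SELECT * FROM page_header(get_raw_page('yourtable', 0));

  SELECT lp, lp_off, lp_len, t_hoff
    FROM heap_page_items(get_raw_page('yourtable', 0));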

Maybe this will help you to find out the real cause.

regards
Tomas

> I've dumped the big table from the original database (where it is 61 GB
> in size) and am restoring it into a test database to see what the size
> is after the restore.
>
> As it is now, our DR is offline because we did not expect the database
> to nearly double in size upon COPY of the data into a new database. We
> would like to understand what is going on, and we would like not to have
> such a swell of data upon transfer. Is there anything I can do, please?
>
> Best,
> Aleksey
>
> On Tue, Feb 22, 2011 at 12:44 PM, Aleksey Tsalolikhin
> <atsaloli(dot)tech(at)gmail(dot)com> wrote:
>> Hi.  Last week our 60 GB database (per psql \l+) was (re-)replicated to
>> the DR site using Slony-I, and arrived 109 GB in size, which caused a
>> problem as it filled up the filesystem on the DR server - we expected
>> the DR database to be the same size.  Mystery.
>>
>> Now, just this past weekend, we upgraded our production server via
>> pg_dump and pg_restore, and again the database is 109 GB in size!
>>
>> Most of our data is in a single table, which on the old server is 50 GB
>> in size and on the new server is 100 GB in size.
>>
>> Could you please help us understand why a COPY of the data into a new
>> database (whether DR or the new server) results in different disk usage?
>>
>> Somebody mentioned on the Slony users list that there is a kind of
>> padding that gets added which actually helps performance.
>>
>> Is there a way to track disk usage MINUS the padding?
>>
>> Thanks,
>> Aleksey
>>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
