From: Andy Colson <andy(at)squeakycode(dot)net>
To: isabella(dot)ghiurea(at)nrc-cnrc(dot)gc(dot)ca
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: PG backup performance
Date: 2010-05-31 20:57:33
Message-ID: 4C0422BD.1010306@squeakycode.net
Lists: pgsql-general
On 05/31/2010 02:45 PM, Isabella Ghiurea wrote:
>
> Hi Andy,
> Thank you, please see below my answers:
> Andy Colson wrote:
>>
>> On 05/31/2010 11:05 AM, Isabella Ghiurea wrote:
>> > Hello PG list,
>> > I'm looking for some tips and advice to improve PG backup performance.
>> > Presently I run pg_dumpall with the compressed option on a RAID 0
>> > array, getting approx 14 GB of writes in 45 min, and I'm backing up
>> > an approx 200 GB database cluster daily.
>> > How can I improve this performance with the present hardware and PG
>> > version 8.3.6? Can I run parallel backups in PG?
>> > Thank you
>> > Isabella
>> >
>>
>> Short answer, yes, you can.
>> Long answer, we need more info.
>>
>> We need to know what the slow part is.
>>
>> Are you CPU bound
>>
> CPU
>>
>> or IO bound?
>> Are you backing up over a network?
>>
> No, on local disks
>>
>> How many tables? (well, big tables... how many really big tables).
>>
> Around 20 big tables, all of them in one separate schema on a separate
> tablespace from the rest of the schemas. I have already started backing
> up individual schemas.
> My big concern is that IF I have to recover from backups it will take
> at least twice as much time, around 1-2 days I expect.
>>
>> How many cores/cpu's do you have?
>>
> A server with 4 quad-core CPUs. How can I make PG use multithreading?
>>
>>
>> I bet all 200GB hardly changes, are you sure you need to back it all
>> up over and over again? Have you thought of replication?
>>
> Yes, but I'm waiting for a PG version with more robust built-in
> replication, aka PG 9.1.
> Isabella
>
>
cool.. my second astronomer :-)
You are CPU bound because pg_dump uses one CPU, and if you let pg_dump do the compression, it uses that same CPU. But if you use a pipe (pg_dump | gzip) then each gets its own CPU. And instead of using pg_dumpall, which dumps one database at a time, use pg_dump and run several in parallel.
try something like:
pg_dump big1 | gzip > big1.sql.gz & pg_dump big2 | gzip > big2.sql.gz
This will dump two databases at a time using 4 CPUs (one per pg_dump and one per gzip).
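To scale that beyond two databases, here's a minimal sketch; the database names big1..big4 are placeholders for your own, the trailing & backgrounds each pipeline, and wait blocks until all of them finish:

#!/bin/sh
# Dump several databases in parallel; each pg_dump | gzip
# pipeline gets its own pair of CPUs.
for db in big1 big2 big3 big4; do
    pg_dump "$db" | gzip > "$db.sql.gz" &
done
wait   # don't exit until every background dump has completed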
Having a backup file for each big table will also make restores faster, because you can restore them all at the same time, whereas pg_dumpall will restore one at a time.
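The restore side, sketched under the same assumptions (per-database dumps named big1.sql.gz and so on, with the empty target databases already created via createdb):

#!/bin/sh
# Restore several databases in parallel; gunzip and psql each get a CPU.
for db in big1 big2 big3 big4; do
    gunzip -c "$db.sql.gz" | psql "$db" &
done
wait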
The compression is probably going to take the most time... if your data is not very compressible then you'll get a big speed boost by dropping the compression.
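A quick way to check whether compression really is your bottleneck: time a dump with and without gzip, throwing the output away so disk speed doesn't muddy the numbers (big1 again being a placeholder name):

time pg_dump big1 > /dev/null
time pg_dump big1 | gzip > /dev/null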
-Andy