backup-strategies for large databases

From: MirrorX <mirrorx(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: backup-strategies for large databases
Date: 2011-08-13 21:44:23
Message-ID: 1313271863687-4697145.post@n5.nabble.com
Lists: pgsql-general

Hello to all,

I am trying to find an "acceptable" backup strategy for one of our servers (currently 8.4.7, soon to be 8.4.8). I am familiar with both logical backups (dump/restore) and physical backups (pg_start_backup, WAL archiving, etc.) and have tried both in some other cases.
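For reference, the physical-backup approach mentioned above rests on continuous WAL archiving; a minimal postgresql.conf fragment for 8.4 might look like the following (the archive destination path is an assumption, not from my setup):

```
# postgresql.conf (8.4) -- /backup/wal is a hypothetical archive location
archive_mode = on
archive_command = 'test ! -f /backup/wal/%f && cp %p /backup/wal/%f'
```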

The issue here is that the server is heavily loaded. The daily traffic is heavy, which means the database grows every day (by about 30 GB on average) and is already pretty large (~2 TB). So the techniques mentioned above are not very suitable here. pg_dump, besides the fact that it would take about a month to complete, is also unacceptable because there would be no PITR with it.

At the moment, copying the PGDATA directory (excluding pg_xlog), compressing it, and storing it on a local storage disk takes about 60 hours, and the resulting file is about 550 GB. The WAL archives are kept in a different location, so that is not a problem. I don't even want to imagine how long the decompression and copy would take in a disaster scenario.
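As a sketch of the copy-and-compress step described above (all paths are hypothetical; a parallel compressor such as pigz could be substituted for gzip to shorten the 60-hour window):

```shell
# backup_pgdata: tar and compress a data directory, excluding pg_xlog,
# which is archived separately.
backup_pgdata() {
    src=$1
    out=$2
    tar -cf - -C "$(dirname "$src")" --exclude='pg_xlog' "$(basename "$src")" | gzip > "$out"
}

# Example (hypothetical paths), run between pg_start_backup() and pg_stop_backup():
# backup_pgdata /var/lib/pgsql/data /backup/base/pgdata.tar.gz
```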

Soon another server, a fail-over one, will be added there. But again, with this technique, shipping the PGDATA and the WALs to it does not seem very efficient.

Plus, we cannot just keep an older copy of PGDATA and replicate the WALs, because under this heavy load they amount to about 150 GB/day. So even if we assume we have unlimited disk storage, it is not reasonable to use 5 TB for the WALs (if PGDATA is sent once a month), and furthermore the second server would need a lot of time to recover, since it would have to process this huge amount of WAL.
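For what it's worth, the storage arithmetic can be checked quickly (figures taken from the numbers above; the result is roughly the 5 TB mentioned):

```shell
# WAL to retain between monthly base backups, at ~150 GB of WAL per day.
wal_gb_per_day=150
days_between_base_backups=30
total_gb=$((wal_gb_per_day * days_between_base_backups))
echo "${total_gb} GB of WAL to retain"   # prints "4500 GB of WAL to retain"
```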

So, in conclusion: given that the database is large, a very big amount of storage is at our disposal, and a new server will be up soon, what options are there to reduce the down-time in a disaster scenario to a minimum? Are (file-system level) solutions that keep the disks in sync, like DRBD, suitable, so that the disk of the second server stays in sync with the first? Even if that works, I would still like to have a third backup on the storage disks, so my question remains.
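For the DRBD idea, what keeps the disks in sync is a block-level resource defined identically on both hosts; an illustrative DRBD 8.x resource file might look like this (all hostnames, devices, and addresses are invented for the example):

```
# /etc/drbd.d/pgdata.res -- illustrative only
resource pgdata {
  protocol C;                    # synchronous: writes complete on both nodes
  on primary-host {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on standby-host {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```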

Thanks in advance for any suggestions, and sorry for the long post.

