Re: pg data backup from vps

From: Rory Campbell-Lange <rory(at)campbell-lange(dot)net>
To: support-tiger <support(at)tigernassau(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg data backup from vps
Date: 2017-12-01 20:46:50
Message-ID: 20171201204650.ayigjsypuw3gbfem@campbell-lange.net
Lists: pgsql-general

On 01/12/17, support-tiger (support(at)tigernassau(dot)com) wrote:
> To diversify risk, we would like to have a daily or weekly data backup
> stored in another location besides the VPS service we are using - pg_dump is
> great for the backup but transferring a growing db across the internet to a
> local machine disk seems slow - how are others handling this with postgresql
> ?  Thks.

Speed is largely a function of size. If transferring the dump across
the internet is becoming a problem, you might want to look at different
ways of backing up. Here is one fairly straightforward strategy based on
both pg_dump and streaming replication:

1. pg_dump
   - use -Fc (custom format) for a compressed dump, or -Fd (directory
     format) together with -j to parallelize the dump; note that -j
     only works with the directory format
   - consider adding an audit schema and inserting a row into it just
     before the dump, to make it easy to check with pg_restore that the
     dump worked (you can grep the pg_restore output)
   - consider dumping twice a day and then rsyncing those files to
     nearby machines and offsite:
     'nearby machines' in case you have to restore quickly, since it
     can take a long time to get big files back into production
     'offsite' because the place where your database server lives might
     evaporate
   (a rough sketch of such a script is below)
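
As a rough, hypothetical sketch of the above (the database name "appdb",
the dump path, the backup hostnames and the audit.dump_marker table are
all made-up placeholders), a nightly dump script might look something
like this:

    #!/bin/sh
    # Assumes an audit schema created once beforehand, e.g.:
    #   CREATE SCHEMA audit;
    #   CREATE TABLE audit.dump_marker (stamp timestamptz DEFAULT now());
    set -e
    STAMP=$(date +%Y%m%d%H%M)
    DIR=/var/backups/pg/appdb_$STAMP

    # record a marker row just before dumping, so the dump can be checked
    psql -d appdb -c "INSERT INTO audit.dump_marker DEFAULT VALUES;"

    # directory-format dump; -j (parallel jobs) requires -Fd
    pg_dump -Fd -j 4 -f "$DIR" appdb

    # quick sanity check: the marker table shows up in the dump listing
    pg_restore -l "$DIR" | grep -q dump_marker

    # copy to a nearby machine and to an offsite host
    rsync -a "$DIR" backup-local:/srv/pg-dumps/
    rsync -a "$DIR" backup-offsite:/srv/pg-dumps/

A fuller check is to pg_restore the dump into a scratch database now and
again and look for the marker row, which also proves the restore path
actually works.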

2. streaming replication
   - keep streaming changes to other servers:
     https://www.postgresql.org/docs/current/static/warm-standby.html
   - again, you might want both local and remote standbys of the main
     database:
     'local' so that if other servers use the database they can switch
     over to a server in the same place
     'remote' so that you have reasonably up-to-date data on a remote
     server you can use if the place where your main database server
     lives evaporates
   - a nice trick is to delay WAL replay on one standby so it stays
     roughly an hour behind the main server, which can save you if
     someone has just done something really bad in production
   (a sketch of the standby settings is below)
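
To make the delayed-replay idea concrete: the relevant setting is
recovery_min_apply_delay, which goes in recovery.conf on the standby
(newer PostgreSQL releases move these recovery settings into
postgresql.conf). A minimal sketch, with the hostname, data directory
and replication user as made-up placeholders:

    # create the standby from the primary (run once on the standby host)
    pg_basebackup -h db-main -U replicator -D /var/lib/postgresql/data -R

    # recovery.conf on the standby; pg_basebackup -R writes the
    # primary_conninfo line for you
    standby_mode = 'on'
    primary_conninfo = 'host=db-main user=replicator'
    recovery_min_apply_delay = '1h'    # replay WAL an hour behind the primary

The delay only needs adding to the one standby you want to keep lagging
behind; the others can replay as fast as they receive WAL.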

These two approaches serve different purposes. A live, up-to-date copy
of your database maintained by streaming replication is generally the
most valuable thing to have if your main database server goes down, and
those extra servers are also great for read-only tests and - if you are
careful about taking them offline and promoting them - for upgrade
testing. But if a data bug or some other issue has been quietly
corrupting data over time, or you need to retrieve the state of
something from a month ago, you really might need dumps too.

I'm sure others will have much more sage advice, but that is a starter
for 10.

Rory
