Re: Backup taking long time !!!

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: julyanto SUTANDANG <julyanto(at)equnix(dot)co(dot)id>
Cc: Dinesh Chandra 12108 <Dinesh(dot)Chandra(at)cyient(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Backup taking long time !!!
Date: 2017-01-22 16:37:48
Message-ID: 20170122163748.GG18360@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Greetings,

* julyanto SUTANDANG (julyanto(at)equnix(dot)co(dot)id) wrote:
> Please elaborate more of what you are saying. What i am saying is based on
> the Official Docs, Forum and our own test. This is what we had to do to
> save time, both backing up and restoring.
>
> https://www.postgresql.org/docs/9.6/static/functions-admin.html
>
> When PostgreSQL in the mode of Start Backup, PostgreSQL only writes to the
> XLOG, then you can safely rsync / copy the base data (snapshot) then later
> you can have full copy of snapshot backup data.

You are confusing two things.

After calling pg_start_backup, you can safely copy the contents of the
data directory, that is correct.

However, PostgreSQL *will* continue to write to the data directory.
That, however, is ok, because those changes will *also* be written into
the WAL and, after calling pg_start_backup(), you collect all of the
WAL using archive_command or pg_receivexlog. With all of the WAL
which was created during the backup, PG will be able to recover from the
changes made during the backup to the data directory, but you *must*
have all of that WAL, or the backup will be inconsistent because of
those changes that were made to the data directory after
pg_start_backup() was called.

In other words, if you aren't using pg_receivexlog or archive_command,
your backups are invalid.

> if you wanted to backup in later day, you can use rsync then it will copy
> faster because rsync only copy the difference, rather than copy all the
> data.

This is *also* incorrect. rsync, by itself, is *not* safe to use for
doing that kind of incremental backup, unless you enable checksums. The
reason for this is that rsync has only a 1-second level granularity and
it is possible (unlikely, though it has been demonstrated) to miss
changes made to a file within that 1-second window.

> my latter explanation is: use pg_basebackup, it will do it automatically
> for you.

Yes, if you are unsure about how to perform a safe backup properly,
using pg_basebackup or one of the existing backup tools is, by far, the
best approach. Attempting to roll your own backup system based on rsync
is not something I am comfortable recommending any more because it is
*not* simple to do correctly.

Thanks!

Stephen

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message julyanto SUTANDANG 2017-01-22 17:03:01 Re: Backup taking long time !!!
Previous Message julyanto SUTANDANG 2017-01-22 16:04:13 Re: Backup taking long time !!!