Re: Data duplication when moving datafiles from one server to another.

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Iñigo Martinez Lasala <imartinez(at)vectorsf(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Data duplication when moving datafiles from one server to another.
Date: 2010-12-20 20:36:43
Message-ID: AANLkTimRc9NsSCws6PzMyXrYJnSA=TysB=N2=tX9drM4@mail.gmail.com
Lists: pgsql-admin

On Mon, Dec 20, 2010 at 12:22 PM, Iñigo Martinez Lasala
<imartinez(at)vectorsf(dot)com> wrote:
> Good evening.
>
> Yesterday we experienced data duplication in several database tables
> after one sysadmin decided to test, in a production environment, an rsync
> script to migrate a database from one server to another.
> PostgreSQL (8.2) was running on the source server and the rsync script was
> launched from the second one. The second server had a one-day-old copy of
> the same database. The rsync script creates a replica of the data files on
> the destination server.
> Our sysadmin swears he didn't run the script in reverse (that is,
> from destination to source)... so my question is:
> How could this data duplication happen? Due to an rsync lock on a checkpoint
> segment or the transaction logs? Or did he mistake the source for the
> destination server?

I use this method all the time to back up big dbs from one machine to another:

(on remote machine):
/etc/init.d/postgresql stop

(on main server):
rsync -avl --delete /data/ backupserver:/data/ # takes forever
sudo /etc/init.d/postgresql stop
rsync -avl --delete /data/ backupserver:/data/ # fast, we're just catching up
sudo /etc/init.d/postgresql start

(on remote machine):
/etc/init.d/postgresql start

You can run such a script at 2 in the morning with only a few minutes of
downtime, without having to jump through a lot of hoops.
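
For illustration, the sequence above could be wrapped in one script along
these lines. This is only a sketch under assumptions: the host name, the
data path and the init script path are taken from the commands above, while
the DRY_RUN switch and the function names are made up for the example. It
passes the directory with a trailing slash rather than a /data/* glob,
because rsync's --delete only applies to directories it was asked to send
as a whole, and it uses -a alone since -a already implies -l.

```shell
#!/bin/sh
# Sketch of the two-pass copy: long pass while the source is up, short
# catch-up pass while it is down. All values below are assumptions.
REMOTE=${REMOTE:-backupserver}          # host name as used in the message
DATADIR=${DATADIR:-/data/}              # trailing slash: sync directory contents
INIT=${INIT:-/etc/init.d/postgresql}    # init script path from the message
DRY_RUN=${DRY_RUN:-1}                   # hypothetical switch: 1 = print only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

sync_datadir() {
    # The destination server must be down before anything is copied.
    run ssh "$REMOTE" "$INIT" stop || return 1

    # Slow pass with the source still running. rsync exit code 24 means
    # source files vanished mid-transfer, which is expected on a live
    # database, so it is not treated as fatal here.
    run rsync -a --delete "$DATADIR" "$REMOTE:$DATADIR"
    rc=$?
    [ "$rc" -eq 0 ] || [ "$rc" -eq 24 ] || return "$rc"

    # Downtime starts: stop the source, then do the fast catch-up pass.
    run sudo "$INIT" stop || return 1
    run rsync -a --delete "$DATADIR" "$REMOTE:$DATADIR"
    rc=$?

    # Restart the source even if the catch-up pass failed.
    run sudo "$INIT" start
    [ "$rc" -eq 0 ] || return "$rc"

    run ssh "$REMOTE" "$INIT" start
}
```

Calling sync_datadir with DRY_RUN left at 1 only prints the six commands in
order; setting DRY_RUN=0 would actually run them.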

So, can you show us what commands your sysadmin / DBA actually typed?
My guess is he left out the --delete, or left the database on the backup
server up and running during the rsync. It needs to be shut down for
this to work right.
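
One way to guard against the second mistake is a small check on the backup
server before any copy starts. The sketch below assumes /data is the
PostgreSQL data directory there, and relies on the postmaster.pid lock file
that a running server keeps in its data directory. The function names are
made up for the example, and a stale pid file left behind by a crash would
also trip the check, so treat it as a safety net and not as proof.

```shell
#!/bin/sh
# Return 0 if DIR contains a postmaster.pid lock file, i.e. it looks like
# the data directory of a running (or uncleanly stopped) PostgreSQL server.
datadir_in_use() {
    [ -f "$1/postmaster.pid" ]
}

# Hypothetical guard: refuse to continue while the lock file is present.
require_stopped() {
    if datadir_in_use "$1"; then
        echo "refusing to copy: $1/postmaster.pid exists" >&2
        return 1
    fi
}
```

For example, running "require_stopped /data || exit 1" on the backup server
before the first rsync would abort the copy while that server is still up.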
