Re: Help with data transfer please

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Sullivan <andrew(at)libertyrms(dot)info>
Cc: "'pgsql-general'" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Help with data transfer please
Date: 2002-06-11 02:00:27
Message-ID: 11128.1023760827@sss.pgh.pa.us
Lists: pgsql-general

Andrew Sullivan <andrew(at)libertyrms(dot)info> writes:
> On Mon, Jun 10, 2002 at 04:30:26PM -0400, terry(at)greatgulfhomes(dot)com wrote:
>> It is my understanding that copying the files that reside in the data
>> directory is not sufficient for a backup/restore (because there are file
>> pointers within the data files I believe, pointers which are no longer valid
>> on a different server because different blocks get assigned to different
>> files). Therefore it follows that an rsync would not be sufficient to keep
>> the database in sync.

> Hmm. That sounds plausible. But you can tar up a data directory,
> and restore it later on a different physical device, and it will
> work. Or, at least, it has worked for me. This, only if the
> postmaster is shut down. But maybe my case was just a fluke.

Andrew is correct: you can tar a complete $PGDATA tree and move it to
another machine (of the same architecture anyway --- differences in
endianness, datatype alignment requirements, or floating-point
representation can still mess you up).
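
For illustration, a minimal Python sketch of that kind of whole-tree copy:
it archives the complete $PGDATA tree into a single tarball.  The paths are
made-up examples, and it assumes the postmaster has already been shut down
(see below).

    # Sketch only: archive the *complete* $PGDATA tree (base/, pg_clog/,
    # pg_xlog/, and the rest) as one unit; the postmaster must be stopped.
    import tarfile

    PGDATA = "/var/lib/pgsql/data"          # hypothetical cluster location

    with tarfile.open("/tmp/pgdata.tar.gz", "w:gz") as tar:
        tar.add(PGDATA, arcname="data")     # one archive, whole tree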

What people keep trying to do that *does not* work is to copy only a
subset of the $PGDATA tree, such as $PGDATA/base/somedb/. This does not
work because only part of the system state is in the data files --- the
rest is in the transaction status log (pg_clog/ directory in 7.2, pg_log
table in older releases). You can't get away with copying a data file
into another installation that has different contents of pg_clog,
because the transaction numbers will be out of sync.

Ordinarily you need to copy the pg_xlog/ contents that go with the
data files too. For data transfer purposes (not disaster recovery)
you could omit that subdirectory in favor of running pg_resetxlog
at the destination; but I doubt it's worth the trouble.
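
If you did go the pg_resetxlog route, a hedged sketch of the destination step
might look like the following; the path is hypothetical and the plain
"pg_resetxlog <datadir>" form is assumed (options differ across releases, so
check the documentation for your version first).

    # Sketch only: after copying $PGDATA *without* pg_xlog/ for a data
    # transfer (not disaster recovery), build a fresh transaction log
    # at the destination.
    import subprocess

    DEST_PGDATA = "/var/lib/pgsql/data"     # hypothetical destination cluster

    subprocess.run(["pg_resetxlog", DEST_PGDATA], check=True)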

And yes, you'd better have the postmaster shut down, because otherwise
tar is likely to collect pg_clog entries that are out of step with the
data files --- not to mention the prospect that there are unwritten
modifications still in memory.

As far as the original question goes --- I suppose you could shut down
the postmaster and run rsync to keep a remote backup copy up-to-date,
as long as the backup was just a cold-storage duplicate. Don't try to
start a postmaster in it till you want to make it your primary (else
transaction numbers will get out of sync).
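
A rough sketch of that cycle, with hypothetical paths and hostname; the
pg_ctl and rsync invocations are examples rather than a recommended script:

    # Sketch only: refresh a cold-storage duplicate.  Stop the postmaster,
    # mirror the whole $PGDATA tree, then restart the local server.
    import subprocess

    PGDATA = "/var/lib/pgsql/data"                      # hypothetical
    MIRROR = "backuphost:/var/lib/pgsql/data-mirror/"   # hypothetical

    subprocess.run(["pg_ctl", "stop", "-D", PGDATA, "-m", "fast"], check=True)
    try:
        # -a preserves modes and times; --delete keeps the mirror exact
        subprocess.run(["rsync", "-a", "--delete", PGDATA + "/", MIRROR],
                       check=True)
    finally:
        subprocess.run(["pg_ctl", "start", "-D", PGDATA], check=True)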

regards, tom lane
