hot backups: am I doing it wrong, or do we have a problem with pg_clog?

From: Daniel Farina <daniel(at)heroku(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: hot backups: am I doing it wrong, or do we have a problem with pg_clog?
Date: 2011-04-21 11:15:48
Message-ID: BANLkTi=j-k3QOFKpjxUG5m0FtihANz3tOw@mail.gmail.com
Lists: pgsql-hackers

To start at the end of this story: "DETAIL: Could not read from file
"pg_clog/007D" at offset 65536: Success."

This is a message we received on a standby that we were bringing
online as part of a test. The clog file was present, but apparently
too small for Postgres (or at least that is what I think the message
meant), so one could stub in another clog file and then continue
recovery successfully (modulo the voodoo of stubbing in clog files in
general).
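For concreteness, here is the kind of stub I mean, as a minimal
Python sketch (not what I actually ran): it assumes the default
segment size of 32 pages of 8192 bytes each, and the zero-fill is the
voodoo, since a zeroed status byte reads back as "transaction in
progress":

    import os

    SEGMENT_BYTES = 32 * 8192  # default build: 32 pages of BLCKSZ each

    # Extend an undersized pg_clog segment with zeros to a full segment.
    # Only tolerable when the missing transaction statuses cannot matter
    # to the recovery at hand.
    def stub_clog_segment(path):
        existing = os.path.getsize(path) if os.path.exists(path) else 0
        with open(path, 'ab') as f:
            f.write(b'\0' * (SEGMENT_BYTES - existing))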
I am unsure whether this is due to an interesting race condition in
Postgres or a result of my somewhat-interesting hot-backup protocol,
which is slightly more involved than the norm. I will describe what
it does here, with a sketch of the tar writing after the list:

1) Call pg_start_backup()
2) Crawl the entire Postgres cluster directory structure, except
pg_xlog, noting the size of every file present
3) Begin writing tar files, but *only up to the size noted during the
original crawl of the cluster directory*, so that if a file grows
between the original snapshot and the subsequent read() those extra
bytes are not added to the tar
3a) If a file has been partially truncated, pad the tarfile member
with "\0" bytes up to the size sampled in step 2, since I am
streaming the tar file and cannot go back in the stream to adjust the
member's size
4) Call pg_stop_backup()
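To make steps 2, 3, and 3a concrete, here is a minimal Python sketch
of the capped writing (not my actual code; it reads whole files into
memory for brevity where the real thing streams in chunks):

    import io
    import os
    import tarfile

    # Step 2: record every file's size up-front, skipping pg_xlog.
    def snapshot_sizes(cluster_dir):
        sizes = {}
        for root, dirs, files in os.walk(cluster_dir):
            dirs[:] = [d for d in dirs if d != 'pg_xlog']  # prune pg_xlog
            for name in files:
                path = os.path.join(root, name)
                sizes[path] = os.path.getsize(path)
        return sizes

    # Steps 3/3a: emit exactly snapped_size bytes for path. Concurrent
    # growth is truncated; concurrent truncation is zero-padded, so the
    # size in the already-streamed tar header stays honest.
    def add_capped_member(tar, cluster_dir, path, snapped_size):
        info = tarfile.TarInfo(name=os.path.relpath(path, cluster_dir))
        info.size = snapped_size
        with open(path, 'rb') as f:
            data = f.read(snapped_size)              # cap growth
        data += b'\0' * (snapped_size - len(data))   # pad truncation
        tar.addfile(info, io.BytesIO(data))

Each archive is opened in streaming mode, e.g.
tarfile.open(fileobj=dest, mode='w|'), which is exactly why each
member's size must be fixed before its bytes are written.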

The reason I go to this trouble is that I use many completely
disjoint tar files to do parallel compression, decompression,
uploading, and downloading of the base backup of the database, and I
want to be able to control the size of these files up-front. The
stubbing-in of \0 is required by a limitation of the tar format when
streaming archives, and truncating files to the size snapshotted in
step 2 is what allows splitting the files between volumes even in the
presence of concurrent growth while I'm performing the hot backup
(for example, a handful of nearly-empty heap files can grow rapidly
under a concurrent bulk load if I get unlucky, and I do not intend to
leave that to luck). A sketch of the volume packing follows.
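The up-front sizes are what make that volume planning possible at
all; a hypothetical first-fit packer along these lines, where the
1 GiB limit is purely illustrative:

    # Pack (path, size) pairs into volumes whose uncompressed payload
    # stays under volume_limit. Because every member is later written
    # at exactly its snapshotted size, the plan survives concurrent
    # growth during the backup.
    def plan_volumes(sizes, volume_limit=1 << 30):
        volumes, current, running = [], [], 0
        for path, size in sorted(sizes.items()):
            if current and running + size > volume_limit:
                volumes.append(current)
                current, running = [], 0
            current.append((path, size))
            running += size
        if current:
            volumes.append(current)
        return volumes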

Any ideas? Or does it sound like I'm making some bookkeeping errors
and should review my code again? It does work most of the time; I
have not yet gotten a sense of how often this reproduces.

--
fdr
