pg_basebackup: ERROR: could not find any WAL files (9.3)

From: Lonni J Friedman <netllama(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: pg_basebackup: ERROR: could not find any WAL files (9.3)
Date: 2013-09-26 16:04:40
Message-ID: CAP=oouEtrbH8myOQyf+e879g4BBOewPH+KqKjabr6-V8ZStrOg@mail.gmail.com
Lists: pgsql-general

Greetings,
I've recently pushed a new postgres-9.3 (Linux-x86_64/RHEL6) cluster
into production, with one master and two hot standby streaming
replication slaves. Everything seems to be working OK; however,
roughly half of my pg_basebackup attempts fail at the very end
with the error:

pg_basebackup: could not get transaction log end position from server:
ERROR: could not find any WAL files

I should note that I'm running pg_basebackup on one of the two slaves,
and not the master. However, I've got an older, separate 9.3 cluster
with the same setup, and pg_basebackup never fails there.

I thought that the WAL files in question were coming from the pg_xlog
subdirectory, but I don't see any shortage of files there on the server
running pg_basebackup. They are being generated continuously (as
expected) before, during and after the pg_basebackup run. I scanned the
source ( http://doxygen.postgresql.org/basebackup_8c_source.html ),
and it seems to back up my understanding of the expected behavior:

    /*
     * There must be at least one xlog file in the pg_xlog directory,
     * since we are doing backup-including-xlog.
     */
    if (nWalFiles < 1)
        ereport(ERROR,
                (errmsg("could not find any WAL files")));
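
The excerpt shows only the final check, not how nWalFiles is computed. A
minimal sketch of why that matters, assuming (this is not shown in the
excerpt) that the count covers only segments between the backup's start
and end positions rather than every file in pg_xlog. The function name
count_wal_files and its first/last parameters are hypothetical and do
not come from the PostgreSQL source:

```python
import re

# A WAL segment file name is 24 uppercase hex digits:
# 8 for the timeline, 8 for the log, 8 for the segment.
WAL_NAME = re.compile(r"[0-9A-F]{24}")


def count_wal_files(names, first=None, last=None):
    """Count directory entries that look like WAL segments.

    If first/last are given, only names whose log+segment part falls
    inside [first, last] are counted. Comparing while ignoring the
    timeline prefix is an assumption made for this sketch.
    """
    count = 0
    for name in names:
        if not WAL_NAME.fullmatch(name):
            continue  # skip archive_status, .backup files, etc.
        suffix = name[8:]
        if first is not None and suffix < first[8:]:
            continue
        if last is not None and suffix > last[8:]:
            continue
        count += 1
    return count
```

Under that assumption, a directory full of segments can still yield a
count of zero if none of them fall inside the requested range.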

However, what I see on the server conflicts with the error.
pg_basebackup was invoked on Thu Sep 26 01:00:01 PDT 2013, and failed
on Thu Sep 26 02:09:12 PDT 2013. In the pg_xlog subdirectory, I see
lots of WAL files present, before, during & after pg_basebackup was
run:
-rw------- 1 postgres postgres 16777216 Sep 26 00:38 000000010000208A000000E3
-rw------- 1 postgres postgres 16777216 Sep 26 00:43 000000010000208A000000E4
-rw------- 1 postgres postgres 16777216 Sep 26 00:48 000000010000208A000000E5
-rw------- 1 postgres postgres 16777216 Sep 26 00:53 000000010000208A000000E6
-rw------- 1 postgres postgres 16777216 Sep 26 00:58 000000010000208A000000E7
-rw------- 1 postgres postgres 16777216 Sep 26 01:03 000000010000208A000000E8
-rw------- 1 postgres postgres 16777216 Sep 26 01:08 000000010000208A000000E9
-rw------- 1 postgres postgres 16777216 Sep 26 01:14 000000010000208A000000EA
-rw------- 1 postgres postgres 16777216 Sep 26 01:19 000000010000208A000000EB
-rw------- 1 postgres postgres 16777216 Sep 26 01:24 000000010000208A000000EC
-rw------- 1 postgres postgres 16777216 Sep 26 01:29 000000010000208A000000ED
-rw------- 1 postgres postgres 16777216 Sep 26 01:34 000000010000208A000000EE
-rw------- 1 postgres postgres 16777216 Sep 26 01:38 000000010000208A000000EF
-rw------- 1 postgres postgres 16777216 Sep 26 01:43 000000010000208A000000F0
-rw------- 1 postgres postgres 16777216 Sep 26 01:48 000000010000208A000000F1
-rw------- 1 postgres postgres 16777216 Sep 26 01:53 000000010000208A000000F2
-rw------- 1 postgres postgres 16777216 Sep 26 01:58 000000010000208A000000F3
-rw------- 1 postgres postgres 16777216 Sep 26 02:03 000000010000208A000000F4
-rw------- 1 postgres postgres 16777216 Sep 26 02:08 000000010000208A000000F5
-rw------- 1 postgres postgres 16777216 Sep 26 02:14 000000010000208A000000F6
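
To confirm that a listing like the one above has no holes, the file
names can be decoded and checked for contiguity. This is a rough sketch
rather than anything taken from PostgreSQL itself: it assumes the usual
24-hex-digit naming (timeline, log, segment) and the default 16 MB
segment size, which gives 256 segments per log in 9.3. The helper names
parse_wal_name and find_gaps are hypothetical:

```python
import re

WAL_NAME = re.compile(r"[0-9A-F]{24}")
SEGMENTS_PER_LOG = 0x100  # assumes 16 MB segments on 9.3 or later


def parse_wal_name(name):
    """Return (timeline, absolute segment number) for a WAL file name."""
    if not WAL_NAME.fullmatch(name):
        raise ValueError("not a WAL segment name: %r" % name)
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def find_gaps(names):
    """Return (before, after) segment-number pairs where segments are missing.

    Only neighbours on the same timeline are compared.
    """
    parsed = sorted(parse_wal_name(n) for n in names)
    gaps = []
    for (tl_a, no_a), (tl_b, no_b) in zip(parsed, parsed[1:]):
        if tl_a == tl_b and no_b - no_a > 1:
            gaps.append((no_a, no_b))
    return gaps
```

Feeding it the file names from an ls of pg_xlog returns an empty list
when every segment between the oldest and newest is present.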

Thanks in advance for any pointers.
