pg_archivecleanup bug

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_archivecleanup bug
Date: 2013-12-05 20:06:07
Message-ID: 1386273967.37960.YahooMailNeo@web162901.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

An EDB customer reported a problem with pg_archivecleanup which I
have looked into and found a likely cause.  It is, in any event, a
bug which I think should be fixed.  It has to do with our use of
the readdir() function:

http://pubs.opengroup.org/onlinepubs/7908799/xsh/readdir_r.html

These are the relevant bits:

| Applications wishing to check for error situations should set
| errno to 0 before calling readdir(). If errno is set to non-zero
| on return, an error occurred.

| Upon successful completion, readdir() returns a pointer to an
| object of type struct dirent. When an error is encountered, a
| null pointer is returned and errno is set to indicate the error.
| When the end of the directory is encountered, a null pointer is
| returned and errno is not changed.

Here is our current usage:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=contrib/pg_archivecleanup/pg_archivecleanup.c;h=8f77998de12f95f41bb95c3e05a14de6cdf18047;hb=7800229b36d0444cf2c61f5c5895108ee5e8ee2a#l110

So an error in scanning the directory will not be reported; the
cleanup will quietly terminate the WAL deletions without processing
the remainder of the directory.  Attached is the simplest fix,
which would report the error, stop looking for WAL files, and
continue with other clean-ups. I'm not sure we should keep the fix
that simple.  We could set a flag so that we would exit with a
non-zero code, or we could try a new directory scan as long as the
last scan found and deleted at least one WAL file.  Perhaps we want
to back-patch the simple fix and do something fancier for 9.4?

I would also add a few comment lines before committing this, if we
decide to go with the simple approach -- this is for purposes of
illustration; to facilitate discussion.

Thoughts?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
archivecleanup-dir-error-v1.diff text/x-diff 762 bytes

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-12-05 20:37:55 Re: Proof of concept: standalone backend with full FE/BE protocol
Previous Message Tom Lane 2013-12-05 20:05:30 Re: Proof of concept: standalone backend with full FE/BE protocol