From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PanBian <bianpan2016(at)163(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #14929: Unchecked AllocateDir() return value in restoreTwoPhaseData() |
Date: | 2017-11-28 00:35:46 |
Message-ID: | CAB7nPqS5ua9ktEX7p_2WzFcTVCzo0zP9xAx4A7thPwV-8kq7bA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Tue, Nov 28, 2017 at 12:08 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> PanBian <bianpan2016(at)163(dot)com> writes:
>> On Mon, Nov 27, 2017 at 07:53:30PM +0900, Michael Paquier wrote:
>>> You are missing the fact that ReadDir goes through ReadDirExtended,
>>> which drops an ERROR log if the folder allocated is NULL.
>
>> You are right. Its my carelessness. ReadDir will not return back on a
>> NULL dir parameter. The code is bug free. Sorry for the trouble.
>
> There are some issues here, though:
>
> 1. twophase.c does this:
>
> cldir = AllocateDir(TWOPHASE_DIR);
> LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
> while ((clde = ReadDir(cldir, TWOPHASE_DIR)) != NULL)
> {
>
> which is flat out wrong because LWLockAcquire might well clobber errno.
> I don't see any good reason why we couldn't just swap the order of
> those two calls.
I have not checked if it actually updates errno or not, but relying on
the fact that it may do it sucks.
> 2. basebackup.c and some other places do things like
>
> dir = AllocateDir("pg_wal");
> if (!dir)
> ereport(ERROR,
> (errmsg("could not open directory \"%s\": %m", "pg_wal")));
> while ((de = ReadDir(dir, "pg_wal")) != NULL)
>
> Not only is this a waste of code, because the error message is no better
> than what ReadDir would provide, but it's wrong because it omits
> errcode_for_file_access(), causing the SQLSTATE to be reported as XX000.
> There are other places that are even lazier and use elog(), failing
> translatability as well as the errcode test.
I agree with using ereport() everywhere, a path may have been created
by initdb, but anything used after a base backup should be reported to
the user.
> There might be some other problems I missed in a quick scan.
>
> So there's definitely room for a cleanup patch here, but the originally
> proposed change isn't it.
I have spotted more problems. In pg_available_extensions,
AllocateDir() does nothing in the event of an error but forgets to
reset errno. In perform_base_backup, CheckXLogRemoved() is called
without saving errno, so the error message generated after may be
wrong.
I think that this requires its own new thread with a more extended
analysis on -hackers to attract attention, this goes way beyond the
original complain about a pointer dereference. And there is a
collection of small issues. I'll try to look at that after I am done
with my CF duties except if someone beats me or volunteers for it.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Langote | 2017-11-28 00:52:17 | Re: BUG #14928: Unchecked SearchSysCacheCopy1() return value |
Previous Message | Tom Lane | 2017-11-28 00:25:49 | Re: BUG #14928: Unchecked SearchSysCacheCopy1() return value |