Quick Links

Re: SimpleLruTruncate() mutual exclusion

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Noah Misch <noah(at)leadboat(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: SimpleLruTruncate() mutual exclusion
Date:	2019-07-29 16:58:17
Message-ID:	29566.1564419497@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Sun, Feb 17, 2019 at 11:31:03PM -0800, Noah Misch wrote:
>>> b. Arrange so only one backend runs vac_truncate_clog() at a time. Other than
>>> AsyncCtl, every SLRU truncation appears in vac_truncate_clog(), in a
>>> checkpoint, or in the startup process. Hence, also arrange for only one
>>> backend to call SimpleLruTruncate(AsyncCtl) at a time.

>> More specifically, restrict vac_update_datfrozenxid() to one backend per
>> database, and restrict vac_truncate_clog() and asyncQueueAdvanceTail() to one
>> backend per cluster. This, attached, was rather straightforward.

> Rebased. The conflicts were limited to comments and documentation.

I tried to review this, along with your adjacent patch to adjust the
segment-roundoff logic, but I didn't get far. I see the point that
the roundoff might create problems when we are close to hitting
wraparound. I do not, however, see why serializing vac_truncate_clog
helps. I'm pretty sure it was intentional that multiple backends
could run truncation directory scans concurrently, and I don't really
want to give that up if possible.

Also, if I understand the data-loss hazard properly, it's what you
said in the other thread: the latest_page_number could advance after
we make our decision about what to truncate, and then maybe we could
truncate newly-added data. We surely don't want to lock out the
operations that can advance last_page_number, so how does serializing
vac_truncate_clog help?

I wonder whether what we need to do is add some state in shared
memory saying "it is only safe to create pages before here", and
make SimpleLruZeroPage or its callers check that. The truncation
logic would advance that value, but only after completing a scan.
(Oh ..., hmm, maybe the point of serializing truncations is to
ensure that we know that we can safely advance that?)

Can you post whatever script you used to detect/reproduce the problem
in the first place?

regards, tom lane

PS: also, re-reading this code, it's worrisome that we are not checking
for failure of the unlink calls. I think the idea was that it didn't
really matter because if we did re-use an existing file we'd just
re-zero it; hence failing the truncate is an overreaction. But probably
some comments about that are in order.

In response to

Re: SimpleLruTruncate() mutual exclusion at 2019-06-28 17:06:28 from Noah Misch

Responses

Re: SimpleLruTruncate() mutual exclusion at 2019-08-01 06:51:17 from Noah Misch

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jeevan Ladhe	2019-07-29 17:25:29	concerns around pg_lsn
Previous Message	Arne Roland	2019-07-29 16:43:05	Partial join