Re: O(n) tasks cause lengthy startups and checkpoints

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Amul Sul <sulamul(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: O(n) tasks cause lengthy startups and checkpoints
Date: 2022-02-17 21:00:22
Message-ID: 20220217210022.GA3248793@nathanxps13
Lists: pgsql-hackers

On Thu, Feb 17, 2022 at 11:27:09AM -0800, Andres Freund wrote:
> On 2022-02-17 10:23:37 -0800, Nathan Bossart wrote:
>> On Wed, Feb 16, 2022 at 10:59:38PM -0800, Andres Freund wrote:
>> > They're accessed by xid. The LSN is just for cleanup. Accessing files
>> > left over from a previous transaction with the same xid wouldn't be
>> > good - we'd read wrong catalog state for decoding...
>>
>> Okay, that part makes sense to me. However, I'm still confused about how
>> this is handled today and why moving cleanup to a separate auxiliary
>> process makes matters worse.
>
> Right now cleanup happens every checkpoint. So cleanup can't be deferred all
> that far. We currently include a bunch of 32bit xids inside checkpoints, so
> if they're rarer than 2^31-1, we're in trouble independent of logical
> decoding.
>
> But with this patch cleanup of logical decoding mapping files (and other
> pieces) can be *indefinitely* deferred, without being noticeable.

I see. The custodian should ordinarily remove the files as quickly as
possible. In fact, I bet it will typically line up with checkpoints for
most users, as the checkpointer will set the latch. However, if there are
many temporary files to clean up, removing the logical decoding files could
be delayed for some time, as you said.

> One possible way to improve this would be to switch the on-disk filenames to
> be based on 64bit xids. But that might also present some problems (file name
> length, cost of converting 32bit xids to 64bit xids).

Okay.
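
For illustration only, here is a minimal sketch of the 32bit-to-64bit
conversion mentioned above. It is Python rather than PostgreSQL source, and
the names (widen_xid, next_full_xid) are hypothetical. It assumes a 64bit
reference xid is available and that the 32bit xid is not newer than that
reference and is fewer than 2^32 xids older than it.

```python
XID_BITS = 32
XID_MASK = (1 << XID_BITS) - 1


def widen_xid(xid32: int, next_full_xid: int) -> int:
    """Map a 32bit xid to a 64bit xid relative to a 64bit reference.

    Assumption (not from the thread): xid32 is at or before next_full_xid
    and less than 2**32 xids behind it.
    """
    if not 0 <= xid32 <= XID_MASK:
        raise ValueError("xid32 out of 32bit range")

    epoch = next_full_xid >> XID_BITS
    next_xid32 = next_full_xid & XID_MASK

    # A 32bit value larger than the reference's low half can only belong
    # to the previous epoch, since it cannot be newer than the reference.
    if xid32 > next_xid32:
        epoch -= 1
    if epoch < 0:
        raise ValueError("xid32 is newer than the reference xid")

    return (epoch << XID_BITS) | xid32


def xid_hex(xid: int, bits: int) -> str:
    """Fixed-width uppercase hex form of an xid, as might appear in a file name."""
    return format(xid, "0{}X".format(bits // 4))
```

With this sketch, a 64bit xid takes 16 hex digits instead of 8 per xid in a
file name, which is the length cost noted above; the conversion itself is
cheap only if a trustworthy reference xid is at hand.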

>> I've done quite a bit of reading, and I haven't found anything that seems
>> intended to prevent this problem. Do you have any pointers?
>
> I don't know if we have an iron-clad enforcement of checkpoints happening
> every 2^31-1 xids. It's very unlikely to happen - you'd run out of space
> etc. But it'd be good to have something better than that.

Okay. So IIUC the problem might already exist today, but offloading these
tasks to a separate process could make it more likely.
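
To make the 2^31-1 figure concrete, here is a small sketch of circular
comparison of 32bit xids. It is Python for illustration, not PostgreSQL
source; the function name is hypothetical and the special reserved xids are
ignored. It only shows why a 32bit xid stops being meaningful once something
that references it is left around for more than about 2^31 xids.

```python
XID_MASK = 0xFFFFFFFF
HALF_RANGE = 1 << 31


def xid_precedes(a: int, b: int) -> bool:
    """Return True if 32bit xid a is logically older than 32bit xid b.

    The difference is taken modulo 2**32 and read as a signed 32bit
    value; a negative result means a precedes b.
    """
    diff = (a - b) & XID_MASK
    return diff >= HALF_RANGE  # top bit set, i.e. negative as signed 32bit


old_xid = 100

# Within roughly 2**31 xids the ordering is what one would expect.
still_ok = xid_precedes(old_xid, (old_xid + HALF_RANGE - 1) & XID_MASK)

# A little further on, the same old xid appears to be in the future.
wrapped = xid_precedes(old_xid, (old_xid + HALF_RANGE + 1) & XID_MASK)
```

Here still_ok is True and wrapped is False, so a file named after old_xid
that survives that long can no longer be told apart from one belonging to a
newer transaction.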

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
