Re: O(n) tasks cause lengthy startups and checkpoints

From: Andres Freund <andres(at)anarazel(dot)de>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Amul Sul <sulamul(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: O(n) tasks cause lengthy startups and checkpoints
Date: 2022-02-17 22:28:29
Message-ID: 20220217222829.lndzwq72fh4v34c2@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-02-17 13:00:22 -0800, Nathan Bossart wrote:
> Okay. So IIUC the problem might already exist today, but offloading these
> tasks to a separate process could make it more likely.

Vastly more, yes. Before checkpoints not happening would be a (but not a
great) form of backpressure. You can't cancel them without triggering a
crash-restart. Whereas custodian can be cancelled etc.

As I said before, I think this is tackling things from the wrong end. Instead
of moving the sometimes expensive task out of the way, but still expensive,
the focus should be to make the expensive task cheaper.

As far as I understand, the primary concern are logical decoding serialized
snapshots, because a lot of them can accumulate if there e.g. is an old unused
/ far behind slot. It should be easy to reduce the number of those snapshots
by e.g. eliding some redundant ones. Perhaps we could also make backends in
logical decoding occasionally do a bit of cleanup themselves.

I've not seen reports of the number of mapping files to be an real issue?

The improvements around deleting temporary files and serialized snapshots
afaict don't require a dedicated process - they're only relevant during
startup. We could use the approach of renaming the directory out of the way as
done in this patchset but perform the cleanup in the startup process after
we're up.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-02-17 22:58:38 Re: O(n) tasks cause lengthy startups and checkpoints
Previous Message Peter Geoghegan 2022-02-17 22:23:51 Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM