From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Andy Fan <zhihuifan1213(at)163(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Maxim Orlov <m(dot)orlov(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Pre-allocating WAL files |
Date: | 2025-01-22 15:56:33 |
Message-ID: | Z5EVMXSWGN7_ViZ7@nathan |
Lists: | pgsql-hackers |
On Wed, Jan 22, 2025 at 01:14:22AM +0000, Andy Fan wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
>> FWIW, I've seen the fsyncs around recycling being a rather substantial
>> bottleneck. To the point of the main benefit of larger segments being the
>> reduction in number of fsyncs at the end of a checkpoint. I think we should
>> be able to make the fsyncs a lot more efficient by batching them, first rename
>> a bunch of files, then fsync them and the directory. The current pattern
>> basically requires a separate filesystem journal flush for each WAL segment.
>
> For education purpose, how to fsync files in batch? 'man fsync' tells me
> user can only fsync one file each time.
>
> int fsync(int fd);
>
> The fsync manual seems not saying fsync on a directory would fsync all
> the files under that directory.
I think Andres means that we should wait until the end of recycling to
fsync() the directory so that we aren't flushing it for every single
recycled segment. This sort of batching approach could also work well with
pre_sync_fname(), so that by the time we actually call fsync() on the
files, it has very little to do.
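
To make the ordering concrete, here is a minimal sketch of the batching idea
in plain POSIX C. It is not PostgreSQL code: recycle_batch() and fsync_path()
are hypothetical names made up for illustration, error handling is reduced to
return codes, and the pre_sync_fname() writeback hint is not modeled. The
point is only that the renames happen first, then each file is flushed, and
the directory is flushed once at the end rather than once per segment.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync a single path. Regular files are opened
 * read-write and directories read-only, since a directory cannot be
 * opened for writing. Returns 0 on success, -1 on failure.
 */
int
fsync_path(const char *path, int is_dir)
{
    int fd = open(path, is_dir ? O_RDONLY : O_RDWR);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/*
 * Hypothetical helper: recycle n files that all live in "dir".
 * Returns 0 on success, -1 on the first failure.
 */
int
recycle_batch(const char *dir,
              const char *const *oldpaths,
              const char *const *newpaths,
              int n)
{
    int i;

    /* Phase 1: do all the renames without flushing anything. */
    for (i = 0; i < n; i++)
        if (rename(oldpaths[i], newpaths[i]) != 0)
            return -1;

    /* Phase 2: flush each renamed file; fsync() still takes one fd at a time. */
    for (i = 0; i < n; i++)
        if (fsync_path(newpaths[i], 0) != 0)
            return -1;

    /* Phase 3: a single directory fsync covers every rename above. */
    return fsync_path(dir, 1);
}
```

A caller would collect the old and new segment paths into two arrays and call
recycle_batch(dir, olds, news, n) once per recycling pass. The per-file
fsync() calls remain, but the directory entry changes are flushed together.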
--
nathan