Re: Pre-allocating WAL files

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Maxim Orlov <m(dot)orlov(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Pre-allocating WAL files
Date: 2025-01-21 15:52:51
Message-ID: Z4_C04ZJ_SymvYes@nathan
Lists: pgsql-hackers

On Tue, Jan 21, 2025 at 03:31:27AM +0000, Andy Fan wrote:
> I came here from [0]; thanks for working on this. Here are some design
> review comments/questions after my first pass through the patch.

Thanks for taking a look.

> 1. walwriter vs checkpointer? I prefer walwriter for now because:
>
> a. It is hard for the checkpointer to do this in a timely manner, either
> because the checkpoint itself may take a long time or because
> checkpoint_timeout is much bigger than commit_delay. The walwriter could
> do this promptly. I think this is an important consideration for this
> feature.
>
> b. We want walwriter to run with low latency to flush out async
> commits. This is true, but preallocating a WAL file doesn't increase the
> latency much. After all, even when the user uses async commit, WAL file
> allocation is already done by walwriter in our current implementation.

I attempted to deal with this by having pre-allocation requests set the
checkpointer's latch and performing the pre-allocation within the
checkpointer's main loop and during write delays. However, checkpointing
does a number of other things that could just as easily delay
pre-allocation, so it's probably worth considering the WAL writer.
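
A minimal Python model of the latch-driven approach described above, not the patch's code: a consumer that takes a segment sets a latch, and the pre-allocating process refills the pool on its next loop iteration. The class and method names (`PreallocWorker`, `request`, `take`, `service`) and the use of `threading.Event` as a stand-in for a process latch are assumptions made for illustration only.

```python
import threading


class PreallocWorker:
    """Toy stand-in for whichever process owns pre-allocation."""

    def __init__(self, target=1):
        self.latch = threading.Event()   # models the process latch
        self.lock = threading.Lock()
        self.target = target             # how many segments to keep ready
        self.pool = []
        self.next_id = 0

    def request(self):
        """Called by a consumer to ask for a refill."""
        self.latch.set()

    def take(self):
        """Consumer side: grab a ready segment (or None) and request a refill."""
        with self.lock:
            seg = self.pool.pop(0) if self.pool else None
        self.request()
        return seg

    def service(self, timeout=0.0):
        """One main-loop iteration: wait on the latch, then top up the pool."""
        self.latch.wait(timeout)
        self.latch.clear()
        created = 0
        with self.lock:
            while len(self.pool) < self.target:
                self.pool.append(f"segment-{self.next_id}")
                self.next_id += 1
                created += 1
        return created


if __name__ == "__main__":
    worker = PreallocWorker(target=1)
    print(worker.service())   # fills the initially empty pool
    print(worker.take())      # consumer pulls a segment and sets the latch
    print(worker.service())   # worker refills on its next iteration
```

In this model, anything else the owning process does between `service()` calls delays the refill, which is the concern raised above about long-running checkpoint work.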

> 2. How many xlog files should checkpointer/walwriter preallocate at
> once? In your patch this is controlled by wal-preallocate-max-size. How
> about just preallocating *the next one* xlog file, for the sake of
> simplicity?

We could probably start with something like that. IIRC it was difficult to
create workloads where you'd need more than 1-2 at a time, provided
whatever is pre-allocating refills the pool quickly.

> 3. What is the purpose of the preallocated_segments directory? What I
> have in mind is that we just preallocate under the normal filename so
> that XLogWrite can open it directly. This is the same as what wal_recycle
> does, and we can reuse the same strategy to clean them up if they are no
> longer needed.

The purpose is to limit the use of pre-allocated segments to only
situations where WAL recycling is not sufficient. Basically, if writing a
record would require a new segment to be created, we can quickly pull a
pre-allocated one instead of creating it ourselves. Besides simplifying
matters, this prevents a lot of unnecessary pre-allocation, since many
workloads will almost never need anything beyond the recycled segments.
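
The ordering described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patch's implementation: the helper names (`create_segment`, `install_segment`), the pool file name `prealloc.0`, and the tiny demo segment size are hypothetical. Only the `preallocated_segments` directory name comes from this thread.

```python
import os
import tempfile

POOL_DIR = "preallocated_segments"  # directory name taken from the thread


def create_segment(path, size=16 * 1024):
    """Slow path: write and fsync a zero-filled file (tiny size for the demo)."""
    with open(path, "wb") as f:
        f.write(b"\0" * size)
        f.flush()
        os.fsync(f.fileno())


def install_segment(wal_dir, name):
    """Make segment `name` available in wal_dir; report which path was used."""
    target = os.path.join(wal_dir, name)
    if os.path.exists(target):
        return "recycled"          # a recycled segment already has this name
    pool = os.path.join(wal_dir, POOL_DIR)
    if os.path.isdir(pool):
        for candidate in sorted(os.listdir(pool)):
            # Cheap path: move a ready-made file into place with one rename.
            os.rename(os.path.join(pool, candidate), target)
            return "preallocated"
    create_segment(target)         # nothing available: create it ourselves
    return "created"


if __name__ == "__main__":
    wal_dir = tempfile.mkdtemp()
    os.mkdir(os.path.join(wal_dir, POOL_DIR))
    create_segment(os.path.join(wal_dir, POOL_DIR, "prealloc.0"))
    create_segment(os.path.join(wal_dir, "000000010000000000000001"))
    for seg in ("000000010000000000000001",
                "000000010000000000000002",
                "000000010000000000000003"):
        print(seg, install_segment(wal_dir, seg))
```

Because the pool is consulted only when no segment with the needed name already exists, a workload that is fully served by recycling never drains it, so nothing has to be refilled.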

--
nathan
