Re: [RFC] Lock-free XLog Reservation from WAL

From: "Zhou, Zhiguo" <zhiguo(dot)zhou(at)intel(dot)com>
To: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Cc: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Lock-free XLog Reservation from WAL
Date: 2025-01-14 14:49:47
Message-ID: faea001e-c23c-46b3-aa7f-3d7e5c7a1a68@intel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Good day, Yura!

On 1/10/2025 8:42 PM, Yura Sokolov wrote:
> If you consider hash-table fillrate, than 256 is quite enough for 128
> concurrent inserters.

The profile of your patch didn't show significant hotspots in the hash
table functions, so I believe the 256 entries should be enough.

>>
>> I will soon try to evaluate the performance impact of your patch on my
>> device with the TPCC benchmark and also profile it to see if there are
>> any changes that could be made to further improve it.
>
> It would be great. On my notebook (Mac Air M1) I don't see any benefits
> neither from mine, nor from yours patch ))
> My colleague will also test it on 20 core virtual machine (but
> backported to v15).
>

I've tested the performance impact of our patches on an Intel Sapphire
Rapids device with 480 vCPUs using a HammerDB TPC-C workload (256 VUs).
The results show a 72.3% improvement (average of 3 rounds, RSD: 1.5%)
with your patch and a 76.0% boost (average of 3 rounds, RSD: 2.95%) with
mine, applied to the latest codebase. This optimization is most
effective on systems with over 64 cores, as our core-scaling experiments
suggest minimal impact on lower-core setups like your notebook or a
20-core VM.

>
>> BTW, do you have a plan to merge this patch to the master branch? Thanks!
>
> I'm not committer )) We are both will struggle to make something
> committed for many months ;-)
>
> BTW, your version could make alike trick for guaranteed atomicity:
> - change XLogRecord's `XLogRecPtr xl_prev` to `uint32 xl_prev_offset`
> and store offset to prev record's start.
>
> Since there are two limits:
>
>     #define XLogRecordMaxSize    (1020 * 1024 * 1024)
>     #define WalSegMaxSize 1024 * 1024 * 1024
>
> offset to previous record could not be larger than 2GB.
>
> Yes, it is format change, that some backup utilities will have to adopt.
> But it saves 4 bytes in XLogRecord (that could be spent to store
> FullTransactionId instead of TransactionId) and it is better compressible.
>
> And your version than will not need the case when this value is split
> among two buffers (since MAXALIGN is not less than 4), and PostgreSQL
> already relies on 4 byte read/write atomicity (in some places even
> without use of pg_atomic_uint32).
>
> ----
>
> regards
> Sokolov Yura aka funny-falcon

Thanks for the great suggestion!

I think we've arrived at a critical juncture where we need to decide
which patch to move forward with for our optimization efforts. I've
evaluated the pros and cons of my implementation:

Pros:
-Achieves an additional 4% performance improvement.

Cons:
-Breaks the current convention of XLog insertions.
-TAP tests are not fully passed, requiring time to resolve.
-May necessitate changes to the format and backup tools, potentially
leading to backward compatibility issues.

Given these considerations, I believe your implementation is superior to
mine. I'd greatly appreciate it if you could share your insights on this
matter.

Regards,
Zhiguo

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2025-01-14 15:07:14 Re: Eager aggregation, take 3
Previous Message Tom Lane 2025-01-14 14:39:59 Re: question about relation_open