From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: CSN snapshots in hot standby |
Date: | 2024-08-13 20:13:39 |
Message-ID: | b439edfc-c5e5-43a9-802d-4cb51ec20646@iki.fi |
Lists: | pgsql-hackers |
On 05/04/2024 13:49, Andrey M. Borodin wrote:
>> On 5 Apr 2024, at 02:08, Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:
Thanks for taking a look, Kirill!
>> maybe we need some hooks here? Or maybe, we can take CSN here from extension somehow.
>
> I really like the idea of CSN-provider-as-extension.
> But it's very important to move on with CSN, at least on standby, to make CSN actually happen some day.
> So, from my perspective, having LSN-as-CSN is already huge step forward.
Yeah, I really don't want to expand the scope of this.
Here's a new version. Rebased, and lots of comments updated.
I added a tiny cache of the CSN lookups into SnapshotData, which can
hold the values of 4 XIDs that are known to be visible to the snapshot,
and 4 invisible XIDs. This is pretty arbitrary, but the idea is to have
something very small to speed up the common cases where 1-2 XIDs are
repeatedly looked up, without adding too much overhead.
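To illustrate the idea, here is a minimal standalone sketch of such a cache. It is not the code from the patch: the names (XidCache, SnapshotSketch, xid_visible, slow_csn_lookup) are hypothetical, the round-robin replacement policy is an assumption, and the "slow" lookup is a dummy stand-in that simply treats even XIDs as visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the patch's actual code. */
typedef uint32_t TransactionId;

#define XID_CACHE_SIZE 4

typedef struct XidCache
{
	TransactionId xids[XID_CACHE_SIZE];
	int			nused;			/* number of valid entries */
	int			next;			/* next slot to overwrite (round-robin) */
} XidCache;

typedef struct SnapshotSketch
{
	XidCache	visible;		/* XIDs known to be visible */
	XidCache	invisible;		/* XIDs known to be invisible */
	uint64_t	slow_lookups;	/* how often the slow path was taken */
} SnapshotSketch;

/* Linear scan; with 4 entries this is cheaper than anything fancier. */
static bool
cache_contains(const XidCache *c, TransactionId xid)
{
	for (int i = 0; i < c->nused; i++)
	{
		if (c->xids[i] == xid)
			return true;
	}
	return false;
}

/* Remember an XID, overwriting the oldest entry once the cache is full. */
static void
cache_add(XidCache *c, TransactionId xid)
{
	assert(c->next >= 0 && c->next < XID_CACHE_SIZE);
	c->xids[c->next] = xid;
	c->next = (c->next + 1) % XID_CACHE_SIZE;
	if (c->nused < XID_CACHE_SIZE)
		c->nused++;
}

/* Dummy stand-in for the expensive CSN lookup: even XIDs are "visible". */
static bool
slow_csn_lookup(SnapshotSketch *snap, TransactionId xid)
{
	snap->slow_lookups++;
	return (xid % 2) == 0;
}

/* Check both small caches first; fall back to the slow lookup on a miss. */
static bool
xid_visible(SnapshotSketch *snap, TransactionId xid)
{
	bool		visible;

	if (cache_contains(&snap->visible, xid))
		return true;
	if (cache_contains(&snap->invisible, xid))
		return false;

	visible = slow_csn_lookup(snap, xid);
	cache_add(visible ? &snap->visible : &snap->invisible, xid);
	return visible;
}
```

With a zero-initialized SnapshotSketch, calling xid_visible() repeatedly for the same XID takes the slow path only on the first call, which is the case the cache is meant to help.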
I did some performance testing of the visibility checks using these CSN
snapshots. The tests run SELECTs with a SeqScan in a standby, over a
table where all the rows have xmin/xmax values that are still
in-progress in the primary.
Three test scenarios:
1. large-xact: one large transaction inserted all the rows. All rows
have the same XMIN, which is still in progress.
2. many-subxacts: one large transaction inserted each row in a separate
subtransaction. All rows have a different XMIN, but they're all
subtransactions of the same top-level transaction. (This causes the
subxids cache in the proc array to overflow)
3. few-subxacts: All rows are inserted, committed, and vacuum frozen.
Then the rows are DELETEd using 10 separate subtransactions, in an
interleaved fashion. The XMAX values cycle like this: "1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 1, 2, 3, 4, 5, ...". The point of this is that these
sub-XIDs fit in the subxids cache in the procarray, but the pattern
defeats the simple 4-element cache that I added.
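Why the cyclic pattern in scenario 3 defeats a 4-element cache can be shown with a small simulation. This is only a sketch under stated assumptions: the cache replaces entries in round-robin (FIFO) order, which may not match the patch, and count_misses and fill_cyclic are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CACHE 16

/*
 * Count the cache misses for a sequence of XID lookups against a small
 * cache with round-robin replacement (an assumed policy).
 */
static size_t
count_misses(const uint32_t *xids, size_t n, int cache_size)
{
	uint32_t	cache[MAX_CACHE];
	int			nused = 0;
	int			next = 0;
	size_t		misses = 0;

	assert(cache_size >= 1 && cache_size <= MAX_CACHE);

	for (size_t i = 0; i < n; i++)
	{
		int			hit = 0;

		for (int j = 0; j < nused; j++)
		{
			if (cache[j] == xids[i])
			{
				hit = 1;
				break;
			}
		}
		if (!hit)
		{
			/* Miss: remember this XID, overwriting the oldest entry. */
			misses++;
			cache[next] = xids[i];
			next = (next + 1) % cache_size;
			if (nused < cache_size)
				nused++;
		}
	}
	return misses;
}

/* Fill buf with the pattern 1, 2, ..., ncycle, 1, 2, ... */
static void
fill_cyclic(uint32_t *buf, size_t n, uint32_t ncycle)
{
	for (size_t i = 0; i < n; i++)
		buf[i] = (uint32_t) (i % ncycle) + 1;
}
```

With a cycle of 10 XIDs and 4 slots, by the time an XID comes around again it has already been evicted, so every lookup is a miss. With a cycle of 1-2 XIDs, only the first lookup of each XID misses.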
The test script I used is attached. I repeated it a few times with
master and the patches here, and picked the fastest runs for each. Just
eyeballing the results, there's roughly 10% variance in these numbers.
Smaller is better.
Master:
large-xact: 4.57732510566711
many-subxacts: 18.6958119869232
few-subxacts: 16.467698097229
Patched:
large-xact: 10.2999930381775
many-subxacts: 11.6501438617706
few-subxacts: 19.8457028865814
With cache:
large-xact: 3.68792295455933
many-subxacts: 13.3662350177765
few-subxacts: 21.4426419734955
The 'large-xact' results show that the CSN lookups are slower than the
binary search on the 'xids' array. Not a surprise. The 4-element cache
fixes the regression, which is also not a surprise.
The 'many-subxacts' results show that the CSN lookups are faster than
the current method in master when the subxids cache has overflowed.
That makes sense: on master, we always perform a lookup in pg_subtrans
if the subxids cache has overflowed, which is more or less the same
overhead as the CSN lookup. But we avoid the binary search on the xids
array after that.
The 'few-subxacts' results show a regression when the 4-element cache is
not effective. I think that's acceptable: the CSN approach has many
benefits, and I don't think this is a very common scenario. But if
necessary, it could perhaps be alleviated with more caching, or by
trying to compensate by optimizing elsewhere.
--
Heikki Linnakangas
Neon (https://neon.tech)
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Update-outdated-comment-on-WAL-logged-locks-with-.patch | text/x-patch | 1.7 KB |
v2-0002-XXX-add-perf-test.patch | text/x-patch | 5.6 KB |
v2-0003-Use-CSN-snapshots-during-Hot-Standby.patch | text/x-patch | 128.8 KB |
v2-0004-Make-SnapBuildWaitSnapshot-work-without-xl_runnin.patch | text/x-patch | 6.2 KB |
v2-0005-Remove-the-now-unused-xids-array-from-xl_running_.patch | text/x-patch | 7.0 KB |
v2-0006-Add-a-small-cache-to-Snapshot-to-avoid-CSN-lookup.patch | text/x-patch | 2.5 KB |