Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Date: 2023-10-14 04:13:35
Message-ID: CAA4eK1+FwFhFux2HnJkr6x2BfZhvqQRMic=FCvGf2nKqQzw1qQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 11, 2023 at 4:35 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> The small size of the SLRU buffer pools can sometimes become a
> performance problem because it’s not difficult to have a workload
> where the number of buffers actively in use is larger than the
> fixed-size buffer pool. However, just increasing the size of the
> buffer pool doesn’t necessarily help, because the linear search that
> we use for buffer replacement doesn’t scale, and also because
> contention on the single centralized lock limits scalability.
>
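To make the problem concrete, the victim search on head (cf.
SlruSelectLRUPage() in slru.c) can be paraphrased roughly as below;
this is a simplified illustration, not the actual code:

#include "postgres.h"
#include "access/slru.h"

/*
 * Simplified paraphrase of head's victim selection: the scan touches
 * every buffer in the pool while the single control lock is held, so
 * each replacement gets slower as the pool is configured larger.
 */
static int
SlruPickVictim(SlruShared shared)
{
    int     best_slot = 0;
    int     best_delta = -1;

    for (int slotno = 0; slotno < shared->num_slots; slotno++)
    {
        int     delta = shared->cur_lru_count -
                        shared->page_lru_count[slotno];

        if (delta > best_delta)
        {
            best_delta = delta;
            best_slot = slotno;
        }
    }
    return best_slot;
}
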
> A couple of patches have been proposed in the past to address the
> problem of increasing the buffer pool size. One of them [1] was
> proposed by Thomas Munro, making the size of the buffer pool
> configurable. And, in order to deal with the linear search in a
> large buffer pool, we divide the SLRU buffer pool into associative
> banks so that the search cost does not grow with the size of the
> buffer pool. This works well for workloads that are mainly affected
> by frequent buffer replacement, but it still does not help workloads
> where the centralized control lock is the bottleneck.
>
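For reference, the bank scheme can be sketched as follows; the names
SLRU_BANK_SIZE, SlruSelectBank(), and SlruLookupSlot() are illustrative
and not taken from the patch. The key property is that a page can live
only in the bank its page number maps to, so lookup cost is bounded by
the bank size no matter how large the whole pool is:

#include "postgres.h"
#include "access/slru.h"

#define SLRU_BANK_SIZE  16      /* buffers per bank; illustrative value */

/* Map a page to its bank with a cheap modulo hash. */
static int
SlruSelectBank(SlruShared shared, int pageno)
{
    int     nbanks = shared->num_slots / SLRU_BANK_SIZE;

    return pageno % nbanks;
}

/* Search only within the page's bank, never the whole pool. */
static int
SlruLookupSlot(SlruShared shared, int pageno)
{
    int     start = SlruSelectBank(shared, pageno) * SLRU_BANK_SIZE;

    for (int slotno = start; slotno < start + SLRU_BANK_SIZE; slotno++)
    {
        if (shared->page_status[slotno] != SLRU_PAGE_EMPTY &&
            shared->page_number[slotno] == pageno)
            return slotno;
    }
    return -1;                  /* miss: replace a victim within this bank */
}
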
> So I have taken this patch as my base patch (v1-0001) and added two
> more improvements on top of it: 1) in v1-0002, instead of a
> centralized control lock for the SLRU, I have introduced a bank-wise
> control lock; 2) in v1-0003, I have removed the global LRU counter
> and introduced a bank-wise counter. The second change (v1-0003)
> avoids CPU/OS cache-line invalidation caused by frequent updates of
> a single shared variable. Later, in my performance tests, I will
> show how much we gain from these two changes.
>
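The bank-wise lock and bank-wise LRU counter (v1-0002/v1-0003) can be
pictured like this; the struct and function names are again
illustrative, not the patch code:

#include "postgres.h"
#include "storage/lwlock.h"

/* Illustrative per-bank state: one lock and one LRU clock per bank. */
typedef struct SlruBankState
{
    LWLock      lock;           /* replaces the single control lock */
    int         cur_lru_count;  /* replaces the global LRU counter */
} SlruBankState;

/*
 * Backends touching pages in different banks acquire different locks,
 * so they no longer contend on one LWLock; and since each bank bumps
 * its own LRU counter, the hot cache line of a single shared counter
 * stops bouncing between CPUs.
 */
static LWLock *
SlruBankLock(SlruBankState *banks, int bankno)
{
    return &banks[bankno].lock;
}

A caller would then take LWLockAcquire(SlruBankLock(banks, bankno),
LW_EXCLUSIVE) for just the one bank it needs instead of the pool-wide
control lock.
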
> Note: This is going to be a long email, but I have summarised the
> main idea above this point. Below, I discuss more internal details
> in order to show that the design is valid, along with two
> performance tests: one specific to contention on the centralized
> lock and the other mainly about contention due to frequent buffer
> replacement in the SLRU buffer pool. With these patches we are
> getting ~2x TPS compared to head; in later sections I discuss this
> in more detail, i.e. the exact performance numbers and an analysis
> of why we see the gain.
>
...
>
> Performance Test:
> Exp1: Show the problem caused by CPU/OS cache invalidation due to
> frequent updates of the centralized lock and the common LRU counter.
> Here, in parallel with the pgbench script, we run a transaction that
> frequently creates subtransaction overflow, which forces the
> visibility-check mechanism to access the subtrans SLRU.
> Test machine: 8 CPU/ 64 core/ 128 with HT/ 512 GB RAM / SSD
> scale factor: 300
> shared_buffers=20GB
> checkpoint_timeout=40min
> max_wal_size=20GB
> max_connections=200
>
> Workload: Run these two scripts in parallel:
> ./pgbench -c $ -j $ -T 600 -P5 -M prepared postgres
> ./pgbench -c 1 -j 1 -T 600 -f savepoint.sql postgres
>
> savepoint.sql (creates subtransaction overflow)
> BEGIN;
> SAVEPOINT S1;
> INSERT INTO test VALUES(1);
> ← repeat 70 times →
> SELECT pg_sleep(1);
> COMMIT;
>
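Why roughly 70 subtransactions cause the overflow: a backend's PGPROC
caches at most PGPROC_MAX_CACHED_SUBXIDS (64) subtransaction XIDs, and
once that limit is exceeded, snapshots are marked suboverflowed and
visibility checks have to resolve XIDs through the subtrans SLRU. A
simplified paraphrase of that fallback (the function name here is made
up; the real logic lives in XidInMVCCSnapshot()):

#include "postgres.h"
#include "access/subtrans.h"
#include "utils/snapshot.h"

/*
 * Simplified illustration: once the snapshot's cached subxid array
 * has overflowed, an in-progress xid must be mapped to its top-level
 * parent through pg_subtrans -- the SLRU access this test hammers.
 */
static TransactionId
ResolveXidForVisibility(Snapshot snapshot, TransactionId xid)
{
    if (snapshot->suboverflowed)
        xid = SubTransGetTopmostTransaction(xid);

    return xid;
}
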
> Code under test:
> Head: PostgreSQL head code
> SlruBank: The first patch applied to convert the SLRU buffer pool into
> the bank (0001)
> SlruBank+BankwiseLockAndLru: Applied 0001+0002+0003
>
> Results:
> Clients    Head     SlruBank    SlruBank+BankwiseLockAndLru
> 1           457          491          475
> 8          3753         3819         3782
> 32        14594        14328        17028
> 64        15600        16243        25944
> 128       15957        16272        31731
>
> So we can see that at 128 clients we get ~2x TPS (with SlruBank +
> bank-wise lock and bank-wise LRU counter) as compared to HEAD.
>

This and the other results you have shared look promising. Will there
be any improvement in workloads related to clog buffer usage? BTW, I
remember that there was also a discussion of moving SLRU into the
regular buffer pool [1]. You have not explained whether that approach
would still have merit after this work, or whether it is not worth
pursuing at all.

[1] - https://commitfest.postgresql.org/43/3514/

--
With Regards,
Amit Kapila.
