Re: MultiXact\SLRU buffers configuration

From: Gilles Darold <gilles(at)darold(dot)net>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MultiXact\SLRU buffers configuration
Date: 2020-12-13 09:17:51
Message-ID: d6f17699-0c18-1a47-8dfb-f52b36fb7c4f@darold.net
Lists: pgsql-hackers

On 11/12/2020 at 18:50, Gilles Darold wrote:
> On 10/12/2020 at 15:45, Gilles Darold wrote:
>> On 08/12/2020 at 18:52, Andrey Borodin wrote:
>>> Hi Gilles!
>>>
>>> Many thanks for your message!
>>>
>>>> On 8 Dec 2020, at 21:05, Gilles Darold <gilles(at)darold(dot)net> wrote:
>>>>
>>>> I know that this report is not really helpful
>>> Quite the contrary: these benchmarks prove that a controllable reproduction exists. I've rebased the patches for PG11. Can you please benchmark them (without extending SLRU)?
>>>
>>> Best regards, Andrey Borodin.
>>>
>> Hi,
>>
>>
>> Running tests yesterday with the patches produced a lot of failures,
>> with this error on INSERT and UPDATE statements:
>>
>>
>> ERROR:  lock MultiXactOffsetControlLock is not held
>>
>>
>> After a patch review this morning I think I have found what's going
>> wrong. In patch
>> v6-0001-Use-shared-lock-in-GetMultiXactIdMembers-for-offs.patch I
>> think there is a missing reinitialisation of the lockmode variable to
>> LW_NONE inside the retry loop after the call to LWLockRelease() in
>> src/backend/access/transam/multixact.c:1392:GetMultiXactIdMembers().
>> I've attached a new version of the patch for master that includes the
>> fix I'm now using with PG11, and with it everything works very well.
>>
>>
>> I'm running more tests to see the performance impact of playing
>> with multixact_offsets_slru_buffers, multixact_members_slru_buffers
>> and multixact_local_cache_entries. I will report the results later
>> today.
>>
>
> Hi,
>
> Sorry for the delay, I have done some further tests to try to reach
> the limit without bottlenecks on multixact or shared buffers. The
> tests were done on a Microsoft Azure machine with 2TB of RAM and 4
> sockets of Intel Xeon Platinum 8280M (128 CPUs). PG configuration:
>
>     max_connections = 4096
>     shared_buffers = 64GB
>     max_prepared_transactions = 2048
>     work_mem = 256MB
>     maintenance_work_mem = 2GB
>     wal_level = minimal
>     synchronous_commit = off
>     commit_delay = 1000
>     commit_siblings = 10
>     checkpoint_timeout = 1h
>     max_wal_size = 32GB
>     checkpoint_completion_target = 0.9
>
> I have tested several values for the different buffer variables,
> starting from:
>
>     multixact_offsets_slru_buffers = 64
>     multixact_members_slru_buffers = 128
>     multixact_local_cache_entries = 256
>
> up to the values that gave the best performance in this test, chosen to
> avoid waits on MultiXactOffsetControlLock or MultiXactMemberControlLock:
>
>     multixact_offsets_slru_buffers = 128
>     multixact_members_slru_buffers = 512
>     multixact_local_cache_entries = 1024
>
> Also, shared_buffers was increased up to 256GB to avoid
> buffer_mapping contention.
>
> Our last best test reports the following wait events:
>
>      event_type |           event            |    sum
>     ------------+----------------------------+-----------
>      Client     | ClientRead                 | 321690211
>      LWLock     | buffer_content             |   2970016
>      IPC        | ProcArrayGroupUpdate       |   2317388
>      LWLock     | ProcArrayLock              |   1445828
>      LWLock     | WALWriteLock               |   1187606
>      LWLock     | SubtransControlLock        |    972889
>      Lock       | transactionid              |    840560
>      Lock       | relation                   |    587600
>      Activity   | LogicalLauncherMain        |    529599
>      Activity   | AutoVacuumMain             |    528097
>
> At this stage I don't think we can get better performance by tuning
> these buffers, at least with PG11.
>
> As for the performance gain from the shared-lock patch for
> GetMultiXactIdMembers, unfortunately I cannot see a difference with or
> without it; this could be related to our particular benchmark.
> But clearly the patch on multixact buffers should be committed, as it
> is really helpful to be able to tune PG when multixact bottlenecks
> are found.

I've reviewed these patches further.

1) As reported in my previous message, patch 0001 looks useless, as it
brings no measurable performance gain.

2) In patch 0004 there are two typos; s/informaion/information/ will fix
them.

3) The GUCs are missing from the postgresql.conf.sample file; see the
attached patch for a proposal.
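
For illustration only, the entries for point 3 could look roughly like
the sketch below. The attached patch is not reproduced here, so the
layout, the comments and the values are assumptions: the values are
simply the starting points from the benchmark quoted above, not the
defaults chosen by the patch.

    # Hypothetical sketch, not the content of the attached patch.
    #multixact_offsets_slru_buffers = 64    # assumed: SLRU buffers for MultiXact offsets
    #multixact_members_slru_buffers = 128   # assumed: SLRU buffers for MultiXact members
    #multixact_local_cache_entries = 256    # assumed: entries in the backend-local MultiXact cache

In postgresql.conf.sample a commented-out setting normally shows the
built-in default, so a real patch would list the defaults there rather
than these benchmark values.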

Best regards,

--
Gilles Darold
LzLabs GmbH
https://www.lzlabs.com/

Attachment: postgresql_conf_multixact_buffers_GUCs.patch (text/x-patch, 819 bytes)
