From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, "Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Speed up transaction completion faster after many relations are accessed in a transaction
Date: 2022-01-01 01:44:46
Message-ID: CAApHDvr=Xx8j05g3Op9chdtft4PhAu6Quzy8BFQO5ORpxVSngw@mail.gmail.com
Lists: pgsql-hackers
On Fri, 3 Dec 2021 at 20:36, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> Two months later, this has been switched to RwF.
I was discussing this patch with Andres. He's not very keen on my
densehash hash table idea and suggested that, instead of trying to
make the hash table iteration faster, we ditch the lossiness of the
resource owner code, which only records the first 16 locks, and have
it keep a linked list of all locks instead. This is a little more
along the lines of the original patch; however, it does not increase
the size of the LOCALLOCK struct.
I've attached a patch which does this. It was mostly written
(quickly) by Andres; I just did a cleanup, fixed up a few mistakes,
and fixed a few bugs.
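For anyone who'd rather not open the patch, here's a rough sketch of
the shape of the idea. It's only an illustration I've thrown together,
not an excerpt from the patch (the added fields and helper names are
made up), but it shows the general technique: link each per-owner lock
reference into a dlist owned by the ResourceOwner so that releasing
locks never has to fall back to scanning the entire local lock hash
table.

/*
 * Illustration only; the added fields and function names are made up
 * and the attached patch is the authoritative version.
 */
#include "lib/ilist.h"

typedef struct LOCALLOCKOWNER
{
    struct ResourceOwnerData *owner;    /* owner holding this reference */
    int64       nLocks;                 /* # of times acquired by this owner */
    dlist_node  resownerLink;           /* links into owner->locks below */
    struct LOCALLOCK *locallock;        /* lock this entry refers to */
} LOCALLOCKOWNER;

typedef struct ResourceOwnerData
{
    /* ... existing fields ... */
    dlist_head  locks;                  /* every LOCALLOCKOWNER of this owner */
} ResourceOwnerData;

/* remember a lock: O(1), and never "overflows" like the 16-entry array */
static void
resowner_remember_lock(ResourceOwnerData *owner, LOCALLOCKOWNER *lockref)
{
    dlist_push_tail(&owner->locks, &lockref->resownerLink);
}

/* at owner release: walk only the locks this owner actually holds */
static void
resowner_release_locks(ResourceOwnerData *owner)
{
    dlist_mutable_iter iter;

    dlist_foreach_modify(iter, &owner->locks)
    {
        LOCALLOCKOWNER *lockref = dlist_container(LOCALLOCKOWNER,
                                                  resownerLink, iter.cur);

        dlist_delete(&lockref->resownerLink);
        /* ... release lockref->locallock here ... */
    }
}

The point is that on master, once the resource owner's 16-entry lock
array overflows, releasing locks means a hash_seq_search() over the
whole local lock hash table, however bloated it has become. With a
list there is no overflow case, and in this sketch LOCALLOCK itself
doesn't grow because the list node lives in the per-owner entry rather
than in LOCALLOCK.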
I ran the same performance tests on this patch as I did back in [1]:
-- Test 1. Select 1 record from a table with 140 partitions. Tests
creating a large number of locks with a fast query.
create table hp (a int, b int) partition by hash(a);
select 'create table hp'||x||' partition of hp for values with
(modulus 140, remainder ' || x || ');' from generate_series(0,139)x;
\gexec
create index on hp (b);
insert into hp select x,x from generate_series(1, 140000) x;
analyze hp;
select3.sql: select * from hp where b = 1
select3.sql master
drowley(at)amd3990x:~$ pgbench -n -f select3.sql -T 60 -M prepared postgres
tps = 2099.708748 (without initial connection time)
tps = 2100.398516 (without initial connection time)
tps = 2094.882341 (without initial connection time)
tps = 2113.218820 (without initial connection time)
tps = 2104.717597 (without initial connection time)
select3.sql patched
drowley(at)amd3990x:~$ pgbench -n -f select3.sql -T 60 -M prepared postgres
tps = 2010.070738 (without initial connection time)
tps = 1994.963606 (without initial connection time)
tps = 1994.668849 (without initial connection time)
tps = 1995.948168 (without initial connection time)
tps = 1985.650945 (without initial connection time)
You can see that there's a performance regression here. I've not yet
studied why that is.
-- Test 2. Tests a prepared query which builds a generic plan on the
6th execution and then falls back to a custom plan, since the custom
plan prunes all but one partition. Master suffers from the local lock
table becoming bloated after locking all partitions while planning
the generic plan. (A quick way to watch that switch happen is
sketched after the setup below.)
create table ht (a int primary key, b int, c int) partition by hash (a);
select 'create table ht' || x::text || ' partition of ht for values
with (modulus 8192, remainder ' || (x)::text || ');' from
generate_series(0,8191) x;
\gexec
select.sql:
\set p 1
select * from ht where a = :p
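Not part of the benchmark, but if you want to watch the
generic/custom switch happen, something like the following in psql
shows it (this assumes the default plan_cache_mode = auto):

prepare q (int) as select * from ht where a = $1;
execute q(1);  -- executions 1 to 5 use custom plans
execute q(1);
execute q(1);
execute q(1);
execute q(1);
explain (costs off) execute q(1);  -- 6th execution: a generic plan is built
                                   -- (locking all partitions) before the
                                   -- cheaper custom plan is picked again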
select.sql master
drowley(at)amd3990x:~$ pgbench -n -f select.sql -T 60 -M prepared postgres
tps = 18014.460090 (without initial connection time)
tps = 17973.358889 (without initial connection time)
tps = 17847.480647 (without initial connection time)
tps = 18038.332507 (without initial connection time)
tps = 17776.143206 (without initial connection time)
select.sql patched
drowley(at)amd3990x:~$ pgbench -n -f select.sql -T 60 -M prepared postgres
tps = 32393.457106 (without initial connection time)
tps = 32277.204349 (without initial connection time)
tps = 32160.719830 (without initial connection time)
tps = 32530.038130 (without initial connection time)
tps = 32299.019657 (without initial connection time)
You can see that there are some quite good performance gains with this test.
I'm going to add this to the January commitfest.
David
Attachment: speedup_releasing_all_locks.patch (application/octet-stream, 37.7 KB)