From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Speed up TransactionIdIsCurrentTransactionId() with lots of subxacts |
Date: | 2024-10-10 21:21:12 |
Message-ID: | ddd3aea3-4211-4229-88c3-cc552b3e1f86@iki.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Motivation
----------
While performance testing my CSN patch, I wrote a test case that looked
something like this:
CREATE TABLE tbl(id int);
INSERT INTO tbl SELECT g FROM generate_series(1, 100000) g;
BEGIN;
SAVEPOINT sp0;
DELETE * FROM tbl WHERE id % 1000 = 0;
SAVEPOINT sp1;
DELETE * FROM tbl WHERE id % 1000 = 1;
SAVEPOINT sp2;
DELETE * FROM tbl WHERE id % 1000 = 2;
...
SAVEPOINT sp999;
DELETE FROM tbl WHERE id % 1000 = 999;
That felt slow. I knew that having lots of subtransactions is not
optimal, but it felt *very* slow, and got progressively slower towards
the end. 'perf' says that all the time is spent in
TransactionIdIsCurrentTranactoinId:
> + 98.52% 98.45% postgres postgres [.] TransactionIdIsCurrentTransactionId
In this scenario with lots of active subtransactions,
TransactionIdIsCurrentTranactionId effectively degenerates into a linear
search over a linked list, as it traverses each level in the
CurrentTransactionState stack.
Committed subtransactions are held in the 'childXids' array on each
nesting level, so I can easily add "RELEASE SAVEPOINT" to each iteration
to make it fast.
But looking at how we track the 'childXids' in xact.c, I think there's
an easy optimization to be made here:
The patch
---------
Instead of having a separate childXids array on each transaction level,
we can maintain one large array of XIDs that includes the XIDs of all
committed and in-progress subtransactions. On each nesting level,
remember the offset within that array, so that all XIDs belonging to
that nesting level or its parents are above that offset. When a
subtransaction commits, you don't need to copy XIDs to the parent, you
just adjust the parent's offset into the array, to include the child's XIDs.
With one large array, TransactionIdIsCurrentTransactionId() can perform
one binary search over the array, regerdless of how many nesting levels
are active.
Patch attached, as well as a little shell script to produce that test
case. To use, pipe it to psql. The patch makes the test case about 10x
faster, but since this is a question of O(n) vs O(log(n)), you can make
it arbitrarily bad by changing the number of savepoints.
--
Heikki Linnakangas
Neon (https://neon.tech)
Attachment | Content-Type | Size |
---|---|---|
0001-Speed-up-TransactionIdIsCurrentTransactionId-with-lo.patch | text/x-patch | 16.4 KB |
savepoints-speed.sh | application/x-shellscript | 306 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2024-10-10 21:45:19 | Re: sunsetting md5 password support |
Previous Message | Mikael Sand | 2024-10-10 21:07:13 | Re: Build issue with postgresql 17 undefined reference to `pg_encoding_to_char' and `pg_char_to_encoding' |