From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: generic plans and "initial" pruning
Date: 2021-12-25 03:36:00
Message-ID: CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
Lists: pgsql-hackers

Executing generic plans involving partitions is known to become slower
as the partition count grows, due to a number of bottlenecks, with
AcquireExecutorLocks() showing up at the top of profiles.

A previous attempt at solving that problem was made by David Rowley
[1], who proposed delaying the locking of *all* partitions appearing
under an Append/MergeAppend until "initial" pruning is done during the
executor initialization phase. A problem with that approach, which he
described in [2], is that leaving partitions unlocked can lead to race
conditions: the Plan node belonging to a partition can be invalidated
if a concurrent session alters that partition in the window between
AcquireExecutorLocks() deciding that the plan is okay to execute and
the plan actually being executed.

However, using an idea that Robert suggested to me off-list a little
while back, it seems possible to determine the set of partitions that
we can safely skip locking. The idea is to perform the "initial" or
"pre-execution" pruning instructions contained in a given Append or
MergeAppend node while AcquireExecutorLocks() is collecting the
relations to lock, and to consider relations only from the sub-nodes
that survive that pruning. I've attempted to implement that idea in
the attached patch.
Note that "initial" pruning steps are now performed twice when
executing generic plans: once in AcquireExecutorLocks() to find
partitions to be locked, and a 2nd time in ExecInit[Merge]Append() to
determine the set of partition sub-nodes to be initialized for
execution, though I wasn't able to come up with a good idea to avoid
this duplication.

Using the following benchmark setup:

pgbench testdb -i --partitions=$nparts > /dev/null 2>&1
pgbench -n testdb -S -T 30 -Mprepared

and with plan_cache_mode = force_generic_plan, I get the following
numbers (the first column is the partition count, $nparts):

HEAD:
32 tps = 20561.776403 (without initial connection time)
64 tps = 12553.131423 (without initial connection time)
128 tps = 13330.365696 (without initial connection time)
256 tps = 8605.723120 (without initial connection time)
512 tps = 4435.951139 (without initial connection time)
1024 tps = 2346.902973 (without initial connection time)
2048 tps = 1334.680971 (without initial connection time)
Patched:
32 tps = 27554.156077 (without initial connection time)
64 tps = 27531.161310 (without initial connection time)
128 tps = 27138.305677 (without initial connection time)
256 tps = 25825.467724 (without initial connection time)
512 tps = 19864.386305 (without initial connection time)
1024 tps = 18742.668944 (without initial connection time)
2048 tps = 16312.412704 (without initial connection time)
--
Amit Langote
EDB: http://www.enterprisedb.com

Attachment: v1-0001-Teach-AcquireExecutorLocks-to-acquire-fewer-locks.patch (application/octet-stream, 62.1 KB)