Re: generic plans and "initial" pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: generic plans and "initial" pruning
Date: 2025-01-31 08:31:44
Message-ID: CA+HiwqFA5hUWYktt3VMh4zQOYMxqH-MpdX8eemfM+o-9dY-zbQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 23, 2025 at 4:15 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> I’ve rebased over recent changes to setrefs.c (commit bf826ea0629).
> During the rebase, I realized that the patch
> 0002-Initialize-PartitionPruneContexts-lazily wasn’t a good idea after
> all.
>
> The test case added by bf826ea0629 highlighted an issue: initializing
> pruning expressions lazily during execution could leave the
> Append/MergeAppend node’s PlanState.subPlan uninitialized at
> ExecInitNode() time. Initially, I thought this would have only
> cosmetic consequences -- such as changes in test case output where
> SubPlans referenced in "exec" pruning expressions wouldn’t appear --
> but I may have underestimated the problem. As a result, I’ve abandoned
> that approach and the patch in favor of initializing all pruning
> expressions during plan initialization.
>
> Additionally, I revisited the impact of the main patch on
> ExecutorStart_hooks. It seems better to change the return type from
> void to bool, returning the result of
> ExecPlanStillValid(queryDesc->estate). This change has the added
> benefit of breaking extensions that use ExecutorStart_hook at compile
> time, encouraging authors to update their code. The updated commit
> message includes details on additional checks extensions must
> implement, particularly for cases where they might access pruned and
> thus unlocked relations.
>
> I've stared at the refactoring patches 0001 and 0002 for long enough
> at this point that I'd like to commit them early next week, barring
> further comments or objections. I'll keep staring at 0003.

I have now pushed 0001 and 0002.

I broke 0003 into two patches:

Patch to track unpruned relations in the executor, allowing the
overhead of processing pruned partitions to be skipped during plan
initialization. This is particularly relevant for top-level nodes such
as ModifyTable and LockRows, which -- unlike Append / MergeAppend --
do not ignore initially pruned partitions. Since initial pruning is
now performed separately from plan initialization and earlier in
InitPlan(), we can fix this by checking whether a given child result
relation or RowMark belongs to a pruned partition and skipping it.

Patch to defer locking of prunable relations from GetCachedPlan() to
InitPlan(), preventing partitions pruned by initial pruning from being
locked unnecessarily.

With the attached 0001, I can see that saving the overhead of
initializing ResultRelInfos for pruned partitions in
ExecInitModifyTable() results in a noticeable speedup for pgbench
-Mprepared with partitions, especially at higher partition counts
where the overhead is more significant. The numbers I have here are a
bit noisy, but they provide a general idea of the performance benefit
of skipping initially pruned partitions during plan initialization.

Setup:

plan_cache_mode = force_generic_plan
max_locks_per_transaction = 1000

for i in 100 200 500 1000 2000; do
echo -ne "$i\t"
pgbench -i --partitions=$i > /dev/null 2>&1;
pgbench -n -Mprepared -T 10 | grep tps;
done

With master:
100 tps = 2837.095192 (without initial connection time)
200 tps = 2614.143255 (without initial connection time)
500 tps = 1960.666074 (without initial connection time)
1000 tps = 1390.691229 (without initial connection time)
2000 tps = 884.882656 (without initial connection time)

With 0001:
100 tps = 2889.600827 (without initial connection time)
200 tps = 2720.895632 (without initial connection time)
500 tps = 2096.177756 (without initial connection time)
1000 tps = 1659.265873 (without initial connection time)
2000 tps = 1148.976177 (without initial connection time)

With 0002:
100 tps = 3070.137629 (without initial connection time)
200 tps = 4589.336857 (without initial connection time)
500 tps = 2977.339119 (without initial connection time)
1000 tps = 2885.417560 (without initial connection time)
2000 tps = 3832.111167 (without initial connection time)

--
Thanks, Amit Langote

Attachment Content-Type Size
v61-0001-Track-unpruned-relids-to-avoid-processing-pruned.patch application/octet-stream 40.4 KB
v61-0002-Defer-locking-of-runtime-prunable-relations-in-c.patch application/octet-stream 87.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message Nazir Bilal Yavuz 2025-01-31 08:29:31 Re: Show WAL write and fsync stats in pg_stat_io