From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, tomas(at)vondra(dot)me, vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Richard Guo <guofenglinux(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning |
Date: | 2025-02-25 11:04:28 |
Message-ID: | CAExHW5scMxyFRqOFE6ODmBiW2rnVBEmeEcA-p4W_CyuEikURdA@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
On Thu, Feb 20, 2025 at 5:28 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Tue, Feb 4, 2025 at 4:07 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > If we are not interested in saving memory, there is a simpler way to
> > improve planning time: add a hash table per equivalence class to
> > store the derived clauses, instead of a linked list, when the number
> > of derived clauses is higher than a threshold (say 32, the same as
> > the threshold for join_rel_list). Maybe that approach will yield
> > stable planning time.
> >
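In outline, the quoted idea is to keep the cheap linear list for small
equivalence classes and lazily build a hash index over the derived clauses
once the list crosses a threshold. A minimal sketch of that shape (names,
keys, and the threshold constant are illustrative, not the actual patch;
the real code works on RestrictInfos in C):

```python
EC_DERIVES_HASH_THRESHOLD = 32  # mirrors the join_rel_list threshold


class EquivalenceClass:
    """Toy model: derived clauses keyed by an opaque hashable key."""

    def __init__(self):
        self.derives_list = []      # always maintained, preserves order
        self.derives_hash = None    # built lazily once the list grows large

    def add_derived_clause(self, key, clause):
        self.derives_list.append((key, clause))
        if self.derives_hash is not None:
            self.derives_hash[key] = clause
        elif len(self.derives_list) > EC_DERIVES_HASH_THRESHOLD:
            # List got long: build the hash table over existing clauses.
            self.derives_hash = dict(self.derives_list)

    def find_derived_clause(self, key):
        if self.derives_hash is not None:
            return self.derives_hash.get(key)       # O(1) lookup
        for k, clause in self.derives_list:         # linear scan while small
            if k == key:
                return clause
        return None
```

The point of the threshold is that building and probing a hash table has
fixed overhead that only pays off once the linear scans get long, which is
why small equivalence classes keep the plain list.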
I implemented the above idea in the attached patches. I also added the
following query, inspired by Alvaro's query, to summarise the results.
with master_avgs as
  (select code_tag, num_parts, num_joins, pwj,
          avg(planning_time_ms) avg_pt,
          stddev(planning_time_ms) stddev_pt
     from msmts
    where code_tag = 'master'
    group by code_tag, num_parts, num_joins, pwj),
patched_avgs as
  (select code_tag, num_parts, num_joins, pwj,
          avg(planning_time_ms) avg_pt,
          stddev(planning_time_ms) stddev_pt
     from msmts
    where code_tag = 'patched'
    group by code_tag, num_parts, num_joins, pwj)
select num_parts,
       num_joins,
       format('s=%s%% md=%s%% pd=%s%%',
              ((m.avg_pt - p.avg_pt) / m.avg_pt * 100)::numeric(6, 2),
              (m.stddev_pt / m.avg_pt * 100)::numeric(6, 2),
              (p.stddev_pt / p.avg_pt * 100)::numeric(6, 2))
  from master_avgs m
  join patched_avgs p using (num_parts, num_joins, pwj)
 where not pwj
 order by 1, 2, 3;
\crosstabview 1 2 3
"not pwj" in the last line should be changed to "pwj" to get results with
enable_partitionwise_join = true.
With the attached patches, I observe the following results.
With PWJ disabled
 num_parts |               2               |              3               |             4              |              5
-----------+-------------------------------+------------------------------+----------------------------+-----------------------------
         0 | s=-4.44% md=17.91% pd=23.05%  | s=-0.83% md=11.10% pd=19.58% | s=0.87% md=4.04% pd=7.91%  | s=-35.24% md=7.63% pd=9.69%
        10 | s=30.13% md=118.18% pd=37.44% | s=-3.49% md=0.58% pd=0.49%   | s=-0.83% md=0.29% pd=0.35% | s=-0.24% md=0.35% pd=0.32%
       100 | s=1.94% md=13.19% pd=4.08%    | s=-0.27% md=0.18% pd=0.44%   | s=7.04% md=3.05% pd=3.11%  | s=12.75% md=1.69% pd=0.81%
       500 | s=4.39% md=1.71% pd=1.33%     | s=10.17% md=1.28% pd=1.90%   | s=23.04% md=0.24% pd=0.58% | s=30.87% md=0.30% pd=1.11%
      1000 | s=4.27% md=1.21% pd=1.97%     | s=13.97% md=0.44% pd=0.79%   | s=24.05% md=0.63% pd=1.02% | s=30.77% md=0.77% pd=0.17%
Each cell is a triple (s, md, pd), where s is the improvement in planning
time with the patches as a percentage of master's planning time (higher is
better), md is the standard deviation as a % of the average planning time
on master, and pd is the standard deviation as a % of the average planning
time with the patches.
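For reference, the triple in each cell can be reproduced outside the
database; a small Python sketch of the same arithmetic as the SQL above
(the sample timings here are synthetic, not the measured data):

```python
from statistics import mean, stdev


def triple(master_times, patched_times):
    """Return (s, md, pd) as in the crosstab cells.

    s  = improvement of patched avg over master avg, in % (higher is better)
    md = master stddev as % of master avg (noise on master)
    pd = patched stddev as % of patched avg (noise with patches)
    """
    m_avg, p_avg = mean(master_times), mean(patched_times)
    s = (m_avg - p_avg) / m_avg * 100
    md = stdev(master_times) / m_avg * 100
    pd = stdev(patched_times) / p_avg * 100
    return s, md, pd


# Synthetic example: master ~100 ms, patched ~90 ms
s, md, pd = triple([100, 102, 98, 101], [90, 91, 89, 90])
```

This makes the caveat below concrete: when md or pd is of the same order
as s, the "improvement" is within run-to-run noise.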
With PWJ enabled
 num_parts |              2               |              3               |              4              |              5
-----------+------------------------------+------------------------------+-----------------------------+------------------------------
         0 | s=-94.25% md=6.98% pd=56.03% | s=44.10% md=141.13% pd=9.32% | s=42.71% md=46.00% pd=6.55% | s=-26.12% md=6.72% pd=15.20%
        10 | s=-25.89% md=4.29% pd=63.75% | s=-1.34% md=3.15% pd=3.26%   | s=0.31% md=4.13% pd=4.34%   | s=-1.34% md=3.10% pd=6.73%
       100 | s=-2.83% md=0.94% pd=1.31%   | s=-2.17% md=4.57% pd=4.41%   | s=0.98% md=1.59% pd=1.81%   | s=1.87% md=1.10% pd=0.79%
       500 | s=1.57% md=3.01% pd=1.70%    | s=6.99% md=1.58% pd=1.68%    | s=11.11% md=0.24% pd=0.62%  | s=11.65% md=0.18% pd=0.90%
      1000 | s=3.59% md=0.98% pd=1.78%    | s=10.83% md=0.88% pd=0.46%   | s=15.62% md=0.46% pd=0.13%  | s=16.38% md=0.63% pd=0.29%
The same numbers measured for the previous set of patches [1], which
improve both memory consumption and planning time:
With PWJ disabled
 num_parts |               2               |              3               |              4               |               5
-----------+-------------------------------+------------------------------+------------------------------+--------------------------------
         0 | s=4.68% md=18.17% pd=22.09%   | s=-2.54% md=12.00% pd=13.81% | s=-2.02% md=3.84% pd=4.43%   | s=-69.14% md=11.06% pd=126.87%
        10 | s=-24.85% md=20.42% pd=35.69% | s=-4.31% md=0.73% pd=1.53%   | s=-14.97% md=0.32% pd=31.90% | s=-0.57% md=0.79% pd=0.50%
       100 | s=0.27% md=4.69% pd=1.55%     | s=4.16% md=0.29% pd=0.18%    | s=11.76% md=0.85% pd=0.49%   | s=15.76% md=1.64% pd=2.32%
       500 | s=0.54% md=1.88% pd=1.81%     | s=9.36% md=1.17% pd=0.87%    | s=21.45% md=0.74% pd=0.88%   | s=30.47% md=0.17% pd=1.17%
      1000 | s=3.22% md=1.36% pd=0.99%     | s=14.74% md=0.86% pd=0.44%   | s=24.50% md=0.36% pd=0.31%   | s=27.97% md=0.27% pd=0.25%
With PWJ enabled
 num_parts |              2               |             3               |              4              |              5
-----------+------------------------------+-----------------------------+-----------------------------+-----------------------------
         0 | s=11.07% md=19.28% pd=8.70%  | s=-1.18% md=5.88% pd=4.31%  | s=-2.25% md=8.42% pd=3.77%  | s=25.07% md=11.48% pd=3.87%
        10 | s=-9.07% md=2.65% pd=14.58%  | s=0.55% md=3.10% pd=3.41%   | s=3.89% md=3.94% pd=3.79%   | s=7.25% md=2.87% pd=3.03%
       100 | s=-4.53% md=0.49% pd=8.53%   | s=2.24% md=4.24% pd=3.96%   | s=6.70% md=1.30% pd=2.08%   | s=9.09% md=1.39% pd=1.50%
       500 | s=-1.65% md=1.59% pd=1.44%   | s=6.31% md=0.89% pd=1.11%   | s=12.72% md=0.20% pd=0.29%  | s=15.02% md=0.28% pd=0.83%
      1000 | s=1.53% md=1.01% pd=1.66%    | s=11.80% md=0.66% pd=0.71%  | s=16.23% md=0.58% pd=0.18%  | s=17.16% md=0.67% pd=0.68%
There are a few things to notice:
1. There is not much difference in the planning time improvement between
the two patchsets. But the patchset attached to [1] improves memory
consumption as well, so it looks more attractive.
2. The performance regressions usually coincide with higher standard
deviation. This indicates that the performance gains or losses seen at
lower numbers of partitions and joins are not real and can probably be
ignored. I have run the script multiple times, but some combination of a
lower number of partitions and a lower number of joins always shows
higher deviation and thus unstable results. I have not been able to find
a way to make all the combinations show stable results.
I think the first patch in the attached set is worth committing, just
to tighten things up.
[1] https://www.postgresql.org/message-id/CAExHW5t6NpGaif6ZO_P+L1cPZk27+Ye2LxfqRVXf0OiSsW9WSg@mail.gmail.com
--
Best Wishes,
Ashutosh Bapat
Attachment | Content-Type | Size |
---|---|---|
0001-Add-Assert-in-find_derived_clause_for_ec_me-20250225.patch | text/x-patch | 1008 bytes |
0002-Make-EquivalenceClass-clause-lookup-faster-20250225.patch | text/x-patch | 23.5 KB |