From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, tomas(at)vondra(dot)me, vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Richard Guo <guofenglinux(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning |
Date: | 2025-03-14 12:06:25 |
Message-ID: | CA+HiwqE-A8argoCCXUS4YAyWc94zOGO2U6oCiseSjMp__1QRZg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Ashutosh,
On Tue, Feb 25, 2025 at 8:04 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> On Thu, Feb 20, 2025 at 5:28 PM Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > On Tue, Feb 4, 2025 at 4:07 PM Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > > If we are not interested in saving memory, there is a simpler way to
> > > improve planning time by adding a hash table per equivalence class to
> > > store the derived clauses, instead of a linked list, when the number
> > > of derived clauses is higher than a threshold (say 32 same as the
> > > threshold for join_rel_list. Maybe that approach will yield stable
> > > planning time.
>
> I implemented the above idea in attached patches. I also added the
> following query, inspired from Alvaro's query, to summarise the
> results.
> with master_avgs as
> (select code_tag, num_parts, num_joins, pwj, avg(planning_time_ms)
> avg_pt, stddev(planning_time_ms) stddev_pt
> from msmts where code_tag = 'master'
> group by code_tag, num_parts, num_joins, pwj),
> patched_avgs as
> (select code_tag, num_parts, num_joins, pwj, avg(planning_time_ms)
> avg_pt, stddev(planning_time_ms) stddev_pt
> from msmts where code_tag = 'patched'
> group by code_tag, num_parts, num_joins, pwj)
> select num_parts,
> num_joins,
> format('s=%s%% md=%s%% pd=%s%%',
> ((m.avg_pt - p.avg_pt)/m.avg_pt * 100)::numeric(6, 2),
> (m.stddev_pt/m.avg_pt * 100)::numeric(6, 2),
> (p.stddev_pt/p.avg_pt * 100)::numeric(6, 2))
> from master_avgs m join patched_avgs p using (num_parts, num_joins,
> pwj) where not pwj order by 1, 2, 3;
> \crosstabview 1 2 3
>
> not pwj in the last line should be changed to pwj to get results with
> enable_partitionwise_join = true.
>
> With the attached patches, I observe following results
>
> With PWJ disabled
> num_parts | 2 | 3
> | 4 | 5
> -----------+-------------------------------+------------------------------+----------------------------+-----------------------------
> 0 | s=-4.44% md=17.91% pd=23.05% | s=-0.83% md=11.10%
> pd=19.58% | s=0.87% md=4.04% pd=7.91% | s=-35.24% md=7.63% pd=9.69%
> 10 | s=30.13% md=118.18% pd=37.44% | s=-3.49% md=0.58%
> pd=0.49% | s=-0.83% md=0.29% pd=0.35% | s=-0.24% md=0.35% pd=0.32%
> 100 | s=1.94% md=13.19% pd=4.08% | s=-0.27% md=0.18%
> pd=0.44% | s=7.04% md=3.05% pd=3.11% | s=12.75% md=1.69% pd=0.81%
> 500 | s=4.39% md=1.71% pd=1.33% | s=10.17% md=1.28%
> pd=1.90% | s=23.04% md=0.24% pd=0.58% | s=30.87% md=0.30% pd=1.11%
> 1000 | s=4.27% md=1.21% pd=1.97% | s=13.97% md=0.44%
> pd=0.79% | s=24.05% md=0.63% pd=1.02% | s=30.77% md=0.77% pd=0.17%
>
>
> Each cell is a triple (s, md, pd) where s is improvement in planning
> time using the patches in % as compared to the master (higher the
> better), md = standard deviation as % of the average planning time on
> master, pd = is standard deviation as % of the average planning time
> with patches.
>
> With PWJ enabled
> num_parts | 2 | 3
> | 4 | 5
> -----------+------------------------------+------------------------------+-----------------------------+------------------------------
> 0 | s=-94.25% md=6.98% pd=56.03% | s=44.10% md=141.13%
> pd=9.32% | s=42.71% md=46.00% pd=6.55% | s=-26.12% md=6.72% pd=15.20%
> 10 | s=-25.89% md=4.29% pd=63.75% | s=-1.34% md=3.15% pd=3.26%
> | s=0.31% md=4.13% pd=4.34% | s=-1.34% md=3.10% pd=6.73%
> 100 | s=-2.83% md=0.94% pd=1.31% | s=-2.17% md=4.57% pd=4.41%
> | s=0.98% md=1.59% pd=1.81% | s=1.87% md=1.10% pd=0.79%
> 500 | s=1.57% md=3.01% pd=1.70% | s=6.99% md=1.58% pd=1.68%
> | s=11.11% md=0.24% pd=0.62% | s=11.65% md=0.18% pd=0.90%
> 1000 | s=3.59% md=0.98% pd=1.78% | s=10.83% md=0.88% pd=0.46%
> | s=15.62% md=0.46% pd=0.13% | s=16.38% md=0.63% pd=0.29%
>
> Same numbers measured for previous set of patches [1], which improves
> both memory consumption as well as planning time.
>
> With PWJ disabled
> num_parts | 2 | 3
> | 4 | 5
> -----------+-------------------------------+------------------------------+------------------------------+--------------------------------
> 0 | s=4.68% md=18.17% pd=22.09% | s=-2.54% md=12.00%
> pd=13.81% | s=-2.02% md=3.84% pd=4.43% | s=-69.14% md=11.06%
> pd=126.87%
> 10 | s=-24.85% md=20.42% pd=35.69% | s=-4.31% md=0.73%
> pd=1.53% | s=-14.97% md=0.32% pd=31.90% | s=-0.57% md=0.79% pd=0.50%
> 100 | s=0.27% md=4.69% pd=1.55% | s=4.16% md=0.29% pd=0.18%
> | s=11.76% md=0.85% pd=0.49% | s=15.76% md=1.64% pd=2.32%
> 500 | s=0.54% md=1.88% pd=1.81% | s=9.36% md=1.17% pd=0.87%
> | s=21.45% md=0.74% pd=0.88% | s=30.47% md=0.17% pd=1.17%
> 1000 | s=3.22% md=1.36% pd=0.99% | s=14.74% md=0.86%
> pd=0.44% | s=24.50% md=0.36% pd=0.31% | s=27.97% md=0.27% pd=0.25%
>
> With PWJ enabled
> num_parts | 2 | 3
> | 4 | 5
> -----------+-----------------------------+----------------------------+----------------------------+-----------------------------
> 0 | s=11.07% md=19.28% pd=8.70% | s=-1.18% md=5.88% pd=4.31%
> | s=-2.25% md=8.42% pd=3.77% | s=25.07% md=11.48% pd=3.87%
> 10 | s=-9.07% md=2.65% pd=14.58% | s=0.55% md=3.10% pd=3.41%
> | s=3.89% md=3.94% pd=3.79% | s=7.25% md=2.87% pd=3.03%
> 100 | s=-4.53% md=0.49% pd=8.53% | s=2.24% md=4.24% pd=3.96%
> | s=6.70% md=1.30% pd=2.08% | s=9.09% md=1.39% pd=1.50%
> 500 | s=-1.65% md=1.59% pd=1.44% | s=6.31% md=0.89% pd=1.11%
> | s=12.72% md=0.20% pd=0.29% | s=15.02% md=0.28% pd=0.83%
> 1000 | s=1.53% md=1.01% pd=1.66% | s=11.80% md=0.66% pd=0.71%
> | s=16.23% md=0.58% pd=0.18% | s=17.16% md=0.67% pd=0.68%
>
> There are a few things to notice
> 1. There is not much difference in the planning time improvement by
> both the patchsets. But the patchset attached to [1] improves memory
> consumption as well. So it looks more attractive.
> 2. The performance regressions usually coincide with higher standard
> deviation. This indicates that both the performance gains or losses
> seen at lower numbers of partitions and joins are not real and
> possibly ignored. I have run the script multiple times but some or
> other combination of lower number of partitions and lower number of
> joins shows higher deviation and thus unstable results. I have not be
> able to find a way where all the combinations show a stable result.
Thanks for the patch and the extensive benchmarking.
Would you please also share a simple, self-contained example that I
can use to reproduce and verify the performance improvements? It’s
helpful to first see the patch in action with a representative query,
and then refer to the more extensive benchmark results you've already
shared as needed. I'm also having a hard time telling from the scripts
which query was used to produce the numbers in the report.
Btw, in the commit message, you mention:
===
When there are thousands of partitions in a partitioned table, there
can be thousands of derived clauses in the list making it inefficient
for a lookup.
===
I haven’t found a description of how that many clauses end up in the
ec->ec_derived list. IIUC, it's create_join_clause() where the child
clauses get added, and it would be helpful to mention that, since that
also appears to be the hotspot your patch is addressing.
> I think the first patch in the attached set is worth committing, just
> to tighten things up.
I agree and am happy to commit it if there are no objections.
--
Thanks, Amit Langote
From | Date | Subject | |
---|---|---|---|
Next Message | Bertrand Drouvot | 2025-03-14 12:08:46 | Re: Draft for basic NUMA observability |
Previous Message | Álvaro Herrera | 2025-03-14 11:16:43 | Re: bogus error message for ALTER TABLE ALTER CONSTRAINT |