Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, tomas(at)vondra(dot)me, vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Richard Guo <guofenglinux(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
Date: 2025-02-04 10:37:33
Message-ID: CAExHW5so5dZA_Dw0V9NCgFOw=xTyfA0rH8yi27hRmjjhft33tQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 31, 2025 at 5:41 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2024-Nov-25, Ashutosh Bapat wrote:
>
> > Hmm, I am doing something similar to what you are doing. Here are my
> > scripts. setup.sql - creates partitioned table, and functions, tables
> > used to run the benchmark benchmark.sh - creates queries with all
> > combinations of enable_partitionwise_join, number of partitions, joins
> > etc. and runs EXPLAIN on each query recording the results in a table.
>
> I was curious about this still being alive and uncommitted, so I
> wondered if Dmitry was on to something about this patch not improving
> things. So I tried to rerun Ashutosh's benchmark (of course, on a build
> with no C assertions, otherwise the numbers are meaningless). First,
> the patches still apply to current master. Turns on that they improve
> planning times not insignificantly in the majority of cases (not all).
> This is on my laptop and I didn't do anything in particular to keep it
> stable, though.
>
> I don't know about this modern googol sheets thing you're talking about,
> so I use \crosstabview like my cavemen forefathers did. After running
> benchmark.sh

Thanks for suggesting \crosstabview. The summary table looks good. But
it doesn't tell the difference (absolute or %), so the reader has to
do the maths themselves. Maybe I could improve the query itself to
report the difference.

> (note: it needs "-X" on the psql lines, otherwise it gets
> all confused by funny stuff in my .psqlrc) to get the SQL to run, I
> obtain a summary table with this psql query:

Will incorporate -X

>
> select code_tag, num_parts, num_joins,
> format ('n=%s avg=%s dev=%s', count(planning_time_ms) filter (where pwj),
> (avg(planning_time_ms) filter (where pwj))::numeric(6, 2),
> (stddev(planning_time_ms) filter (where pwj))::numeric(6,2))
> from msmts where code_tag = 'master'
> group by code_tag, num_parts, num_joins
> order by 1, 2, 3 \crosstabview 2 3 4
>
> Here's the tables I got.
>
> PWJ on, master:
> num_parts │ 2 │ 3 │ 4 │ 5
> ───────────┼───────────────────────────┼────────────────────────────┼─────────────────────────────┼─────────────────────────────
> 0 │ n=10 avg=0.05 dev=0.00 │ n=10 avg=0.16 dev=0.04 │ n=10 avg=0.38 dev=0.10 │ n=10 avg=0.95 dev=0.14
> 10 │ n=10 avg=0.41 dev=0.02 │ n=10 avg=1.25 dev=0.09 │ n=10 avg=4.93 dev=0.30 │ n=10 avg=12.12 dev=0.69
> 100 │ n=10 avg=4.68 dev=0.40 │ n=10 avg=21.34 dev=1.04 │ n=10 avg=65.09 dev=1.91 │ n=10 avg=206.87 dev=3.87
> 500 │ n=10 avg=55.11 dev=1.43 │ n=10 avg=240.97 dev=7.90 │ n=10 avg=834.72 dev=35.67 │ n=10 avg=2534.78 dev=107.28
> 1000 │ n=10 avg=242.40 dev=21.09 │ n=10 avg=1085.65 dev=38.11 │ n=10 avg=3161.00 dev=151.04 │ n=10 avg=9634.34 dev=635.57
>
> PWJ on, all patches:
> num_parts │ 2 │ 3 │ 4 │ 5
> ───────────┼──────────────────────────┼───────────────────────────┼─────────────────────────────┼─────────────────────────────
> 0 │ n=10 avg=0.05 dev=0.00 │ n=10 avg=0.12 dev=0.01 │ n=10 avg=0.34 dev=0.01 │ n=10 avg=0.91 dev=0.02
> 10 │ n=10 avg=0.37 dev=0.01 │ n=10 avg=1.17 dev=0.07 │ n=10 avg=4.09 dev=0.25 │ n=10 avg=10.31 dev=0.38
> 100 │ n=10 avg=4.62 dev=0.14 │ n=10 avg=17.17 dev=0.45 │ n=10 avg=54.05 dev=0.98 │ n=10 avg=178.05 dev=2.69
> 500 │ n=10 avg=61.32 dev=1.91 │ n=10 avg=229.54 dev=15.82 │ n=10 avg=701.33 dev=34.16 │ n=10 avg=2176.00 dev=84.28
> 1000 │ n=10 avg=195.74 dev=5.73 │ n=10 avg=789.49 dev=16.44 │ n=10 avg=2786.55 dev=254.03 │ n=10 avg=9177.05 dev=467.33
>
>
> PWJ off, master:
> num_parts │ 2 │ 3 │ 4 │ 5
> ───────────┼──────────────────────────┼───────────────────────────┼────────────────────────────┼─────────────────────────────
> 0 │ n=10 avg=0.06 dev=0.02 │ n=10 avg=0.16 dev=0.04 │ n=10 avg=0.39 dev=0.07 │ n=10 avg=1.08 dev=0.17
> 10 │ n=10 avg=0.27 dev=0.03 │ n=10 avg=0.54 dev=0.01 │ n=10 avg=1.05 dev=0.03 │ n=10 avg=2.09 dev=0.07
> 100 │ n=10 avg=5.17 dev=2.45 │ n=10 avg=8.96 dev=0.14 │ n=10 avg=17.25 dev=0.29 │ n=10 avg=36.11 dev=1.06
> 500 │ n=10 avg=46.82 dev=1.84 │ n=10 avg=149.06 dev=2.79 │ n=10 avg=396.95 dev=26.15 │ n=10 avg=912.93 dev=31.78
> 1000 │ n=10 avg=219.86 dev=5.21 │ n=10 avg=697.27 dev=14.96 │ n=10 avg=1925.81 dev=65.78 │ n=10 avg=4857.81 dev=248.71
>
>
> PWJ off, allpatches:
> num_parts │ 2 │ 3 │ 4 │ 5
> ───────────┼───────────────────────────┼───────────────────────────┼────────────────────────────┼─────────────────────────────
> 0 │ n=10 avg=0.06 dev=0.01 │ n=10 avg=0.13 dev=0.02 │ n=10 avg=0.34 dev=0.02 │ n=10 avg=0.95 dev=0.06
> 10 │ n=10 avg=0.25 dev=0.02 │ n=10 avg=0.52 dev=0.01 │ n=10 avg=0.96 dev=0.01 │ n=10 avg=1.86 dev=0.01
> 100 │ n=10 avg=5.43 dev=2.37 │ n=10 avg=7.30 dev=0.16 │ n=10 avg=12.93 dev=0.35 │ n=10 avg=24.56 dev=0.49
> 500 │ n=10 avg=50.10 dev=2.35 │ n=10 avg=156.04 dev=11.05 │ n=10 avg=332.48 dev=17.44 │ n=10 avg=711.22 dev=21.44
> 1000 │ n=10 avg=174.02 dev=15.26 │ n=10 avg=567.23 dev=8.81 │ n=10 avg=1480.75 dev=45.04 │ n=10 avg=3578.19 dev=240.20
>
>
> So it looks to me like for high number of partitions and joins, this
> wins hands down in terms of planning time. For some of the 2/3 joins
> and 100/500 partitions, it loses.
>

The combination for which the planning time regresses is not fixed -
it shifts every time I run the benchmark. But I see regression with
one or the other combination. So I haven't been able to decide whether
it's a real regression or not. Planning time for a small number of
joins vary a lot from run to run.

>
> As for planner memory, which this was supposed to improve, I don't find
> any significant improvement, except for 1000 partitions and 5 joins
> (where it goes from 1071991 to 874480 kilobytes); IMO it's not worth
> framing this improvement from that point of view, because it doesn't
> seem compelling, at least to me.

If we are not interested in saving memory, there is a simpler way to
improve planning time by adding a hash table per equivalence class to
store the derived clauses, instead of a linked list, when the number
of derived clauses is higher than a threshold (say 32 same as the
threshold for join_rel_list. Maybe that approach will yield stable
planning time.

--
Best Wishes,
Ashutosh Bapat

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Vladlen Popolitov 2025-02-04 10:46:10 Re: Make COPY format extendable: Extract COPY TO format implementations
Previous Message Yura Sokolov 2025-02-04 10:36:39 Re: sinvaladt.c: remove msgnumLock, use atomic operations on maxMsgNum