From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
---|---|
To: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
Cc: | tomas(at)vondra(dot)me, vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Richard Guo <guofenglinux(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Subject: | Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning |
Date: | 2024-11-25 05:50:05 |
Message-ID: | CAExHW5vnwgTgfsCiNM7E4TnkxD1b_ZHPafNe1f041u=o131PYg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Nov 24, 2024 at 7:00 PM Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
>
> > On Thu, Oct 10, 2024 at 05:36:10PM GMT, Ashutosh Bapat wrote:
> >
> > 3. With just patch 0001 applied, planning time usually shows
> > degradation (column Q and X in planning time sheets) with or without
> > PWJ enabled. I first thought that it might be because of the increased
> > size of PlannerInfo. We had seen a similar phenomenon when adding a
> > new member to WindowAggState [2]. Hence I introduced patch 0002 which
> > moves two fields around to not increase the size of structure. But
> > that doesn't fix the regression in the planning time (columns R and
> > Y). Apart from increasing the PlannerInfo size and may be object file
> > size, 0002 does not have any other impact. But the regression seen
> > with just that patch is more than what we saw in [2]. More investigate
> > is required to decide whether this regression is real or not and if
> > real, the root cause. Looking at the numbers, it seems that this
> > regression is causing the planning time regression in rest of the
> > patches. If we fix regression by 0001, we should not see much
> > regression in rest of the patches. I am looking for some guidance in
> > investigating this regression.
>
> Hi,
>
> I've tried to reproduce some subset of those results, in case if I would
> be able to notice anything useful. Strangely enough, I wasn't able to
> get much boost in planning time e.g. with 4 first patches, 100
> partitions and 5 joins -- the results you've posted are showing about
> 16% in that case, where I'm getting only a couple of percents. Probably
> I'm doing something differently, but it's turned out to be hard to
> reconstruct (based only on this thread) how did you exactly benchmark
> the patch -- could you maybe summarize the benchmark in a reproducible
> way?
Thanks for your interest in this patch.
> As a side note, may I ask to attach benchmark results as files, rather than a
> link to a google spreadsheet? It feels nice to have something static to work
> with.
I maintain separate GDoc files for every email I send. The numbers
don't change in that sheet. But I have attached the same sheet to this
email.
Thanks for running banc
>
> From what I understand you were testing againt an empty partitioned table. Here
> is what I was doing:
>
> create table t1p (c1 int) partition by list(c1);
> select format('create table %I partition of t1p for values in (%s)',
> 't1p' || i, i) from generate_series(1, 100) i; \gexec
>
> do $x$
> declare
> i record;
> plan float[];
> plan_line text;
> begin
> for i in select * from generate_series(1, 1000) i loop
> for plan_line in execute format($y$
> explain analyze
> select * from t1p t1, t1p t2, t1p t3, t1p t4, t1p t5
> where t2.c1 =
> t1.c1 and t3.c1 = t2.c1 and t4.c1 = t3.c1 and t5.c1 = t4.c1
> $y$) loop
> if plan_line like '%Planning Time%' then
> plan := array_append(plan, substring(plan_line from '\d+.\d+')::float);
> end if;
> end loop;
> end loop;
>
> -- skip the first record as prewarming
> raise warning 'avg: %',
> (select avg(v) from unnest(plan[2:]) v);
> raise warning 'median: %',
> (select percentile_cont(0.5) within group(order by v)
> from unnest(plan[2:]) v);
> end;
> $x$;
>
Hmm, I am doing something similar to what you are doing. Here are my scripts.
setup.sql - creates partitioned table, and functions, tables used to
run the benchmark
benchmark.sh - creates queries with all combinations of
enable_partitionwise_join, number of partitions, joins etc. and runs
EXPLAIN on each query recording the results in a table.
run_bm_on_commits.sh - runs setup.sql once, then runs benchmark.sh on
each commit (using git rebase) and finally outputs the average numbers
to be fed to the "aggregate numbers" sheet.
The SQL script should be portable but the shell scripts might need
changes per your environment. E.g it assumes an alias
$ alias bs
alias bs='$BinDir/pg_ctl start -D $DataDir'
--
Best Wishes,
Ashutosh Bapat
Attachment | Content-Type | Size |
---|---|---|
PWJ memory consumption with RestrictInfo fixes - 8 Oct 2024.xlsx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | 311.5 KB |
benchmark.sh | application/x-shellscript | 862 bytes |
setup.sql | application/sql | 3.8 KB |
run_bm_on_commits.sh | application/x-shellscript | 980 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Sutou Kouhei | 2024-11-25 06:01:50 | Re: Make COPY format extendable: Extract COPY TO format implementations |
Previous Message | Ashutosh Bapat | 2024-11-25 04:54:47 | Re: Enhancing Memory Context Statistics Reporting |