Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: David Christensen <david+pg(at)pgguru(dot)net>
Cc: Richard Guo <guofenglinux(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)
Date: 2024-06-05 07:47:56
Message-ID: CAExHW5tvE4DLYuVjWs-vRNACdk2b_5+SK8mpc9L+CDOAbXBdEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi David,
Thanks for looking into this.

On Fri, May 31, 2024 at 2:19 AM David Christensen <david+pg(at)pgguru(dot)net>
wrote:

> Hello,
>
> I am looking through some outstanding CommitFest entries; I wonder if
> this particular patch is already effectively fixed by 5278d0a2, which
> is both attributed to the original author as well as an extremely
> similar approach. Can this entry
> (https://commitfest.postgresql.org/48/4553/) be closed?
>

This is different. But it needs a rebase. PFA rebased patch. The revised
commit message explains the change better.

Here are numbers revised after 5278d0a2. Since the code changes only affect
partitionwise join code path, I am reporting only partition wise join
numbers. The first column reports the number of joining relations, each
having 1000 partitions. Rest of the column names are self-explanatory.

Planning memory used:
num_joins | master | patched | memory saved | memory saved
-----------+---------+---------+--------------+----------------------------
2 | 31 MB | 30 MB | 525 kB | 1.68%
3 | 111 MB | 107 MB | 4588 kB | 4.03%
4 | 339 MB | 321 MB | 17 MB | 5.13%
5 | 1062 MB | 967 MB | 95 MB | 8.96%

Here's planning time measurements.
num_joins | master (ms) | patched (ms) | change in planning time (ms) |
change in planning time
-----------+-------------+--------------+------------------------------+---------------------------------------
2 | 187.86 | 177.27 | 10.59 |
5.64%
3 | 721.81 | 758.80 | -36.99 |
-5.12%
4 | 2239.87 | 2236.19 | 3.68 |
0.16%
5 | 6830.86 | 7027.76 | -196.90 |
-2.88%
The patch calls find_appinfos_by_relids() only once (instead of twice),
calls bms_union() on child relids only once, allocates and deallocates
appinfos array only once saving some CPU cycles but spends some CPU cycles
in bms_free(). There's expected to be a net small saving in planning time.
But above numbers show planning time increases in some cases and decreases
in some cases. Repeating the experiment multiple times does not show a
stable increase or decrease. But the variations all being within noise
levels. We could conclude that the patch does not affect planning time
adversely.

The scripts I used originally [1] need a revision too since EXPLAIN MEMORY
output has changed. PFA the scripts. setup.sql creates the required
functions and tables. queries.sql EXPLAINs queries with different number of
joining relations collecting planning time and memory numbers in a table
(msmts). We need to change psql variable code to 'patched' or 'master' when
using master + patches or master branch respectively. Following queries are
used to report above measurements from msmt.

planning memory: select p.num_joins, pg_size_pretty(m.average * 1024)
"master", pg_size_pretty(p.average * 1024) as "patched",
pg_size_pretty((m.average - p.average) * 1024) "memory saved", ((m.average
- p.average) * 100/m.average)::numeric(42, 2) "memory saved in percentage"
from msmts p, msmts m where p.num_joins = m.num_joins and p.variant =
m.variant and p.code = 'patched' and m.code = 'master' and p.measurement =
m.measurement and p.measurement = 'planning memory' and p.variant = 'pwj'
order by num_joins;

planning time: select p.num_joins, m.average "master (ms)", p.average
"patched (ms)", m.average - p.average "change in planning time (ms)",
((m.average - p.
average) * 100/m.average)::numeric(42, 2) "change in planning time as
percentage" from msmts p, msmts m where p.num_joins = m.num_joins and
p.variant = m.var
iant and p.code = 'patched' and m.code = 'master' and p.measurement =
m.measurement and p.measurement = 'planning time' and p.variant = 'pwj'
order by p.vari
ant, num_joins;

[1]
https://www.postgresql.org/message-id/CAExHW5snUW7pD2RdtaBa1T_TqJYaY6W_YPVjWDrgSf33i-0uqA%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

Attachment Content-Type Size
setup.sql application/sql 3.6 KB
queries.sql application/sql 2.5 KB
0001-Avoid-repeated-computations-in-try_partitio-20240605.patch text/x-patch 5.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Karlsson 2024-06-05 08:19:01 Should we move the resowner field from JitContext to LLVMJitContext?
Previous Message Peter Eisentraut 2024-06-05 07:21:22 Re: Logical Replication of sequences