Re: [PoC] Reducing planning time when tables have many partitions

From: Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Zhang Mingli <zmlpostgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Date: 2025-01-07 06:56:53
Message-ID: CAJ2pMkZZHrhgQ5UV0y+STKqx7XVGzENMhL98UbKM-OArvK9dmA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello all,

On Fri, Dec 20, 2024 at 2:26 PM Yuya Watari <watari(dot)yuya(at)gmail(dot)com> wrote:
>
> Overall, v29 demonstrates a better balance between planning time and
> memory usage. There may still be room for further optimization of the
> iterator mechanism, but I believe this is a good step towards
> addressing previous concerns. I would appreciate any feedback or
> suggestions.

I was looking at the v29-0001 patch and noticed that it lacks indexes
for the child joinrels generated by add_child_join_rel_equivalences().
To address this, I have introduced an inverted index mechanism to
speed up lookups for these child joinrels. This email summarizes the
changes, experimental results, and comparisons to both older patches
(especially v19) and the master.

1. Inverted index approach

In my approach, child EquivalenceMembers are stored in RelOptInfo
during add_child_join_rel_equivalences(). The inverted indexes in v30
maintain a mapping from Relids to their corresponding RelOptInfos.
When retrieving child members, we take the union of these indexes to
find all child EquivalenceMembers whose em_relids overlap the given
Relids. For more details, please see the comments in
iterate_child_rel_equivalences().

2. Merging small changes

Many small changes had accumulated as individual commits in previous
versions. For easier review, I have merged them into fewer commits in
v30. A diff between v29 and v30 is attached for quick reference.

3. Experimental setup

In this email, I ran additional experiments using a new query called
Query C (attached). Query C highlights the performance issues found in
previous versions, especially v29. I also tested v19 (rebased) [1],
which was the last version before my new approach was introduced. Note
that the rebased v19 does NOT pass regression tests. I may have missed
something, but I have not investigated the issue in detail.

In the experiments, I tested three queries, A and B (from [2]), and
the new query C. The patch versions tested were:
* Master
* v19 (rebased, but fails regression tests)
* v29
* v30
* v30 w/o 0004 (to evaluate the effect of the iterator mechanism)

4. Memory Usage

Using EXPLAIN (MEMORY), I measured the memory usage. The results are
shown in Table 1 and the attached figure. Here, "n" is the number of
partitions per table, and "PWJ" stands for partition-wise join.

Table 1: Memory usage (MB)
-------------------------------------------------------------------------------
Query | n | PWJ | Master | v19 | v29 | v30 | v30 w/o 0004
-------------------------------------------------------------------------------
A | 1024 | OFF | 47.821 | 79.281 | 59.159 | 59.183 | 69.027
A | 1024 | ON | 123.347 | 154.807 | 134.685 | 134.708 | 144.553
B | 256 | OFF | 90.409 | 204.711 | 110.068 | 110.084 | 197.214
B | 256 | ON | 5198.579 | 5312.881 | 5218.238 | 5218.254 | 5305.383
C | 1024 | OFF | 36.854 | 44.356 | 37.999 | 38.022 | 38.022
C | 1024 | ON | 85.574 | 100.843 | 215.571 | 88.932 | 88.932
-------------------------------------------------------------------------------

Summary:
* v19 used more memory than the other versions (including v30 w/o
0004, where the iterator mechanism was removed, and excluding query C
in v29).
* v29 used excessive memory for query C when PWJ was enabled, but v30
reduced it significantly.
* v30 used less memory than v19.

5. Planning Time (Queries A, B, and C)

Tables 2, 4, and 6 show the absolute planning times, and tables 3, 5,
and 7 show the corresponding speedups (higher is better). Below is a
brief summary:
* v19 and v30 have nearly identical planning times for small and large sizes.
* v29 introduced a major regression in query C, which was fixed in v30.
* v30 showed some regression for large sizes in query B, but this was
not seen in v30 w/o 0004, indicating that this regression was due to
the iterator mechanism introduced in v29.

Table 2: Planning time for query A (ms)
----------------------------------------------------------
n | Master | v19 | v29 | v30 | v30 w/o 0004
----------------------------------------------------------
1 | 0.241 | 0.245 | 0.246 | 0.246 | 0.247
2 | 0.261 | 0.264 | 0.264 | 0.264 | 0.266
4 | 0.320 | 0.343 | 0.330 | 0.337 | 0.330
8 | 0.430 | 0.432 | 0.433 | 0.434 | 0.437
16 | 0.612 | 0.606 | 0.611 | 0.611 | 0.615
32 | 1.070 | 1.001 | 1.009 | 1.014 | 1.029
64 | 2.278 | 1.952 | 2.176 | 2.189 | 2.190
128 | 6.129 | 4.579 | 4.633 | 4.562 | 4.216
256 | 17.485 | 10.432 | 10.636 | 10.750 | 10.534
384 | 32.584 | 15.916 | 16.495 | 16.569 | 15.993
512 | 54.771 | 21.923 | 23.066 | 23.065 | 22.008
640 | 88.273 | 28.381 | 30.097 | 30.127 | 28.136
768 | 136.878 | 34.678 | 37.818 | 37.931 | 34.573
896 | 216.365 | 50.652 | 53.296 | 53.031 | 49.368
1024 | 293.751 | 49.036 | 55.981 | 55.623 | 48.279
----------------------------------------------------------

Table 3: Speedup of query A (higher is better)
------------------------------------------------
n | v19 | v29 | v30 | v30 w/o 0004
------------------------------------------------
1 | 98.3% | 97.7% | 97.8% | 97.6%
2 | 98.9% | 98.8% | 99.1% | 98.1%
4 | 93.5% | 96.9% | 95.2% | 97.2%
8 | 99.4% | 99.2% | 99.0% | 98.3%
16 | 101.1% | 100.2% | 100.2% | 99.6%
32 | 106.9% | 106.1% | 105.5% | 104.0%
64 | 116.7% | 104.7% | 104.1% | 104.0%
128 | 133.8% | 132.3% | 134.3% | 145.4%
256 | 167.6% | 164.4% | 162.7% | 166.0%
384 | 204.7% | 197.5% | 196.7% | 203.7%
512 | 249.8% | 237.4% | 237.5% | 248.9%
640 | 311.0% | 293.3% | 293.0% | 313.7%
768 | 394.7% | 361.9% | 360.9% | 395.9%
896 | 427.2% | 406.0% | 408.0% | 438.3%
1024 | 599.1% | 524.7% | 528.1% | 608.5%
------------------------------------------------

Table 4: Planning time for query B (ms)
-----------------------------------------------------------
n | Master | v19 | v29 | v30 | v30 w/o 0004
-----------------------------------------------------------
1 | 11.918 | 12.405 | 12.020 | 11.870 | 12.020
2 | 11.413 | 11.864 | 11.524 | 11.369 | 11.575
4 | 11.896 | 12.225 | 11.895 | 11.787 | 11.966
8 | 13.201 | 13.086 | 12.830 | 12.658 | 12.888
16 | 15.917 | 14.742 | 14.490 | 14.398 | 14.569
32 | 21.838 | 17.842 | 17.776 | 17.658 | 17.793
64 | 44.337 | 26.242 | 27.055 | 26.910 | 26.508
128 | 126.472 | 46.073 | 50.969 | 51.004 | 47.114
256 | 631.093 | 98.469 | 128.827 | 129.046 | 101.041
-----------------------------------------------------------

Table 5: Speedup of query B (higher is better)
-----------------------------------------------
n | v19 | v29 | v30 | v30 w/o 0004
-----------------------------------------------
1 | 96.1% | 99.1% | 100.4% | 99.1%
2 | 96.2% | 99.0% | 100.4% | 98.6%
4 | 97.3% | 100.0% | 100.9% | 99.4%
8 | 100.9% | 102.9% | 104.3% | 102.4%
16 | 108.0% | 109.8% | 110.5% | 109.3%
32 | 122.4% | 122.9% | 123.7% | 122.7%
64 | 169.0% | 163.9% | 164.8% | 167.3%
128 | 274.5% | 248.1% | 248.0% | 268.4%
256 | 640.9% | 489.9% | 489.0% | 624.6%
-----------------------------------------------

Table 6: Planning time for query C (ms)
-------------------------------------------------------------
n | Master | v19 | v29 | v30 | v30 w/o 0004
-------------------------------------------------------------
1 | 0.262 | 0.266 | 0.263 | 0.262 | 0.263
2 | 0.380 | 0.383 | 0.379 | 0.379 | 0.380
4 | 0.526 | 0.534 | 0.529 | 0.528 | 0.526
8 | 0.841 | 0.851 | 0.844 | 0.833 | 0.833
16 | 1.593 | 1.599 | 1.581 | 1.564 | 1.573
32 | 3.393 | 3.359 | 3.392 | 3.306 | 3.293
64 | 6.795 | 6.584 | 6.835 | 6.950 | 6.977
128 | 15.439 | 14.461 | 15.619 | 14.143 | 14.231
256 | 35.247 | 31.036 | 35.643 | 30.011 | 30.422
512 | 85.460 | 66.008 | 91.261 | 64.484 | 64.724
1024 | 331.060 | 151.319 | 338.119 | 147.063 | 146.964
-------------------------------------------------------------

Table 7: Speedup of query C (higher is better)
------------------------------------------------
n | v19 | v29 | v30 | v30 w/o 0004
------------------------------------------------
1 | 98.6% | 99.7% | 100.0% | 99.7%
2 | 99.1% | 100.2% | 100.2% | 99.8%
4 | 98.6% | 99.6% | 99.6% | 100.0%
8 | 98.8% | 99.6% | 100.9% | 100.9%
16 | 99.6% | 100.7% | 101.8% | 101.2%
32 | 101.0% | 100.0% | 102.6% | 103.0%
64 | 103.2% | 99.4% | 97.8% | 97.4%
128 | 106.8% | 98.8% | 109.2% | 108.5%
256 | 113.6% | 98.9% | 117.4% | 115.9%
512 | 129.5% | 93.6% | 132.5% | 132.0%
1024 | 218.8% | 97.9% | 225.1% | 225.3%
------------------------------------------------

6. Conclusions

For planning time, v30 performs as well as the older version, v19. For
memory usage, v30 still consumes some memory, but much less than v19.
v19 consumes more memory than v30 w/o 0004, where the memory reduction
mechanism is not present.

Overall, v30 offers a balanced approach to both planning time and
memory usage. I would greatly appreciate any feedback, reviews, or
further suggestions.

[1] https://www.postgresql.org/message-id/CAJ2pMkbsP4f4SvUx%2BGguQ1BaA8oVo4BfcLvy6--c3QqQcB8PAQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAJ2pMkYcKHFBD_OMUSVyhYSQU0-j9T6NZ0pL6pwbZsUCohWc7Q@mail.gmail.com

--
Best regards,
Yuya Watari

Attachment Content-Type Size
diff-v29-v30.txt text/plain 10.4 KB
figure.png image/png 266.8 KB
v30-0001-Speed-up-searches-for-child-EquivalenceMembers.patch application/octet-stream 68.2 KB
v30-0002-Resolve-conflict-with-commit-66c0185.patch application/octet-stream 6.8 KB
v30-0003-Introduce-indexes-for-RestrictInfo.patch application/octet-stream 41.5 KB
v30-0004-Introduce-RestrictInfoIterator-to-reduce-memory-.patch application/octet-stream 17.1 KB
v19-rebased.txt text/plain 101.8 KB
create-tables-c.sql application/octet-stream 506 bytes
query-c.sql application/octet-stream 292 bytes

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2025-01-07 06:57:57 Re: Proposal: add new API to stringinfo
Previous Message Zhijie Hou (Fujitsu) 2025-01-07 06:40:50 RE: Conflict detection for update_deleted in logical replication