From: | Yuya Watari <watari(dot)yuya(at)gmail(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Making empty Bitmapsets always be NULL |
Date: | 2023-06-12 12:31:47 |
Message-ID: | CAJ2pMkYcKHFBD_OMUSVyhYSQU0-j9T6NZ0pL6pwbZsUCohWc7Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello,
On Tue, Mar 7, 2023 at 1:07 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I adjusted the patch to remove the invariant checks and fixed up a
> couple of things I'd missed. The 0002 patch changes the for loops
> into do while loops. I wanted to see if we could see any performance
> gains from doing this.
In March, I reported that David's patch caused a degradation in
planning performance. I have investigated this issue further and found
some bugs in the patch. Due to these bugs, Bitmapset operations in the
original patch computed incorrect results. This incorrect computation
resulted in unexpected behavior, which I observed as performance
degradation. After fixing the bugs, David's patch showed significant
performance improvements. In particular, it is worth noting that the
patch obtained a good speedup even when most Bitmapsets have only one
word.
1.1. Wrong truncation that we should not do (fixed in v2-0003)
The first bug is in bms_difference() and bms_del_members(). At the end
of these functions, the original patch truncated the result Bitmapset
when lastnonzero was -1. However, we must not do this when the result
Bitmapset is longer than the other. In such a case, the last word of
the result was still non-zero, so we cannot shorten nwords. I fixed
this bug in v2-0003.
1.2. Missing truncation that we should do (fixed in v2-0004)
The other bug is in bms_del_member(). As seen from v2-0004-*.patch,
the original patch missed the necessary truncation. I also fixed this
bug.
2. Experiments
I conducted experiments to evaluate the performance of David's patch
with bug fixes. In the experiments, I used two queries attached to
this email. The first query, Query A (query-a.sql), joins three tables
and performs an aggregation. This is quite a simple query. The second
query, Query B (query-b.sql), is more complicated because it joins
eight tables. In both queries, every table is split into n partitions.
I issued these queries with varying n and measured their planning
times. The following tables and attached figure show the results.
Table 1: Planning time and its speedup of Query A
(n: the number of partitions of each table)
(Speedup: higher is better)
---------------------------------------------
n | Master (ms) | Patched (ms) | Speedup
---------------------------------------------
1 | 0.722 | 0.682 | 105.8%
2 | 0.779 | 0.774 | 100.6%
4 | 0.977 | 0.958 | 101.9%
8 | 1.286 | 1.287 | 99.9%
16 | 1.993 | 1.986 | 100.4%
32 | 3.967 | 3.900 | 101.7%
64 | 7.783 | 7.310 | 106.5%
128 | 23.369 | 19.722 | 118.5%
256 | 108.723 | 75.149 | 144.7%
384 | 265.576 | 167.354 | 158.7%
512 | 516.468 | 301.100 | 171.5%
640 | 883.167 | 494.960 | 178.4%
768 | 1423.839 | 755.201 | 188.5%
896 | 2195.935 | 1127.786 | 194.7%
1024 | 3041.131 | 1444.145 | 210.6%
---------------------------------------------
Table 2: Planning time and its speedup of Query B
--------------------------------------------
n | Master (ms) | Patched (ms) | Speedup
--------------------------------------------
1 | 36.038 | 35.455 | 101.6%
2 | 34.831 | 34.178 | 101.9%
4 | 36.537 | 35.998 | 101.5%
8 | 41.234 | 40.333 | 102.2%
16 | 52.427 | 50.596 | 103.6%
32 | 87.064 | 80.013 | 108.8%
64 | 228.050 | 187.762 | 121.5%
128 | 886.140 | 645.731 | 137.2%
256 | 4212.709 | 2853.072 | 147.7%
--------------------------------------------
You can quickly reproduce my experiments by the following commands.
== Query A ==
psql -f create-tables-a.sql
psql -f query-a.sql
=============
== Query B ==
psql -f create-tables-b.sql
psql -f query-b.sql
=============
The above results indicate that David's patch demonstrated outstanding
performance. The speedup reached 210.6% for Query A and 147.7% for
Query B. Even when n is small, the patch reduced planning time. The
main concern about this patch was overheads for Bitmapsets with only
one or two words. My experiments imply that such overheads are
non-existent or negligible because some performance improvements were
obtained even for small sizes.
The results of my experiments strongly support the effectiveness of
David's patch. I think this optimization is worth considering.
I am looking forward to your comments.
--
Best regards,
Yuya Watari
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Remove-trailing-zero-words-from-Bitmapsets.patch | application/octet-stream | 9.7 KB |
v2-0002-Use-do-while-loops-instead-of-for-loops.patch | application/octet-stream | 8.9 KB |
v2-0003-Fixed-a-bug-where-the-result-Bitmapset-was-incorr.patch | application/octet-stream | 1.6 KB |
v2-0004-Fix-a-bug-where-the-necessary-truncation-was-miss.patch | application/octet-stream | 2.2 KB |
figure.png | image/png | 185.4 KB |
create-tables-a.sql | application/octet-stream | 1.7 KB |
query-a.sql | application/octet-stream | 307 bytes |
create-tables-b.sql | application/octet-stream | 915 bytes |
query-b.sql | application/octet-stream | 491 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Dunstan | 2023-06-12 12:46:28 | Re: Do we want a hashset type? |
Previous Message | Pavel Borisov | 2023-06-12 12:23:14 | Re: Let's make PostgreSQL multi-threaded |