From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
---|---|
To: | Amit Langote <amitlangote09(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, tomas(at)vondra(dot)me, vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Richard Guo <guofenglinux(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning |
Date: | 2025-03-19 06:31:29 |
Message-ID: | CAExHW5uzJ7VLGzRcp9A+keATk8WBUg+8SggyUUYYS-EZZyLC8w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Mar 19, 2025 at 8:22 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> On Tue, Mar 18, 2025 at 8:48 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > On Tue, Mar 18, 2025 at 4:02 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > >
> > > I ran a couple of benchmarks of my own as follows.
> > >
> > > cat benchmark_amit.sh
> > > for p in 0 16 32 64 128 256 512 1024; do
> > > echo -ne "$p\t";
> > > pgbench -i --partitions=$p > /dev/null 2>&1
> > > pgbench -n -T 30 -f /tmp/query.sql | grep latency | awk '{print $4}';
> > > done
> > >
> > > For a 3-way join:
> > > cat /tmp/query.sql
> > > select * from pgbench_accounts t1, pgbench_accounts t2,
> > > pgbench_accounts t3 where t2.aid = t1.aid AND t3.aid = t2.aid;
> > >
> > > nparts master patched %change
> > > 0 35.508 36.066 1.571
> > > 16 66.79 67.704 1.368
> > > 32 67.774 68.179 0.598
> > > 64 51.023 50.471 -1.082
> > > 128 56.4 55.759 -1.137
> > > 256 70.134 68.401 -2.471
> > > 512 120.621 113.552 -5.861
> > > 1024 405.339 312.726 -22.848
> > >
> > > For a 6-way jon
> > > cat /tmp/query.sql
> > > select * from pgbench_accounts t1, pgbench_accounts t2,
> > > pgbench_accounts t3, pgbench_accounts t4, pgbench_accounts t5,
> > > pgbench_accounts t6 where t2.aid = t1.aid AND t3.aid = t2.aid and
> > > t4.aid = t3.aid and t5.aid = t4.aid and t6.aid = t5.aid;
> > >
> > > nparts master patched %change
> > > 0 66.144 64.932 -1.832
> > > 16 100.874 100.491 -0.380
> > > 32 104.645 104.536 -0.104
> > > 64 114.415 109.193 -4.564
> > > 128 145.422 130.458 -10.290
> > > 256 273.761 209.919 -23.320
> > > 512 1359.896 616.295 -54.681
> > > 1024 7183.765 2857.086 -60.229
> > >
> > > -60% means 60% reduction in latency due to the patch.
> > >
> > > As others have already found, performance improvements become
> > > substantial as both partition count and join depth increase. The patch
> > > seems to have minimal impact at low partition counts and low join
> > > complexity -- base cases (e.g., 0–32 partitions, 3-way joins) are
> > > essentially unchanged, which is good to see.
> >
> > I assume these are with enable_partitionwise_join = off since it's not
> > enabled in the query.
>
> Yes, those were with pwj=off. FTR, numbers I get with pwj=on.
>
> 3-way:
>
> nparts master patched %change
> 0 38.407 34.675 -9.717
> 16 69.357 64.312 -7.274
> 32 70.027 67.079 -4.210
> 64 73.807 70.725 -4.176
> 128 83.875 81.945 -2.301
> 256 102.858 106.19 3.239
> 512 181.95 180.891 -0.582
> 1024 464.503 440.355 -5.199
>
> 6-way:
>
> nparts master patched %change
> 0 64.411 67.175 4.291
> 16 203.344 209.75 3.150
> 32 296.952 300.966 1.352
> 64 445.536 449.917 0.983
> 128 805.103 781.892 -2.883
> 256 1695.746 1574.346 -7.159
> 512 4743.16 4010.196 -15.453
> 1024 16772.454 12284.706 -26.757
>
> So a bit less impressive than the improvements for pwj=off. Also,
> patch seems to make things worse for low partition counts (0-16) for
> 6-way joins, which I am not quite sure is within the noise range.
> Have you noticed that too and, if yes, do you know what might be
> causing it?
I have observed similar things with lower numbers of partitions. In my
observation, such results have coincided with relatively large
variance. With 0 partitions there is no difference in the code
behaviour irrespective of pwj ON or Off. Hence we don't expect any
variation in the numbers you posted yesterday and today when nparts =
0. But there it is. I have always observed that much variation in
planning time for one combination or the other.
With 6 -way join there will be 6 * 5 - 5 derived clauses in one
equivalence class, That's close to 32, which is the threshold to start
using a hash table. So some slight perturbation is expected around
that threshold. But given that it's lower than 32, that shouldn't
apply here.
--
Best Wishes,
Ashutosh Bapat
From | Date | Subject | |
---|---|---|---|
Next Message | Steven Niu | 2025-03-19 06:34:51 | Add missing PQclear for StreamLogicalLog function |
Previous Message | Shubham Khanna | 2025-03-19 06:14:21 | Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided. |