From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, tomas(at)vondra(dot)me, vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Richard Guo <guofenglinux(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning |
Date: | 2025-03-19 02:52:13 |
Message-ID: | CA+HiwqEnCuXBz3sWap8EZnfgv8Z9tSGq+mhPztYsSy73A6khkg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Mar 18, 2025 at 8:48 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> On Tue, Mar 18, 2025 at 4:02 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> >
> > I ran a couple of benchmarks of my own as follows.
> >
> > cat benchmark_amit.sh
> > for p in 0 16 32 64 128 256 512 1024; do
> > echo -ne "$p\t";
> > pgbench -i --partitions=$p > /dev/null 2>&1
> > pgbench -n -T 30 -f /tmp/query.sql | grep latency | awk '{print $4}';
> > done
> >
> > For a 3-way join:
> > cat /tmp/query.sql
> > select * from pgbench_accounts t1, pgbench_accounts t2,
> > pgbench_accounts t3 where t2.aid = t1.aid AND t3.aid = t2.aid;
> >
> > nparts master patched %change
> > 0 35.508 36.066 1.571
> > 16 66.79 67.704 1.368
> > 32 67.774 68.179 0.598
> > 64 51.023 50.471 -1.082
> > 128 56.4 55.759 -1.137
> > 256 70.134 68.401 -2.471
> > 512 120.621 113.552 -5.861
> > 1024 405.339 312.726 -22.848
> >
> > For a 6-way jon
> > cat /tmp/query.sql
> > select * from pgbench_accounts t1, pgbench_accounts t2,
> > pgbench_accounts t3, pgbench_accounts t4, pgbench_accounts t5,
> > pgbench_accounts t6 where t2.aid = t1.aid AND t3.aid = t2.aid and
> > t4.aid = t3.aid and t5.aid = t4.aid and t6.aid = t5.aid;
> >
> > nparts master patched %change
> > 0 66.144 64.932 -1.832
> > 16 100.874 100.491 -0.380
> > 32 104.645 104.536 -0.104
> > 64 114.415 109.193 -4.564
> > 128 145.422 130.458 -10.290
> > 256 273.761 209.919 -23.320
> > 512 1359.896 616.295 -54.681
> > 1024 7183.765 2857.086 -60.229
> >
> > -60% means 60% reduction in latency due to the patch.
> >
> > As others have already found, performance improvements become
> > substantial as both partition count and join depth increase. The patch
> > seems to have minimal impact at low partition counts and low join
> > complexity -- base cases (e.g., 0–32 partitions, 3-way joins) are
> > essentially unchanged, which is good to see.
>
> I assume these are with enable_partitionwise_join = off since it's not
> enabled in the query.
Yes, those were with pwj=off. FTR, numbers I get with pwj=on.
3-way:
nparts master patched %change
0 38.407 34.675 -9.717
16 69.357 64.312 -7.274
32 70.027 67.079 -4.210
64 73.807 70.725 -4.176
128 83.875 81.945 -2.301
256 102.858 106.19 3.239
512 181.95 180.891 -0.582
1024 464.503 440.355 -5.199
6-way:
nparts master patched %change
0 64.411 67.175 4.291
16 203.344 209.75 3.150
32 296.952 300.966 1.352
64 445.536 449.917 0.983
128 805.103 781.892 -2.883
256 1695.746 1574.346 -7.159
512 4743.16 4010.196 -15.453
1024 16772.454 12284.706 -26.757
So a bit less impressive than the improvements for pwj=off. Also,
patch seems to make things worse for low partition counts (0-16) for
6-way joins, which I am not quite sure is within the noise range.
Have you noticed that too and, if yes, do you know what might be
causing it?
> > I haven’t measured memory increase from the patch, but let me know if
> > that has already been evaluated and shown not to be a showstopper.
>
> There's a slight increase in memory consumption because of the hash
> table but it's very minimal.
>
> Here are memory numbers in kb (presented in the same format as before)
> with pwj disabled
> num_parts | 2 | 3 |
> 4 | 5
> -----------+--------------------------+--------------------------+---------------------------+-----------------------------
> 0 | s=0 md=15 pd=15 | s=0 md=21 pd=21 | s=0
> md=27 pd=27 | s=0 md=33 pd=33
> 10 | s=0 md=218 pd=218 | s=-20 md=455 pd=475 |
> s=-20 md=868 pd=888 | s=-20 md=1697 pd=1717
> 100 | s=-20 md=1824 pd=1844 | s=-80 md=3718 pd=3798 |
> s=-160 md=6400 pd=6560 | s=-320 md=10233 pd=10553
> 500 | s=-160 md=9395 pd=9555 | s=-320 md=20216 pd=20536 |
> s=-640 md=35735 pd=36375 | s=-1280 md=60808 pd=62088
> 1000 | s=-320 md=19862 pd=20182 | s=-640 md=45739 pd=46379 |
> s=-1280 md=84210 pd=85490 | s=-2561 md=149740 pd=152301
>
> each column is s=difference in memory consumed, md = memory consumed
> without patch, pd = memory consumed with patch. -ve difference shows
> increase in memory consumption.
>
> with pwj enabled
> num_parts | 2 | 3 |
> 4 | 5
> -----------+--------------------------+----------------------------+-----------------------------+-------------------------------
> 0 | s=0 md=15 pd=15 | s=0 md=21 pd=21 |
> s=0 md=27 pd=27 | s=0 md=33 pd=33
> 10 | s=0 md=365 pd=365 | s=-20 md=1198 pd=1218 |
> s=-20 md=3571 pd=3591 | s=-20 md=10426 pd=10446
> 100 | s=-21 md=3337 pd=3358 | s=-80 md=11237 pd=11317 |
> s=-160 md=33845 pd=34005 | s=-320 md=99502 pd=99822
> 500 | s=-160 md=17206 pd=17366 | s=-320 md=60096 pd=60416 |
> s=-640 md=183306 pd=183946 | s=-1280 md=556705 pd=557985
> 1000 | s=-320 md=36119 pd=36439 | s=-640 md=131457 pd=132097 |
> s=-1280 md=404809 pd=406089 | s=-2561 md=1263664 pd=1266225
>
> %wise this is 3-4% maximum.
Ok, thanks for those. Looks within acceptable range to me.
--
Thanks, Amit Langote
From | Date | Subject | |
---|---|---|---|
Next Message | Sutou Kouhei | 2025-03-19 02:56:17 | Re: Make COPY format extendable: Extract COPY TO format implementations |
Previous Message | Bruce Momjian | 2025-03-19 02:46:43 | Re: Release freeze April 8 |