From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Richard Guo <guofenglinux(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com> |
Subject: | Re: Memory consumed by child SpecialJoinInfo in partitionwise join planning |
Date: | 2024-03-22 09:23:42 |
Message-ID: | CA+HiwqEpMiDw=416uRqfmyXd-SeCFQm48jLkowxzXpDa7uGqYg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Ashutosh,
On Tue, Mar 19, 2024 at 12:47 AM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> On Mon, Mar 18, 2024 at 5:40 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>> >>
>> >> Sorry, I should’ve mentioned that I was interested in seeing cpu times to compare the two approaches. Specifically, to see if the palloc / frees add noticeable overhead.
>> >
>> > No problem. Here you go
>> >
>> > tables | master | patched | perc_change
>> > --------+----------+----------+-------------
>> > 2 | 477.87 | 492.32 | -3.02
>> > 3 | 1970.83 | 1989.52 | -0.95
>> > 4 | 6305.01 | 6268.81 | 0.57
>> > 5 | 19261.56 | 18684.86 | 2.99
>> >
>> > +ve change indicates reduced planning time. It seems that the planning time improves as the number of tables increases. But all the numbers are well within noise levels and thus may not show any improvement or deterioration. If you want, I can repeat the experiment.
>>
>> Thanks. I also wanted to see cpu time difference between the two
>> approaches of SpecialJoinInfo allocation -- on stack and on heap. So
>> comparing times with the previous version of the patch using stack
>> allocation and the new one with palloc. I'm hoping that the new
>> approach is only worse in the noise range.
>
>
> Ah, sorry, I didn't read it carefully. Alvaro made me realise that I have been gathering numbers using assert enabled builds, so they are not that reliable. Here they are with non-assert enabled builds.
>
> planning time (in ms) reported by EXPLAIN.
> tables | master | stack_alloc | perc_change_1 | palloc | perc_change_2 | total_perc_change
> --------+----------+-------------+---------------+----------+---------------+-------------------
> 2 | 338.1 | 333.92 | 1.24 | 332.16 | 0.53 | 1.76
> 3 | 1507.93 | 1475.76 | 2.13 | 1472.79 | 0.20 | 2.33
> 4 | 5099.45 | 4980.35 | 2.34 | 4947.3 | 0.66 | 2.98
> 5 | 15442.64 | 15531.94 | -0.58 | 15393.41 | 0.89 | 0.32
>
> stack_alloc = timings with SpecialJoinInfo on stack
> perc_change_1 = percentage change of above wrt master
> palloc = timings with palloc'ed SpecialJoinInfo
> perc_change_2 = percentage change of above wrt timings with stack_alloc
> total_perc_change = percentage change between master and all patches
>
> total change is within noise. Timing with palloc is better than that with SpecialJoinInfo on stack but only marginally. I don't think that means palloc based allocation is better or worse than stack based.
>
> You requested CPU time difference between stack based SpecialJoinInfo and palloc'ed SpecialJoinInfo. Here it is
> tables | stack_alloc | palloc | perc_change
> --------+-------------+----------+-------------
> 2 | 0.438204 | 0.438986 | -0.18
> 3 | 1.224672 | 1.238781 | -1.15
> 4 | 3.511317 | 3.663334 | -4.33
> 5 | 9.283856 | 9.340516 | -0.61
>
> Yes, there's a consistent degradation albeit within noise levels.
>
> The memory measurements
> tables | master | with_patch
> --------+--------+------------
> 2 | 26 MB | 26 MB
> 3 | 93 MB | 84 MB
> 4 | 277 MB | 235 MB
> 5 | 954 MB | 724 MB
>
> The first row shows the same value because of rounding. The actual values there are 27740432 and 26854704 resp.
>
> Please let me know if you need anything.
Thanks for repeating the benchmark. So I don't see a problem with
keeping the existing palloc() way of allocating the
SpecialJoinInfos(). We're adding a few cycles by adding
free_child_join_sjinfo() (see my delta patch about the rename), but
the reduction in memory consumption due to it, which is our goal here,
far outweighs what looks like a minor CPU slowdown.
I have squashed 0002, 0003 into 0001, done some changes of my own, and
attached the delta patch as 0002. I've listed the changes in the
commit message. Let me know if anything seems amiss.
I'd like to commit 0001 + 0002 + any other changes based on your
comments (if any) sometime early next week.
--
Thanks, Amit Langote
Attachment | Content-Type | Size |
---|---|---|
v1-0002-Some-fixes.patch | application/x-patch | 11.9 KB |
v1-0001-Reduce-memory-used-by-child-SpecialJoinInfo.patch | application/x-patch | 12.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2024-03-22 09:29:21 | Re: Introduce XID age and inactive timeout based replication slot invalidation |
Previous Message | Peter Eisentraut | 2024-03-22 09:18:33 | Re: Reducing output size of nodeToString |