Re: Memory consumed by paths during partitionwise join planning

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Memory consumed by paths during partitionwise join planning
Date: 2024-09-19 12:16:44
Message-ID: 203efc9b-a7f4-474f-9ac1-c5db57f727c7@gmail.com
Lists: pgsql-hackers

On 19/9/2024 13:12, Ashutosh Bapat wrote:
> On Thu, Sep 19, 2024 at 4:18 PM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
>> At the same time, this technique (while highly useful in general) adds
>> fragility and increases complexity: a developer needs to remember to
>> link the path using the pointer in different places of the code.
>> So, maybe go the alternative way? Invent a subquery memory context and
>> store all the path allocations there. It can be freed after setrefs
>> finishes this subquery planning without pulling up this subquery.
>>
>
> Using memory context for subquery won't help with partitioning right?
> If the outermost query has a partitioned table with thousands of
> partitions, it will still accumulate those paths till the very end of
> planning.
I see. I just hadn't considered huge partitioned tables in the outermost query before.
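
For the record, what I had in mind earlier was roughly the following. It is only a sketch; the context name and the exact place inside subquery_planner() are my guesses, nothing tested:

    /*
     * Rough sketch: give each subquery its own context for path allocations.
     * Placement inside subquery_planner() is hypothetical.
     */
    MemoryContext path_cxt = AllocSetContextCreate(CurrentMemoryContext,
                                                   "subquery path context",
                                                   ALLOCSET_DEFAULT_SIZES);
    MemoryContext oldcxt = MemoryContextSwitchTo(path_cxt);

    /* ... join search runs here; every path is palloc'd in path_cxt ... */

    MemoryContextSwitchTo(oldcxt);

    /*
     * Once setrefs has finished with this subquery (and it was not pulled
     * up), all of its paths can be released at once.
     */
    MemoryContextDelete(path_cxt);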

> We could instead use memory context/s to store all the paths
> created, then copy the optimal paths into a new memory context at
> strategic points and blow up the old memory context. And repeat this
> till we choose the final path and create a plan out of it; at that
> point we could blow up the memory context containing remaining paths
> as well. That will free the paths as soon as they are rendered
> useless.
I think any scalable solution should be based on per-partition cleanup. For starters, why not adopt Tom's patch [1] for selectivity estimation? We would already see a benefit in the case of long lists of clauses.

> I discussed this idea with Alvaro offline. We thought that
> this approach needs some code to copy paths and then copying paths
> recursively has some overhead of itself.
That approach needs a path_tree_walker first. We discussed one before, but it went nowhere. Maybe design it up front and also use it in the re-parameterisation code?
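
I picture the usual walker pattern, in the spirit of expression_tree_walker. The shape below is only illustrative; a real version would have to cover every Path subtype, and the names are made up:

    typedef bool (*path_walker_fn) (Path *path, void *context);

    static bool
    path_tree_walker(Path *path, path_walker_fn walker, void *context)
    {
        if (path == NULL)
            return false;

        /* Visit the node itself first. */
        if (walker(path, context))
            return true;

        /* Then recurse into child paths. */
        switch (nodeTag(path))
        {
            case T_NestPath:
            case T_MergePath:
            case T_HashPath:
                {
                    JoinPath   *jpath = (JoinPath *) path;

                    if (path_tree_walker(jpath->outerjoinpath, walker, context) ||
                        path_tree_walker(jpath->innerjoinpath, walker, context))
                        return true;
                }
                break;
            case T_MaterialPath:
                if (path_tree_walker(((MaterialPath *) path)->subpath,
                                     walker, context))
                    return true;
                break;
            /* ... and so on for AppendPath, SortPath, AggPath, etc. ... */
            default:
                break;
        }
        return false;
    }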

> The current work of adding a
> reference count, OTOH has potential to bring discipline into the way
> we handle paths. We need to avoid risks posed by dangling pointers.
Both hands up for having reference counters: in extensions it is a constant pain to invent a way to safely remove a path that you want to replace with a custom one. As things stand, doing so looks too dangerous compared to the value of the possible gain.
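
For what it's worth, the kind of discipline I would love to rely on from extension code is roughly the following. The field Path.refcount and free_path_recursive() are made up here and are not necessarily what the actual patch does:

    /* Hypothetical refcount helpers, for illustration only. */
    static inline void
    path_incr_ref(Path *path)
    {
        path->refcount++;               /* assumes a new refcount field in Path */
    }

    static inline void
    path_decr_ref(Path *path)
    {
        Assert(path->refcount > 0);
        if (--path->refcount == 0)
            free_path_recursive(path);  /* hypothetical recursive free */
    }

    /* What an extension replacing a path with a custom one could then do safely: */
    void
    replace_with_custom_path(RelOptInfo *rel, Path *oldpath, CustomPath *cpath)
    {
        rel->pathlist = list_delete_ptr(rel->pathlist, oldpath);
        path_decr_ref(oldpath);         /* freed only once nothing else points at it */
        add_path(rel, (Path *) cpath);  /* add_path() would take its own reference */
    }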

> For which Alvaro suggested looking at the way we manage snapshots. But
> I didn't get time to explore that idea yet.
Unfortunately, I can't follow this idea without more explanation.

[1] Optimize planner memory consumption for huge arrays
https://www.postgresql.org/message-id/1367418.1708816059@sss.pgh.pa.us

--
regards, Andrei Lepikhov
