Re: Memory consumed by paths during partitionwise join planning

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Memory consumed by paths during partitionwise join planning
Date: 2024-09-19 11:12:14
Message-ID: CAExHW5svb4B161GYJD3cGAx1oFV0ny7nJdr_QEGLZzxsSXaZVw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 19, 2024 at 4:18 PM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
>
> On 6/2/2024 13:51, Ashutosh Bapat wrote:
> > On Fri, Dec 15, 2023 at 5:22 AM Ashutosh Bapat
> > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> >>>
> >>> That looks pretty small considering the benefits. What do you think?
> >>>
> >>> [1] https://www.postgresql.org/message-id/CAExHW5stmOUobE55pMt83r8UxvfCph+Pvo5dNpdrVCsBgXEzDQ@mail.gmail.com
> >>
> >> If you want to experiment, please use attached patches. There's a fix
> >> for segfault during initdb in them. The patches are still raw.
> >
> > First patch is no longer required. Here's rebased set
> >
> > The patches are raw. make check has some crashes that I need to fix. I
> > am waiting to hear whether this is useful and whether the design is on
> > the right track.
> I think this work is promising, especially in the scope of partitioning.
> I've analysed how it works by basing these patches on the current
> master. You propose freeing unused paths after the end of the standard
> path search.
> In my view, upper paths generally consume less memory than join and scan
> paths. This is primarily due to the limited options provided by Postgres
> so far.
> At the same time, this technique (while highly useful in general) adds
> fragility and increases complexity: a developer needs to remember to
> link the path using the pointer in different places of the code.
> So, maybe go the alternative way? Invent a subquery memory context and
> store all the path allocations there. It can be freed after setrefs
> finishes this subquery planning without pulling up this subquery.
>

Using a memory context per subquery won't help with partitioning, right?
If the outermost query has a partitioned table with thousands of
partitions, it will still accumulate those paths till the very end of
planning. We could instead use memory context/s to store all the paths
created, then copy the optimal paths into a new memory context at
strategic points and delete the old memory context. And repeat this
till we choose the final path and create a plan out of it; at that
point we could delete the memory context containing the remaining paths
as well. That will free the paths as soon as they are rendered
useless. I discussed this idea with Alvaro offline. We thought that
this approach needs some code to copy paths, and copying paths
recursively has some overhead of its own. The current work of adding a
reference count, OTOH, has the potential to bring discipline into the way
we handle paths. We need to avoid the risks posed by dangling pointers,
for which Alvaro suggested looking at the way we manage snapshots. But
I haven't had time to explore that idea yet.
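
To make the copy-and-discard alternative concrete, here is a rough,
standalone sketch. It is not PostgreSQL code: Arena, ToyPath,
copy_path() and plan_with_compaction() are names made up for this
illustration, and a simple bump allocator stands in for a memory
context. It only shows the shape of the idea: build all candidate
paths in one context, deep-copy the chosen path into a fresh context,
then drop the old context in one go.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a memory context (hypothetical, not the MemoryContext API). */
typedef struct Arena
{
    char   *buf;
    size_t  used;
    size_t  cap;
} Arena;

static Arena *
arena_create(size_t cap)
{
    Arena *a = malloc(sizeof(Arena));

    assert(a != NULL);
    a->buf = malloc(cap);
    assert(a->buf != NULL);
    a->used = 0;
    a->cap = cap;
    return a;
}

static void *
arena_alloc(Arena *a, size_t n)
{
    void *p;

    n = (n + 15) & ~(size_t) 15;    /* keep every chunk 16-byte aligned */
    assert(a->used + n <= a->cap);
    p = a->buf + a->used;
    a->used += n;
    return p;
}

/* Frees every object allocated in the arena at once. */
static void
arena_destroy(Arena *a)
{
    free(a->buf);
    free(a);
}

/* Toy path node: a cost plus optional outer/inner subpaths. */
typedef struct ToyPath
{
    double          cost;
    struct ToyPath *outer;
    struct ToyPath *inner;
} ToyPath;

static ToyPath *
make_path(Arena *a, double cost, ToyPath *outer, ToyPath *inner)
{
    ToyPath *p = arena_alloc(a, sizeof(ToyPath));

    p->cost = cost;
    p->outer = outer;
    p->inner = inner;
    return p;
}

/*
 * Recursive deep copy into dst.  A subpath shared by several parents is
 * copied once per parent; that is part of the copying overhead.
 */
static ToyPath *
copy_path(Arena *dst, const ToyPath *src)
{
    if (src == NULL)
        return NULL;
    return make_path(dst, src->cost,
                     copy_path(dst, src->outer),
                     copy_path(dst, src->inner));
}

static size_t
count_nodes(const ToyPath *p)
{
    return p == NULL ? 0 : 1 + count_nodes(p->outer) + count_nodes(p->inner);
}

/* Build candidates, keep the cheaper join, discard everything else. */
static double
plan_with_compaction(size_t *surviving_nodes)
{
    Arena   *work = arena_create(4096);
    Arena   *keep = arena_create(4096);
    ToyPath *scan1 = make_path(work, 10.0, NULL, NULL);
    ToyPath *scan2 = make_path(work, 20.0, NULL, NULL);
    ToyPath *join_a = make_path(work, 35.0, scan1, scan2);
    ToyPath *join_b = make_path(work, 90.0, scan2, scan1);
    ToyPath *best = (join_a->cost <= join_b->cost) ? join_a : join_b;
    ToyPath *survivor = copy_path(keep, best);
    double   cost;

    arena_destroy(work);            /* all losing paths are gone here */

    cost = survivor->cost;
    *surviving_nodes = count_nodes(survivor);
    arena_destroy(keep);            /* a real planner would build the plan first */
    return cost;
}
```

Nothing in the surviving tree points back into the discarded arena,
so there are no dangling pointers by construction, but every
compaction pays for a full recursive copy of the kept paths.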

--
Best Wishes,
Ashutosh Bapat
