Re: Partition-wise join for join between (declaratively) partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-01 08:56:29
Message-ID: CAFjFpRfTHJ5AHd0Ydb+DAsybhcT+2ViiuzKy+yKtOZArQ4eN+g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> 2. If the PartitionJoinPath emerges as the best path, we create paths
> for each of the remaining child-joins. Then we collect paths with
> properties same as the given PartitionJoinPath, one from each
> child-join. These paths are converted into plans and a Merge/Append
> plan is created combing these plans. The paths and plans for
> child-join are created in a temporary memory context. The final plan
> for each child-join is copied into planner's context and the temporary
> memory context is reset.
>

Robert and I discussed this in more detail. Path creation code may
allocate objects other than paths. postgres_fdw, for example,
allocates character array to hold the name of relation being
pushed-down. When the temporary context gets zapped after creating
paths for a given child-join, those other objects also gets thrown
away. Attached patch has implemented the idea that came out of the
discussion.

We create a memory context for holding paths at the time of creating
PlannerGlobal and save it in PlannerGlobal. The patch introduces a new
macro makePathNode() which allocates the memory for given type of path
from this context. Every create_*_path function has been changed to
use this macro instead of makeNode(). In standard_planner(), at the
end of planning we destroy the memory context freeing all the paths
allocated. While creating a plan node, planner copies everything
required by the plan from the path, so the path is not needed any
more. So, freeing corresponding memory should not have any adverse
effects.

Most of the create_*_path() functions accept root as an argument, thus
the temporary path context is available through root->glob everywhere.
An exception is create_append_path() which does not accept root as an
argument. The patch changes create_append_path() and its callers like
set_dummy_rel_pathlist(), mark_dummy_rel() to accept root as an
argument. Ideally paths are not required after creating plan, so we
should be
able to free the context right after the call to create_plan(). But we
need dummy paths while creating flat rtable in
set_plan_references()->add_rtes_to_flat_rtable(). We used to So free
the path context at the end of planning cycle. Now that we are
allocating all the paths in a different memory context, it doesn't
make sense to switch context in mark_dummy_rel().

0001 patch implements the idea described above.
0002 patch adds instrumentation to measure memory consumed in
standard_planner() call.
0003 patch adds a GUC zap_paths to enable/disable destroying path context.
The last two patches are for testing only.

Attached also find the SQL script and its output showing the memory
saved. For a 5 way self-join of pg_class, the total memory consumed in
standard_planner() is 760K without patch and with patch it comes down
to 713K, saving 47K memory otherwise occupied by paths. It looks like
something useful even without partition-wise joins.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
0001-Free-up-memory-consumed-by-the-paths.patch application/octet-stream 27.0 KB
0002-Patch-to-measure-memory-used-in-CurrentMemoryContext.patch application/octet-stream 3.1 KB
0003-GUC-zap_path-to-enable-freeing-memory-consumed-by-pa.patch application/octet-stream 2.9 KB
path_memory.out application/octet-stream 3.0 KB
path_memory.sql application/octet-stream 381 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo Nagata 2017-03-01 09:10:10 Re: [POC] hash partitioning
Previous Message Anastasia Lubennikova 2017-03-01 08:46:59 Re: WIP: Covering + unique indexes.