Re: [PoC] Reducing planning time when tables have many partitions

From: Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Thom Brown <thom(at)linux(dot)com>, Zhang Mingli <zmlpostgres(at)gmail(dot)com>
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Date: 2025-03-14 10:09:38
Message-ID: CAJ2pMkaLi_2vCSrxwWjXvpGd_EExy4sQFL5-3BAwALRKhsviGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello David,

Thank you very much for your thorough review and valuable comments.

I have refactored the patches based on your feedback and attached the
updated versions (v34). Additionally, I have included a diff between
v33 and v34 for your quick reference.

On Thu, Mar 13, 2025 at 1:53 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> 1) Can you describe the difference between
> PlannerInfo.top_parent_relid_array and RelOptInfo.top_parent_relids?
> If you've added the PlannerInfo field for performance reasons, then
> that needs to be documented. I think the bar for adding another field
> to do the same thing should be quite high. The
> RelOptInfo.top_parent_relids field already is commented with
> "redundant, but handy", so having another field in another struct
> that's also redundant leads me to think that some design needs more
> thought.
>
> If you need a cheap way to take the same shortcut as you're doing in
> setup_eclass_child_member_iterator() with "if
> (root->top_parent_relid_array == NULL)", then maybe PlannerInfo should
> have a boolean field to record if there are any other member rels

Thank you for highlighting this. I initially introduced
PlannerInfo.top_parent_relid_array primarily for performance reasons
to quickly determine whether a relation is a parent or child,
particularly in setup_eclass_child_member_iterator(). As you
mentioned, earlier versions utilized the check "if
(root->top_parent_relid_array == NULL)" to skip unnecessary operations
when no child relations exist.

However, I have realized that the same behavior can be achieved by
using root->append_rel_array. Specifically, if a relation is a parent,
the corresponding AppendRelInfo is NULL, and if there are no child
relations at all, the entire array itself is NULL. So,
PlannerInfo.top_parent_relid_array is no longer necessary.

In v34-0001, I removed root->top_parent_relid_array and instead
utilized root->append_rel_array. However, this caused issues in
add_setop_child_rel_equivalences(), since this function adds a new
child EquivalenceMember without building a parent-child relationship
in root->append_rel_array. To address this, I have created a dummy
AppendRelInfo object in v34-0002. This is just a workaround, and there
may be a more elegant solution. I'd greatly appreciate any suggestions
or alternative approaches you might have.

> 2) I think the naming of setup_eclass_child_member_iterator() and
> dispose_eclass_child_member_iterator() is confusing. From the names,
> I'd expect these to only be returning em_is_child == true members, but
> that's not the case.

I agree the original naming was misleading. In v34-0001, I have
renamed these functions to
setup_eclass_all_member_iterator_for_relids() and
dispose_eclass_all_member_iterator_for_relids(). To align with this
change, I have also renamed EquivalenceChildMemberIterator to
EquivalenceAllMemberIterator. Does this new naming better address your
concern?

> 3) The header comment for setup_eclass_child_member_iterator() does
> not seem concise enough. It claims "so that it can iterate over
> EquivalenceMembers in 'ec'.", but what does that mean? The definition
> of "EquivalenceMembers in 'ec'" isn't clear. Is that just the class's
> ec_members, or also the child members that are stored somewhere else.
> Users of this function need to know what they'll get so they know
> which members they need to ignore or which they can assume won't be
> returned. If you don't document that, then it's quite hard to
> determine where the faulty code is when we get bugs. The "relids"
> parameter needs to be documented too.

I have clarified the header comment in v34-0001. It now explicitly
states that the iterator iterates over all parent members and child
members whose em_relids are subsets of the given 'relids'. I have also
clearly documented the parameters, including 'relids'.

> 4) add_transformed_child_version sounds like it does some
> transformation, but all it does is add the EMs for the given
> RelOptInfo to the iterator's list. I don't quite follow what's being
> "transformed". Maybe there's a better name?

Thank you for highlighting this. The original name was indeed
misleading. I have renamed this function to
add_eclass_child_members_to_iterator().

--
Best regards,
Yuya Watari

Attachment Content-Type Size
diff-v33-v34.txt text/plain 33.9 KB
v34-0001-Speed-up-searches-for-child-EquivalenceMembers.patch application/octet-stream 52.5 KB
v34-0002-Resolve-conflict-with-commit-66c0185.patch application/octet-stream 2.9 KB
v34-0003-Introduce-indexes-for-RestrictInfo.patch application/octet-stream 45.9 KB
v34-0004-Introduce-RestrictInfoIterator-to-reduce-memory-.patch application/octet-stream 17.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2025-03-14 10:22:42 Re: [PATCH] Fixed creation of empty .log files during log rotation
Previous Message Jakub Wartak 2025-03-14 10:05:28 Re: Draft for basic NUMA observability