Quick Links

Re: [PoC] Reducing planning time when tables have many partitions

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Thom Brown <thom(at)linux(dot)com>, Zhang Mingli <zmlpostgres(at)gmail(dot)com>
Subject:	Re: [PoC] Reducing planning time when tables have many partitions
Date:	2025-03-19 10:48:40
Message-ID:	CAApHDvrnm-AJbB4rqTZ5HUmje-rX1+C=AZA08nvUHwY+fM7M6Q@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, 14 Mar 2025 at 23:10, Yuya Watari <watari(dot)yuya(at)gmail(dot)com> wrote:
> I have refactored the patches based on your feedback and attached the
> updated versions (v34). Additionally, I have included a diff between
> v33 and v34 for your quick reference.

Thanks for updating the patch. I've looked at v34-0001. Here are my
review comments:

1. There are various places in equivclass.c that Assert the member
isn't a child which have a comment "/* no children yet */", e.g.
generate_base_implied_equalities_const() there's
"Assert(!cur_em->em_is_child); /* no children yet */". The comment
implies that ec_members will later contain em_is_child members, but
that's not true. What you've written in
generate_implied_equalities_for_column() seems like the correct way to
handle this, i.e. "Child members should not exist in ec_members"

2. There's a comment in setup_eclass_all_member_iterator_for_relids which says:

* whose em_relids is a subset of the given 'child_relids'. The inverted

is 'child_relids' meant to be 'relids'? Otherwise, I don't know what
'child_relids' is.

3. Can you explain why you've added the following assert to
reconsider_full_join_clause()?

Assert(!bms_is_empty(coal_em->em_relids));

There are no other changes to that function and I can't quite figure
out why that Assert is relevant now if it wasn't before. Or is this a
case of adding additional protection against this? If so, is there
more risk now than there was before?

4. Is there any point in renaming add_eq_member() to
add_parent_eq_member()? You don't have a function called
add_child_eq_member() so is the "parent" word needed here?

5. In add_child_join_rel_equivalences() 'lc' can be moved into the
scope of the while loop.

6. The following comment in add_child_join_rel_equivalences() should
be deleted. That used to be above the "if (cur_em->em_is_child)
continue;" statement and it does not seem relevant to the code you've
replaced that with.

/*
* We consider only original EC members here, not
* already-transformed child members.
*/

7. EquivalenceAllMemberIterator's "modified" field does not seem like
a great name. Is it better to call this something like
"list_is_copy"?

8. It looks like most of the changes in createplan.c are there because
you need to get the PlannerInfo down to a few functions where it's
currently unavailable. I think you should break this out into another
precursor patch which does this only. It'll be easier to review the
main patch this way.

9. The new PlannerInfo parameter in prepare_sort_from_pathkeys() isn't
documented in the "Input Parameters:" header comment. Likewise for
make_sort_from_pathkeys() and make_incrementalsort_from_pathkeys()

10. The comment for PlannerInfo.eclass_indexes_array states it's for
"faster lookups of RestrictInfo". Shouldn't it be "faster
EquivalenceMember lookups"?

/*
* eclass_indexes_array is the same length as simple_rel_array and holds
* the indexes of the corresponding rels for faster lookups of
* RestrictInfo. See the EquivalenceClass comment for more details.
*/
struct EquivalenceClassIndexes *eclass_indexes_array
pg_node_attr(read_write_ignore);

I'm also not sure this comment would be accurate enough with only that
fix as it looks like we don't store any indexes for base rels.

11. In setup_simple_rel_arrays(), is there any point in pallocing
memory for eclass_indexes_array when root->append_rel_list is empty?

12. join_rel_list_index isn't being initialized in all the places that
do makeNode(RelOptInfo); Maybe -1 is a good initial value? (See my
next point)

13. RelOptInfo.join_rel_list_index is an index into
PlannerInfo.join_rel_list. It shouldn't be of type "Index". "int" is
the correct type for an index into a List.

14. In add_join_rel(), if you assign the join_rel_list_index before
the lappend, you don't need to "- 1".

> On Thu, Mar 13, 2025 at 1:53 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> >
> > 1) Can you describe the difference between
> > PlannerInfo.top_parent_relid_array and RelOptInfo.top_parent_relids?
> > If you've added the PlannerInfo field for performance reasons, then
> > that needs to be documented. I think the bar for adding another field
> > to do the same thing should be quite high. The
> > RelOptInfo.top_parent_relids field already is commented with
> > "redundant, but handy", so having another field in another struct
> > that's also redundant leads me to think that some design needs more
> > thought.
> >
> > If you need a cheap way to take the same shortcut as you're doing in
> > setup_eclass_child_member_iterator() with "if
> > (root->top_parent_relid_array == NULL)", then maybe PlannerInfo should
> > have a boolean field to record if there are any other member rels
>
> Thank you for highlighting this. I initially introduced
> PlannerInfo.top_parent_relid_array primarily for performance reasons
> to quickly determine whether a relation is a parent or child,
> particularly in setup_eclass_child_member_iterator(). As you
> mentioned, earlier versions utilized the check "if
> (root->top_parent_relid_array == NULL)" to skip unnecessary operations
> when no child relations exist.
>
> However, I have realized that the same behavior can be achieved by
> using root->append_rel_array. Specifically, if a relation is a parent,
> the corresponding AppendRelInfo is NULL, and if there are no child
> relations at all, the entire array itself is NULL. So,
> PlannerInfo.top_parent_relid_array is no longer necessary.
>
> In v34-0001, I removed root->top_parent_relid_array and instead
> utilized root->append_rel_array. However, this caused issues in
> add_setop_child_rel_equivalences(), since this function adds a new
> child EquivalenceMember without building a parent-child relationship
> in root->append_rel_array. To address this, I have created a dummy
> AppendRelInfo object in v34-0002. This is just a workaround, and there
> may be a more elegant solution. I'd greatly appreciate any suggestions
> or alternative approaches you might have.

I don't currently have the answers you need here. The problem is down
to how prepunion.c hacks together a targetlist with Vars having
varno==0. For the union planner work I did last year, to make the
union planner properly know if a union child had the required
PathKeys, I had to add EquivalenceMembers for each union child. At
the moment I don't really know if we could get away with classing
union children's non-junk target entries as fully-fledged EMs, or if
these should be child EMs. There's some contradiction as to how the
RelOptInfo is set up without a parent in recurse_set_operations(), and
making these child EMs as you're doing in the v34-0002 patch. The
problem with building the RelOptInfo with a parent is that
build_simple_rel() would try to apply the same base quals to the child
as the parent has. That's not the correct thing to do for union
children as each union child can have different quals. I just don't
think we have the correct design for the union planner just yet. I
don't currently have ideas on how to make this better. Maybe
build_simple_rel() needs some way to distinguish "this rel has a
parent" and "this rel should copy the parent's quals". However,
redesigning how this works now likely is a bad idea as it just feels a
bit late in the release cycle for that. You can see I chickened out
doing that in 12933dc60, previously 66c0185a3.

> > 2) I think the naming of setup_eclass_child_member_iterator() and
> > dispose_eclass_child_member_iterator() is confusing. From the names,
> > I'd expect these to only be returning em_is_child == true members, but
> > that's not the case.
>
> I agree the original naming was misleading. In v34-0001, I have
> renamed these functions to
> setup_eclass_all_member_iterator_for_relids() and
> dispose_eclass_all_member_iterator_for_relids(). To align with this
> change, I have also renamed EquivalenceChildMemberIterator to
> EquivalenceAllMemberIterator. Does this new naming better address your
> concern?

It's better. These names still feel very long. Also, I don't think the
iterator struct's name needs to be specific to "AllMembers". Surely
another setup function could have an iterator loop over any members it
likes. Likewise, the dispose function does not seem very specific to
"AllMembers". It's really the setup function that controls which
members are going to be visited. I suggest the next function, the
dispose function and the struct are given much more generic names.
setup_eclass_all_member_iterator_for_relids() feels long. Maybe
eclass_member_iterator_with_children and forget trying to include
"relids" in the name?

David

In response to

Re: [PoC] Reducing planning time when tables have many partitions at 2025-03-14 10:09:38 from Yuya Watari

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Kapila	2025-03-19 10:48:45	Re: Conflict detection for multiple_unique_conflicts in logical replication
Previous Message	Bertrand Drouvot	2025-03-19 10:26:05	Re: Fix 035_standby_logical_decoding.pl race conditions