From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | Yuya Watari <watari(dot)yuya(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Thom Brown <thom(at)linux(dot)com>, Zhang Mingli <zmlpostgres(at)gmail(dot)com> |
Subject: | Re: [PoC] Reducing planning time when tables have many partitions |
Date: | 2025-03-24 17:49:17 |
Message-ID: | 1016307.1742838557@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> ... The main
> thing I'd like to understand here is if there's not enough time to get
> the entire patch set committed, is there much benefit to just having
> the EquivalenceMember index stuff in by itself without the
> RestrictInfo changes.
I finally made some time to look at this patchset, and I'm pretty
disappointed, because after 35 versions I'd expect to see something
that looks close to committable. This doesn't really. I like the
basic idea of taking child EC members out of ECs' main ec_members
lists, but there are too many weird details and
underexplained/overcomplicated/unmaintainable data structures.
One thing I don't love is putting the children into RelOptInfos.
That seems like an unrelated data structure. Have you thought
about instead having, in each EC that needs it, an array indexed
by RTI of per-relation child-member lists? I think this might
net out as less storage because there typically aren't that many
ECs in a query. But the main thing is to not have so many
interconnections between ECs and RelOptInfos.
Another thing I really don't like is the back-link from EMs to ECs:
+ EquivalenceClass *em_ec; /* EquivalenceClass which has this member */
That makes the data structure circular, which will cause pprint to
recurse infinitely. (The fact that you hadn't noticed that makes
me wonder how you debugged any of these data structure changes.)
We could prevent the recursion with suitable annotation on this field,
but I'd really rather not have the field in the first place. Circular
pointers are dangerous and best avoided. Also, it's bloating a node
type that you are concerned about supporting a lot of. Another point
is that I don't see any code to take care of updating these links
during an EC merge.
Some thoughts about the iterator stuff:
* setup_eclass_member_iterator_with_children is a carpal-tunnel-inducing
name. Could we drop the "_with_children" part? It doesn't seem to
add much, since there's no variant for "without children".
* The root parameter should be first; IMO there should be no
exceptions to that within the planner. Perhaps putting the target
iterator parameter last would make it read more nicely. Or you could
rely on struct assignment:
it = setup_eclass_member_iterator(root, ec, relids);
* Why did you define the iterator as possibly returning irrelevant
members? Doesn't that mean that every caller has to double-check?
Wouldn't it make for less code and fewer bugs for the iterator to
have that responsibility? If there is a good reason to do it like
that, the comments should explain why.
I don't really like the concept of 0004 at all. Putting *all*
the EC-related RelOptInfos into a root-stored list seems to be
doubling down very hard on the assumption that no performance-critical
operations will ever need to search that whole list. Is there a good
reason to do it like that, rather than say using the bitmap-index
concept separately within each EC? That might also alleviate the
problem you're having with the bitmapsets getting too big.
Given that we've only got a week left, I see little hope of getting
any of this into v18.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Mahendra Singh Thalor | 2025-03-24 17:51:32 | Re: getting "shell command argument contains a newline or carriage return:" error with pg_dumpall when db name have new line in double quote |
Previous Message | Nathan Bossart | 2025-03-24 17:48:29 | Re: vacuum_truncate configuration parameter and isset_offset |