From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Reduce "Var IS [NOT] NULL" quals during constant folding |
Date: | 2025-04-11 06:51:30 |
Message-ID: | CAMbWs4_soFRubFd64SX-GC+MA6RV=pZrwfn3oq89LNKoDuLjAg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Apr 11, 2025 at 4:45 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> OK. Maybe I shouldn't be worrying about the table_open() /
> table_close() here, because I see that you are right that
> has_subclass() is nearby, which admittedly does not involve opening
> the relation, but it does involve fetching from the syscache a tuple
> that we wouldn't need to fetch if we had a Relation available at that
> point. And I see now that expand_virtual_generated_columns() is also
> in that area and works similar to your proposed function in that it
> just opens and closes all the relations. Perhaps that's just the way
> we do things at this very early stage of the planner? But I feel like
> it might make sense to do some reorganization of this code, though, so
> that it more resembles the phase 1 and phase 2 as you describe them.
> Both expand_virtual_generated_columns() and collect_relation_attrs()
> care about exactly those relations with rte->rtekind == RTE_RELATION,
> and even if we have to open and close all of those relations once to
> do this processing, perhaps we can avoid doing it twice, and maybe
> avoid the has_subclass() call along the way?
Yeah, I agree on this. This is the optimization I mentioned upthread
in the last paragraph of [1] - retrieving the small portion of catalog
information needed at an early stage in one place instead of three.
Hopefully, this way we only need one table_open/table_close at the
early stage of the planner.
> Maybe we can hoist a loop
> over parse->rtable up into subquery_planner and then have it call a
> virtual-column expansion function and a non-null collection function
> once per RTE_RELATION entry?
Hmm, I'm afraid there might be some difficulty with this approach.
The virtual-column expansion needs to be done after sublink pull-up to
ensure that the virtual-column references within the SubLinks that
should be transformed into joins can get expanded, while non-null
collection needs to be done before sublink pull-up since we might want
to leverage the non-null information to convert NOT IN sublinks to
anti joins.
What I had in mind is that we hoist a loop over parse->rtable before
pull_up_sublinks to gather information about which columns of each
relation are defined as NOT NULL and which are virtually generated.
These information will be used in sublink pull-up and virtual-column
expansion. We also call has_subclass for each relation that is maked
'inh' within that loop and clear the inh flag if needed.
This seems to require a fair amount of code changes, so I'd like to
get some feedback on this approach before proceeding with the
implementation.
> A related point that I'm noticing is that you record the not-NULL
> information in the RangeTblEntry. I wonder whether that's going to be
> a problem, because I think of the RangeTblEntry as a parse-time
> structure and the RelOptInfo as a plan-time structure, meaning that we
> shouldn't scribble on the former and that we should record any
> plan-time details we need in the latter. I understand that the problem
> is precisely that the RelOptInfo isn't yet available, but I'm not sure
> that makes it OK to use the RangeTblEntry instead.
Fair point. We should do our best to refrain from scribbling on the
parsetree in the planner. I initially went with the hashtable
approach as Tom suggested, but later found it quite handy to store the
not-null information in the RangeTblEntry, especially since we do
something similar with rte->inh. However, I've come to realize that
inh may not be a good example to follow after all, so I'll go back to
the hashtable method.
Thank you for pointing that out before I went too far down the wrong
path.
[1] https://postgr.es/m/CAMbWs4-DryEm_U-juPn3HwUiwZRXW3jhfX18b_AFgrgihgq4kg@mail.gmail.com
Thanks
Richard
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2025-04-11 07:07:04 | Re: Improve a few appendStringInfo calls new to v18 |
Previous Message | Peter Smith | 2025-04-11 06:05:45 | Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding |