From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
---|---|
To: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> |
Subject: | Re: Tracking notnull attributes inside Var |
Date: | 2022-05-17 12:49:53 |
Message-ID: | CAExHW5unn3mEBhzaoBurynFR295XGqRm6OqSfA1q5Gi+st7jCA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, May 15, 2022 at 8:41 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
> The var in RelOptInfo->reltarget should have nullable = 0 but the var in
> RelOptInfo->baserestrictinfo should have nullable = 1; The beauty of this
> are: a). It can distinguish the two situations perfectly b). Whenever we want
> to know the nullable attribute of a Var for an expression, it is super easy to
> know. In summary, we need to maintain the nullable attribute at 2 different
> places. one is the before the filters are executed(baserestrictinfo, joininfo,
> ec_list at least). one is after the filters are executed (RelOptInfo.reltarget
> only?)
Thanks for identifying this. What you have written makes sense and it
might open a few optimization opportunities. But let me put down some
other thoughts here. You might want to take those into consideration
when designing your solution.
Do we want to just track nullable and non-nullable. May be we want
expand this class to nullable (var may be null), non-nullable (Var is
definitely non-NULL), null (Var will be always NULL).
But the other way to look at this is along the lines of equivalence
classes. Equivalence classes record the expressions which are equal in
the final result of the query. The equivalence class members are not
equal at all the stages of query execution. But because they are
equal in the final result, we can impose that restriction on the lower
levels as well. Can we think of nullable in that fashion? If a Var is
non-nullable in the final result, we can impose that restriction on
the intermediate stages since rows with NULL values for that Var will
be filtered out somewhere. Similarly we could argue for null Var. But
knowledge that a Var is nullable in the final result does not impose a
NULL, non-NULL restriction on the intermediate stages. If we follow
this thought process, we don't need to differentiate Var at different
stages in query.
>
> Come to JoinRel, we still need to maintain the 2 different cases as well.
>
> As for the joinrel.reltarget, currently it looks up the inputrel's reltarget to
> get the Var, so it is easy to inherit from Var->nullable from inputrel, but
> we need to consider the new changes introduced by current join,
> Like new NOT nullable attributes because of join clauses OR new nullable
> attributes because of outer join. Everything looks good for now.
Yes, if we want to maintain nullness at different stages in the query.
>
> The hard part is RelOptInfo.joininfo & root->eq_classes. All of them uses
> the shared RestrictInfo, and it is unclear which Var->nullable should be used in
> them. To not provide a wrong answer, I think we can assume nullable=-1 (unknown)
> and let the upper layer decides what to do (do we have known use cases to use
> the nullable attribute here?).
I think what applies to baserestrictinfo and reltarget also applies to
joininfo and join's reltarget. There will be three stages - join
clauses, join quals and reltarget.
In EQs the Vars in RestrictInfo will come from joininfo but EQ member
Vars will derive their nullable-ness from corresponding reltarget. I
can be wrong though.
>
> More considerations about this strategy:
> 1. We might use more memory for different var copies, the only known cases
> RelOptInfo->reltarget for now.
When a Var is copied the whole expression tree needs to be copied.
That might be more memory than just copies of Var nodes.
> 2. _equalVar() has more complex semantics: shall we consider nulls or not.
This is interesting. It might have impact on set_plan_references and
planner's ability to search and match expressions.
But if we take the approach I have suggested earlier, this question
will not arise.
--
Best Wishes,
Ashutosh Bapat
From | Date | Subject | |
---|---|---|---|
Next Message | Chirag Karkera | 2022-05-17 12:54:41 | Provide read-only access to system catalog tables |
Previous Message | Nikolay Shaplov | 2022-05-17 12:09:55 | Re: [PATCH] New [relation] option engine |