From: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> |
Subject: | Re: Tracking notnull attributes inside Var |
Date: | 2022-05-20 05:18:32 |
Message-ID: | CAKU4AWr=0rm=GoALn3dSwYT7zKFE0VHSt-sAnL1ECCPBV16MQQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Ashutosh:
Nice to see you again!
On Tue, May 17, 2022 at 8:50 PM Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
wrote:
> On Sun, May 15, 2022 at 8:41 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
> >
> > The var in RelOptInfo->reltarget should have nullable = 0 but the var in
> > RelOptInfo->baserestrictinfo should have nullable = 1; The beauty of
> this
> > are: a). It can distinguish the two situations perfectly b). Whenever we
> want
> > to know the nullable attribute of a Var for an expression, it is super
> easy to
> > know. In summary, we need to maintain the nullable attribute at 2
> different
> > places. one is the before the filters are executed(baserestrictinfo,
> joininfo,
> > ec_list at least). one is after the filters are executed
> (RelOptInfo.reltarget
> > only?)
>
> Thanks for identifying this. What you have written makes sense and it
> might open a few optimization opportunities. But let me put down some
> other thoughts here. You might want to take those into consideration
> when designing your solution.
>
Thanks.
>
> Do we want to just track nullable and non-nullable. May be we want
> expand this class to nullable (var may be null), non-nullable (Var is
> definitely non-NULL), null (Var will be always NULL).
>
>
Currently it doesn't support "Var will be always NULL" . Do you have any
use cases for this? and I can't think of too many cases where we can get
such information except something like "SELECT a FROM t WHERE a
IS NULL".
But the other way to look at this is along the lines of equivalence
> classes. Equivalence classes record the expressions which are equal in
> the final result of the query. The equivalence class members are not
> equal at all the stages of query execution. But because they are
> equal in the final result, we can impose that restriction on the lower
> levels as well. Can we think of nullable in that fashion? If a Var is
> non-nullable in the final result, we can impose that restriction on
> the intermediate stages since rows with NULL values for that Var will
> be filtered out somewhere. Similarly we could argue for null Var. But
> knowledge that a Var is nullable in the final result does not impose a
> NULL, non-NULL restriction on the intermediate stages. If we follow
> this thought process, we don't need to differentiate Var at different
> stages in query.
>
I agree this is an option. If so we need to track it under the PlannerInfo
struct but it would not be as fine-grained as my previous. Without
intermediate information, We can't know if a UnqiueKey contains multiple
NULLs, this would not be an issue for the "MARK Distinct as no-op" case,
but I'm not sure it is OK for other UniqueKey user cases. So my current
idea
is I still prefer to maintain the intermediate information, unless we are
sure it
costs too much or it is too complex to implement which I don't think so for
now
at least. So if you have time to look at the attached patch, that would be
super
great as well.
--
Best Regards
Andy Fan
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2022-05-20 05:25:10 | Re: 15beta1 test failure on mips in isolation/expected/stats |
Previous Message | Andy Fan | 2022-05-20 04:42:41 | Re: Tracking notnull attributes inside Var |