From: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dmitry Astapov <dastapov(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not? |
Date: | 2022-02-02 07:37:13 |
Message-ID: | CAKU4AWrecbiuXaF4rVTji4=ow9VL-WBoZdjSCyQSDVmcgTL=Vw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Justin:
Thanks for your attention.
On Wed, Feb 2, 2022 at 1:13 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > Subject: [PATCH v1 1/6] Rebaee David's patch against the latest code.
>
> If you use git-am, then the author/commit information is preserved.
> It's probably good to include a link to the patch in any case.
>
>
Thanks for this reminder, I didn't pay enough attention to this. Fixed.
(The original patch looks like a diff file not a commit, I wrote a simple
commit
message for this and link to the origin discussion link.)
> > Subject: [PATCH v1 4/6] remove duplicated qual executing.
>
> ---
>
>
> src/backend/optimizer/path/equivclass.c | 22 +++++++++++++++++++
>
>
> src/backend/optimizer/plan/createplan.c | 29 +++++++++++++++++++++++--
>
>
> src/include/optimizer/paths.h | 2 ++
>
>
> src/test/regress/parallel_schedule | 2 ++
>
>
> 4 files changed, 53 insertions(+), 2 deletions(-)
>
>
>
> I think the ./ec_filter test is missing from from this patch.
>
>
Indeed..
> > Subject: [PATCH v1 6/6] adding some test cases for this feature and fix
> the existing case
>
>
> The tests should be updated with the corresponding patches. It's common
> for
> the patches to be commited separately, like if 0001 is ready but the
> others are
> still evolving.
>
Yes, I agree with this. Just that in this case, the commit split is just
for easy
review/discussion. they are unlikely to be able to commit separately. so I
keep
it as it was and improve each commit message.
>
> I'm not sure whether you think this patch is ready to be added to a
> commitfest,
> but do you know about the CI infrastructure ? It allows running all the
> cfbot
> tests for a github branch against 4 OS, which helps catch portability
> issues,
> including memory errors and unstable explain output. See:
> src/tools/ci/README.
>
Added. https://commitfest.postgresql.org/37/3524/
> There's an failure in postgres_fdw, probably the output needs to be
> updated.
>
For the postgres_fdw, I just refreshed the content. with this patch, the
plan changed
from
Foreign Scan
Output: ft5.*, ft5.c1, ft5.c2, ft5.c3, ft4.c1, ft4.c2
Relations: (public.ft5) INNER JOIN (public.ft4)
Remote SQL: SELECT CASE WHEN (r1.*)::text IS NOT NULL THEN ROW(r1.c1,
r1.c2, r1.c3) END, r1.c1, r1.c2, r1.c3, r2.c1, r2.c2 FROM ("S 1"."T 4" r1
INNER JOIN "S 1"."T 3" r2 ON (((r1.c1 = r2.c1)) AND ((r2.c1 >= 10)) AND
((r2.c1 <= 30)))) ORDER BY r1.c1 ASC NULLS LAST
(4 rows)
to
Sort (cost=108.02..108.04 rows=7 width=62)
Output: ft5.*, ft5.c1, ft5.c2, ft5.c3, ft4.c1, ft4.c2
Sort Key: ft5.c1
-> Foreign Scan (cost=100.00..107.92 rows=7 width=62)
Output: ft5.*, ft5.c1, ft5.c2, ft5.c3, ft4.c1, ft4.c2
Relations: (public.ft5) INNER JOIN (public.ft4)
Remote SQL: SELECT CASE WHEN (r1.*)::text IS NOT NULL THEN
ROW(r1.c1, r1.c2, r1.c3) END, r1.c1, r1.c2, r1.c3, r2.c1, r2.c2 FROM ("S
1"."T 4" r1 INNER JOIN "S 1"."T 3" r2 ON (((r1.c1 = r2.c1)) AND ((r2.c1 >=
10)) AND ((r2.c1 <= 30)) AND ((r1.c1 >= 10)) AND ((r1.
c1 <= 30))))
But if I set enable_sort = off, we can still get the previous plan, which
proves that
this patch doesn't make the above path unavailable, it is just not cheaper
than
the new one. Here is the new commit messages:
commit e0a7838a09e73f831eecb23b5e7884cc34d71301
Author: David Rowley <dgrowleyml(at)gmail(dot)com>
Date: Tue Feb 1 20:56:40 2022 +0800
Introudce ec_filters in EquivalenceClass struct, the semantics is the
quals can
be applied to any EquivalenceMember in this EC. Later this information
is used
to generate new RestrictInfo and was distributed to related RelOptInfo
very
soon.
Author: David Rowley at 2015-12 [1]
Andy Fan rebase this patch to current latest code.
commit 73f52d0909374446cd689457f0a4ef52addb035e
Author: Andy Fan <yizhi(dot)fzh(at)alibaba-inc(dot)com>
Date: Tue Feb 1 14:54:07 2022 +0800
After distributing the new derived RestrictInfo into RelOptInfo, then
the rows
estimation is wrong at the joinrel part. The reason is well described
at [1] and
[2], To fix this issue, I added a new field "EquivalenceClass
*derived" in
RestrictInfo struct to indicate how this qual is generated. we would
ignore such
qual during estimate of the rows size. All the set_xx_size should be
taken care of, but
for now, just set_plain_rel_size is taken care of for the PoC purpose.
[1]
commit 8439b4818410d860a4ca4be3458b54c04c6f8648
Author: Andy Fan <yizhi(dot)fzh(at)alibaba-inc(dot)com>
Date: Tue Feb 1 15:20:10 2022 +0800
Introduce RelOptInfo.filtered_rows.
Previously the Path.rows (shown in the explain output) and
RelOptInfo.rows
which would be used to calculating joinrel's estimated rows are same
at many scan paths, like SeqScan, IndexScan, BitmapHeapScan and so on.
But
they would be different after distributing a new restrictinfo from
ec_filter.
So I developed RelOptInfo.filtered_rows to take some duty out of
RelOptInfo.rows.
commit 11b3395bb5bcc4a2bcff6fed8078dbbf3cda81b1
Author: Andy Fan <yizhi(dot)fzh(at)alibaba-inc(dot)com>
Date: Tue Feb 1 17:37:27 2022 +0800
Remove duplicated qual executing for executor.
Take the SELECT * FROM t1, t2 WHERE t1.a = t2.a and t2.a > 3 for
example,
we can derive t1.a > 3 with EC filter infrastructure. However if it
generate a
plan like below, the new generated qual does not deserve to execute.
Nest Loop
Seq Scan (t1.a > 3)
Index Scan t2_a
(a = t1.a) (t2.a > 3)
This patch removes the "t2.a > 3" for the above case.
commit 2875a76136293589b6e409cb6be4defab87ade59
Author: Andy Fan <yizhi(dot)fzh(at)alibaba-inc(dot)com>
Date: Wed Feb 2 11:54:24 2022 +0800
Support ScalarArrayOpExpr and perudoconstant on ef_filter.
commit a4b21ab6fd0fd57902f5471ec962a77b59085158 (HEAD -> cf_v4)
Author: Andy Fan <yizhi(dot)fzh(at)alibaba-inc(dot)com>
Date: Wed Feb 2 11:59:53 2022 +0800
Added the testcase for this feature and fix the previous test case
as well. The new added test case needs outputting some runtime
statistics, which will probably be different at each run. I can think
of a way to make the test case stable if the patchsets are not wrong
at the first step.
--
Best Regards
Andy Fan
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro Horiguchi | 2022-02-02 07:49:57 | Re: RFC: Logging plan of the running query |
Previous Message | David G. Johnston | 2022-02-02 07:36:08 | Re: Design of pg_stat_subscription_workers vs pgstats |