Re: [HACKERS] Removing LEFT JOINs in more cases

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Removing LEFT JOINs in more cases
Date: 2017-11-22 13:30:34
Message-ID: CAFjFpReA5GJ5s039iDc9EOWEP3j2k_GjrGKJ59XAVot8HFKt7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 1, 2017 at 5:39 AM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:

> In this case, the join *can* cause row duplicates, but the distinct or
> group by would filter these out again anyway, so in these cases, we'd
> not only get the benefit of not joining but also not having to remove
> the duplicate rows caused by the join.

+1.

>
> Given how simple the code is to support this, it seems to me to be
> worth handling.
>

I find this patch very simple and still useful.

@@ -597,15 +615,25 @@ rel_supports_distinctness(PlannerInfo *root,
RelOptInfo *rel)
+ if (root->parse->distinctClause != NIL)
+ return true;
+
+ if (root->parse->groupClause != NIL && !root->parse->hasAggs)
+ return true;
+

The other callers of rel_supports_distinctness() are looking for distinctness
of the given relation, whereas the code change here are applicable to any
relation, not just the given relation. I find that confusing. Looking at the
way we are calling rel_supports_distinctness() in join_is_removable() this
change looks inevitable, but still if we could reduce the confusion, that will
be good. Also if we could avoid duplication of comment about unique index, that
will be good.

DISTINCT ON allows only a subset of columns being selected to be listed in that
clause. I initially thought that specifying only a subset would be a problem
and we should check whether the DISTINCT applies to all columns being selected.
But that's not true, since DISTINCT ON would eliminate any duplicates in the
columns listed in that clause, effectively deduplicating the row being
selected. So, we are good there. May be you want to add a testcase with
DISTINCT ON.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2017-11-22 13:36:09 Re: [HACKERS] [POC] Faster processing at Gather node
Previous Message Taylor, Nathaniel N. 2017-11-22 11:57:33 Logical Replication and PgPool