Re: [HACKERS] Removing LEFT JOINs in more cases

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Removing LEFT JOINs in more cases
Date: 2017-11-30 02:41:58
Message-ID: CAB7nPqRxD4GywWXZ4jRDZm9EZGzOj7mVA=Y9YKgn3_yUUMECbA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 22, 2017 at 10:30 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> On Wed, Nov 1, 2017 at 5:39 AM, David Rowley
> <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>
>> In this case, the join *can* cause row duplicates, but the distinct or
>> group by would filter these out again anyway, so in these cases, we'd
>> not only get the benefit of not joining but also not having to remove
>> the duplicate rows caused by the join.
>
> +1.
>
>>
>> Given how simple the code is to support this, it seems to me to be
>> worth handling.
>>
>
> I find this patch very simple and still useful.
>
> @@ -597,15 +615,25 @@ rel_supports_distinctness(PlannerInfo *root, RelOptInfo *rel)
> +    if (root->parse->distinctClause != NIL)
> +        return true;
> +
> +    if (root->parse->groupClause != NIL && !root->parse->hasAggs)
> +        return true;
> +
> +
>
> The other callers of rel_supports_distinctness() are looking for distinctness
> of the given relation, whereas the code changes here are applicable to any
> relation, not just the given relation. I find that confusing. Looking at the
> way we are calling rel_supports_distinctness() in join_is_removable(), this
> change looks inevitable, but still, if we could reduce the confusion, that
> would be good. Also, if we could avoid duplicating the comment about the
> unique index, that would be good.
>
> DISTINCT ON allows only a subset of the columns being selected to be listed in
> that clause. I initially thought that specifying only a subset would be a
> problem and that we should check whether the DISTINCT applies to all columns
> being selected. But that's not true, since DISTINCT ON would eliminate any
> duplicates in the columns listed in that clause, effectively deduplicating the
> rows being selected. So, we are good there. Maybe you want to add a test case
> with DISTINCT ON.
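
To illustrate the argument quoted above, here is a minimal sketch of why a
LEFT JOIN that duplicates rows does not change the result under DISTINCT or
under GROUP BY without aggregates, and why the !hasAggs condition in the
quoted hunk matters. It only compares query results; it says nothing about
what the planner does. Assumptions: the tables a and b, their contents, and
the helper rows() are hypothetical, and SQLite (via Python's sqlite3 module)
stands in for PostgreSQL, so DISTINCT ON is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, val TEXT);
    CREATE TABLE b (a_id INTEGER);         -- deliberately no unique index
    INSERT INTO a VALUES (1, 'x'), (2, 'y'), (3, 'x');
    INSERT INTO b VALUES (1), (1), (2);    -- a.id = 1 matches two rows in b
""")


def rows(sql):
    """Run a query and return its rows sorted, so results compare by content."""
    return sorted(conn.execute(sql).fetchall())


join = "FROM a LEFT JOIN b ON b.a_id = a.id"
no_join = "FROM a"

# Without DISTINCT or GROUP BY, the join duplicates the row with id = 1.
plain_with = rows(f"SELECT a.id, a.val {join}")
plain_without = rows(f"SELECT a.id, a.val {no_join}")

# DISTINCT filters the duplicates out again, so the join makes no difference.
distinct_with = rows(f"SELECT DISTINCT a.val {join}")
distinct_without = rows(f"SELECT DISTINCT a.val {no_join}")

# GROUP BY without aggregates behaves the same way.
group_with = rows(f"SELECT a.val {join} GROUP BY a.val")
group_without = rows(f"SELECT a.val {no_join} GROUP BY a.val")

# With an aggregate, the duplicated rows are visible in the result.
agg_with = rows(f"SELECT a.val, count(*) {join} GROUP BY a.val")
agg_without = rows(f"SELECT a.val, count(*) {no_join} GROUP BY a.val")

print("plain:   ", plain_with != plain_without)        # results differ
print("distinct:", distinct_with == distinct_without)  # results match
print("group by:", group_with == group_without)        # results match
print("agg:     ", agg_with != agg_without)            # results differ
```

The aggregate case is the one where dropping the join would change the
answer, which matches the groupClause check being paired with !hasAggs.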

I am counting Ashutosh's message as a review, and it has not received a
reply yet. The review is still fairly fresh, so I am moving the patch to
the next CF with status "waiting on author".
--
Michael
