Re: Allow DISTINCT to use Incremental Sort

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Allow DISTINCT to use Incremental Sort
Date: 2023-01-10 03:07:09
Message-ID: CAMbWs49WXscbTRyQ_9tqFHDC+v7KrjMn8x4E-zWb-Wm_Qm-BLA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 10, 2023 at 10:14 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> > /* For explicit-sort case, always use the more rigorous clause */
> > if (list_length(root->distinct_pathkeys) <
> > list_length(root->sort_pathkeys))
> > {
> > needed_pathkeys = root->sort_pathkeys;
> > /* Assert checks that parser didn't mess up... */
> > Assert(pathkeys_contained_in(root->distinct_pathkeys,
> > needed_pathkeys));
> > }
> > else
> > needed_pathkeys = root->distinct_pathkeys;
> >
> > I'm not sure if this is necessary, as AFAIU the parser should have
> > ensured that the sortClause is always a prefix of distinctClause.
>
> I think that's true for standard DISTINCT, but it's not for DISTINCT ON.
>
> > In the patch this code has been removed. I think we should also remove
> > the related comments in create_final_distinct_paths.
> >
> > * When we have DISTINCT ON, we must sort by the more rigorous of
> > * DISTINCT and ORDER BY, else it won't have the desired behavior.
> > - * Also, if we do have to do an explicit sort, we might as well use
> > - * the more rigorous ordering to avoid a second sort later. (Note
> > - * that the parser will have ensured that one clause is a prefix of
> > - * the other.)
>
> I'm not quite following what you think has changed here.

Sorry I didn't make myself clear. I mean currently on HEAD in planner.c
from line 4847 to line 4857, we have the code to make sure we always use
the more rigorous clause for explicit-sort case. I think this code is
not necessary, because we have already done that in line 4791 to 4796,
for both DISTINCT ON and standard DISTINCT.

In this patch that code (line 4847 to line 4857) has been removed, which
I agree with. I just wondered if the related comment should be removed
too. But now after a second thought, I think it's OK to keep that
comment, as it still holds true in the new patch.

> I've attached an updated patch

The patch looks good to me.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shiy.fnst@fujitsu.com 2023-01-10 03:08:04 RE: Fix pg_publication_tables to exclude generated columns
Previous Message Regina Obe 2023-01-10 03:03:11 RE: [PATCH] Support % wildcard in extension upgrade filenames