Re: [PATCH] Erase the distinctClause if the result is unique by definition

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Erase the distinctClause if the result is unique by definition
Date: 2020-03-10 15:41:11
Message-ID: CAKU4AWqPv1-h-=sh-ip=MiERAoV7unHu3V1KfrF-FYFNC3mr4g@mail.gmail.com
Lists: pgsql-hackers

Hi Tom & David & Bapat:

Thanks for your reviews so far. I want to summarize the open issues to
help guide the rest of the discussion.

1. Shall we bypass the AggNode as well, using the same logic?

I think yes, since the rules for bypassing an AggNode and a UniqueNode
are exactly the same. The difficulty with bypassing the AggNode is that
the aggregate function call is currently tightly coupled with the
AggNode. In the past few days I have made the aggregate call run
without an AggNode (at least I tested sum, which has no final function,
and avg, which has one). But there are still a few things to do, such
as the ACL check, the anynull check, and maybe more checks. There is
also some MemoryContext mess that needs to be fixed. I still need some
time for this goal, so I think its complexity deserves a separate
thread. Any thoughts?
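
As a small illustration of why the same rule applies to both nodes:
when the GROUP BY key is already unique, every group holds exactly one
row, so grouping cannot merge anything. The sketch below checks this
with Python's sqlite3 module rather than PostgreSQL, and the table t
and its columns are made up for the example. It says nothing about how
the patch evaluates the aggregate inside the executor.

```python
import sqlite3

# Hypothetical table: pk is unique, so GROUP BY pk gives one row per group.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (pk INTEGER PRIMARY KEY, x INTEGER NOT NULL);
    INSERT INTO t VALUES (1, 10), (2, 20), (3, 30);
""")

# Aggregating over groups that each contain a single row ...
grouped = conn.execute(
    "SELECT pk, sum(x), avg(x) FROM t GROUP BY pk ORDER BY pk"
).fetchall()

# ... gives the same values as reading each row directly.
ungrouped = conn.execute(
    "SELECT pk, x, x * 1.0 FROM t ORDER BY pk"
).fetchall()

group_count = len(grouped)
row_count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
```

The aggregate still has to be computed for that single row, which
matches the point above about running the aggregate call without an
AggNode.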

2. Shall we use the UniquePath as David suggested?

Actually I am leaning toward this approach now. David, can you share
more insights about the benefits of UniquePath? Cost and size
estimation should be one of them; another may be changing a semi join
to a normal join, as the current innerrel_is_unique does. Any others?

3. Can we make the rule more general?

Currently it requires that every relation yields a unique result.
David & Bapat provided another example:
select m2.pk from m1, m2 where m1.pk = m2.non_unique_key. That's
interesting and not easy to handle in my current framework. This is
another reason I want to adopt the UniquePath framework.
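
To spell out why that query is unique even though m2.non_unique_key
is not: each m2 row can match at most one m1 row, because m1.pk is
unique, so the join never duplicates an m2 row and m2.pk stays unique.
The sketch below checks this with Python's sqlite3 module rather than
PostgreSQL; the sample rows are made up, and it only demonstrates the
property, not how the planner would detect it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE m1 (pk INTEGER PRIMARY KEY);
    CREATE TABLE m2 (pk INTEGER PRIMARY KEY, non_unique_key INTEGER);
    INSERT INTO m1 VALUES (1), (2), (3);
    -- non_unique_key repeats: several m2 rows point at the same m1 row.
    INSERT INTO m2 VALUES (10, 1), (11, 1), (12, 2), (13, 2), (14, 9);
""")

def run(select_list):
    """Run the join with the given select list and return sorted rows."""
    sql = f"SELECT {select_list} FROM m1, m2 WHERE m1.pk = m2.non_unique_key"
    return sorted(conn.execute(sql).fetchall())

# Selecting m2.pk: DISTINCT removes nothing.
m2_plain = run("m2.pk")
m2_distinct = run("DISTINCT m2.pk")

# Selecting m1.pk instead: the join repeats m1 rows, so DISTINCT matters.
m1_plain = run("m1.pk")
m1_distinct = run("DISTINCT m1.pk")
```

Here m1 alone is unique and m2 alone is unique, but uniqueness of the
join result depends on which side's key is selected, which is why a
per-relation rule is not enough for this case.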

Do we have any other rules to think about before implementing it?

Thanks for your feedback.

> This should be ok. The time spent in annotating a RelOptInfo about
> uniqueness is not going to be a lot. But doing so would help generic
> elimination of Distinct/Group/Unique operations. What is
> UniquePathKey; I didn't find this in your patch or in the code.

This is a proposal from David, so it is not in the current patch/code :)

Regards
Andy Fan
