From: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Allow parallel DISTINCT |
Date: | 2021-08-17 20:56:23 |
Message-ID: | CALNJ-vQg3=YwoJyo=bkDY3=AQi6LZSBX+W=V0QTP5ErTF9tE6Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Aug 17, 2021 at 1:47 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> On Wed, 18 Aug 2021 at 02:42, Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
> > Since create_partial_distinct_paths() calls
> > create_final_distinct_paths(), I wonder if numDistinctRows can be passed to
> > create_final_distinct_paths() so that the latter doesn't need to call
> > estimate_num_groups().
>
> That can't be done. The two calls to estimate_num_groups() are passing
> in a different number of input rows. In
> create_partial_distinct_paths() the number of rows is the number of
> expected input rows from a partial path. In
> create_final_distinct_paths() when called to complete the final
> distinct step, that's the number of distinct values multiplied by the
> number of workers.
>
> It might be more possible to do something like cache the value of
> distinctExprs, but I just don't feel the need. If there are partial
> paths in the input_rel then it's most likely that planning time is not
> going to dominate much between planning and execution. Also, if we
> were to calculate the value of distinctExprs in create_distinct_paths
> always, then we might end up calculating it for nothing as
> create_final_distinct_paths() does not always need it. I don't feel
> the need to clutter up the code by doing any lazy calculating of it
> either.
>
> David
>
Hi,
Thanks for your explanation.
The patch is good from my point of view.