From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | liuxy(at)gatech(dot)edu, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #16887: Group by is faster than distinct |
Date: | 2021-02-23 05:28:18 |
Message-ID: | CAApHDvrAgN4APYrsoMGoAhps6zsa2SEom5QW+O-ZqEpjggm-6w@mail.gmail.com |
Lists: | pgsql-bugs |
On Tue, 23 Feb 2021 at 10:26, PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
> Actual Behavior
> We executed both queries on the TPC-H benchmark of scale factor 5: the first
> query takes over 20 seconds, while the second query only takes 6.5 seconds.
> We think the time difference results from different plans selected.
> Specifically, in the first (slow) query, the optimizer decides to not
> parallelize the SCAN and GROUP operations.
> Expected Behavior
> Since these two queries are semantically equivalent, we were hoping that
> PostgreSQL will evaluate them in roughly the same amount of time.
It makes sense that you'd expect this. However, we don't currently
generate parallel plans for DISTINCT queries, so this is more a
feature that's yet to be implemented than a bug.
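As a rough illustration of the equivalence being discussed, here is a minimal sketch. The queries from the bug report are not reproduced in this message, so the table `orders_demo` and its columns are hypothetical, and the sketch uses Python's built-in SQLite rather than PostgreSQL, so it only shows that the two forms return the same rows. It says nothing about planning or parallelism.

```python
import sqlite3

# Hypothetical toy table; not the TPC-H schema or the queries from the report.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_demo (customer_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders_demo VALUES (?, ?)",
    [(1, "open"), (1, "open"), (2, "closed"),
     (2, "open"), (3, "closed"), (3, "closed")],
)

# Form 1: duplicate elimination with DISTINCT.
distinct_rows = conn.execute(
    "SELECT DISTINCT customer_id, status FROM orders_demo"
).fetchall()

# Form 2: the same duplicate elimination written as GROUP BY
# over every selected column, with no aggregate functions.
group_by_rows = conn.execute(
    "SELECT customer_id, status FROM orders_demo GROUP BY customer_id, status"
).fetchall()

# Row order is unspecified without ORDER BY, so compare as sets.
same_result = set(distinct_rows) == set(group_by_rows)
print(same_result)
```

On PostgreSQL itself, running EXPLAIN on each form of your own query should show whether the planner chose a parallel plan. Per the above, only the GROUP BY form is currently eligible for one.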
When parallel aggregates were added in 9.6, it was quite late in the
cycle and I narrowed the scope to just GROUP BY. DISTINCT was left
behind. I tried to pick that up again several years ago, but I was
encouraged to drop it in favour of other work.
David