Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'
Date: 2024-09-24 05:08:09
Message-ID: ceb23d97-7ac0-4d45-9196-137691b95079@gmail.com
Lists: pgsql-hackers

On 19/9/2024 09:55, Andrei Lepikhov wrote:
> This wrong prediction makes things much worse if the query has more
> upper query blocks.
> His question was: Why not consider the grouping column unique in the
> upper query block? It could improve estimations.
> After a thorough investigation, I discovered that in commit 4767bc8ff2
> most of the work was already done for DISTINCT clauses. So, why not do
> the same for grouping? A sketch of the patch is attached.
> As I see it, grouping in this sense works quite similarly to DISTINCT,
> and we have no reason to ignore it. After applying the patch, you can
> see that prediction has been improved:
>
> Hash Right Join  (cost=5.62..162.56 rows=50 width=36)
>
A regression test has been added in the new version.
The code is small, simple, and non-invasive, so it should be easy to commit
or reject. I have added it to the next commitfest.
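
For readers who want the intuition behind the quoted idea, here is a toy
model in Python. It is not the patch and not PostgreSQL code: the function
names and all row counts are made up for illustration, the join formula is
a simplified rows1 * rows2 / max(nd1, nd2) estimate that ignores MCV lists
and NULLs, and the fallback of 200 distinct values is assumed as the
default used when no statistics are available. It only covers the case
where the join key is the sole grouping column of the subquery.

```python
# Toy model of the estimation effect; every name and number is hypothetical.

DEFAULT_NUM_DISTINCT = 200  # assumed fallback when no statistics are available


def ndistinct_of_subquery_column(subquery_rows, is_sole_grouping_key):
    """Estimate the number of distinct values of a subquery output column."""
    if is_sole_grouping_key:
        # Sole GROUP BY column: each output row carries a different value.
        return subquery_rows
    # No usable statistics: use the default, capped by the row count.
    return min(DEFAULT_NUM_DISTINCT, subquery_rows)


def equijoin_rows(rows1, nd1, rows2, nd2):
    """Simplified equi-join estimate: rows1 * rows2 / max(nd1, nd2)."""
    return rows1 * rows2 / max(nd1, nd2)


subquery_rows = 10_000  # rows produced by the grouped subquery
outer_rows = 100        # rows of the other join input
outer_nd = 100          # distinct join-key values on that side

without_hint = equijoin_rows(
    outer_rows, outer_nd, subquery_rows,
    ndistinct_of_subquery_column(subquery_rows, False))
with_hint = equijoin_rows(
    outer_rows, outer_nd, subquery_rows,
    ndistinct_of_subquery_column(subquery_rows, True))

print(without_hint, with_hint)
```

In this model, treating the grouping column as unique lowers the join
estimate from 5000 rows to 100 rows. An overestimate of that kind is what
then compounds when more query blocks sit above the join.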

--
regards, Andrei Lepikhov

Attachment: v2-0001-Improve-statistics-estimation-considering-GROUP-B.patch (text/plain, 8.3 KB)
