From: | Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Zhihong Yu <zyu(at)yugabyte(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>, Белялов Дамир Наилевич <d(dot)belyalov(at)postgrespro(dot)ru> |
Subject: | Re: POC: GROUP BY optimization |
Date: | 2023-07-21 04:21:01 |
Message-ID: | c4a003d9-1a29-64fd-7bc8-248731c55503@postgrespro.ru |
Lists: | pgsql-hackers |
On 20/7/2023 18:46, Tomas Vondra wrote:
> On 7/20/23 08:37, Andrey Lepikhov wrote:
>> On 3/10/2022 21:56, Tom Lane wrote:
>>> Revert "Optimize order of GROUP BY keys".
>>>
>>> This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and
>>> several follow-on fixes.
>>> ...
>>> Since we're hard up against the release deadline for v15, let's
>>> revert these changes for now. We can always try again later.
>>
>> It may be time to restart the project. As a first step, I rebased the
>> patch on the current master. It wasn't trivial because of some recent
>> optimizations (a29eab, 1349d27 and 8d83a5d).
>> Now, let's repeat the review and rewrite the current patch according to
>> the reasons given in the revert commit.
>
> I think the fundamental task is to make the costing more reliable, and
> the commit message of 443df6e2db points out a couple of challenges in this
> area. Not sure how feasible it is to address enough of them ...
>
> 1) procost = 1.0 - I guess we could make this more realistic by doing
> some microbenchmarks and tuning the costs for the most expensive cases.
>
> 2) estimating quicksort comparisons - This relies on ndistinct
> estimates, and I'm not sure how much more reliable we can make those.
> Probably not much :-( Not sure what to do about this, the only thing I
> can think of is to track "reliability" of the estimates and only do the
> reordering if we have high confidence in the estimates. That means we'll
> miss some optimization opportunities, but it should limit the risk.
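To make point 2 concrete, here is a toy sketch of why the choice of key order leans so heavily on ndistinct. It is not the formula from the reverted patch and not anything in cost_sort(). It assumes uniform, independent columns, so two random tuples tie on a column with probability 1/ndistinct, and a later column is compared only when all earlier ones tied. The column names, the costs, and the functions expected_compare_cost() and best_order() are made up for illustration.

```python
from itertools import permutations


def expected_compare_cost(keys):
    """Expected cost of comparing two random tuples.

    keys: list of (name, ndistinct, cmp_cost) in sort-key order.
    Toy assumption: column i is compared only if all earlier columns tied,
    and a tie on a column happens with probability 1 / ndistinct.
    """
    total = 0.0
    p_reach = 1.0  # probability that every earlier column tied
    for _name, ndistinct, cmp_cost in keys:
        total += p_reach * cmp_cost
        p_reach /= ndistinct
    return total


def best_order(keys):
    """Return the column names in the cheapest order under the toy model."""
    cheapest = min(permutations(keys), key=expected_compare_cost)
    return [name for name, _nd, _cost in cheapest]


# Hypothetical statistics: the planner believes column "a" has 100 distinct
# values, but in reality it has only 2.
estimated = [("a", 100, 1.0), ("b", 100, 1.2)]
actual = [("a", 2, 1.0), ("b", 100, 1.2)]

planned = best_order(estimated)  # ['a', 'b']
ideal = best_order(actual)       # ['b', 'a']
print(planned, ideal)
```

With the estimated statistics the cheaper comparator goes first. With the true ndistinct the other order wins, so a single bad estimate is enough to pick the wrong order. That is the kind of flip a "reliability" check would have to guard against.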
For me personally, the most challenging issue is:
3. Imbalance induced by the cost_sort() changes. They may increase or
decrease the contribution of sorting to the total cost relative to other
factors, and so change which sorted paths are chosen.
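A minimal sketch of what I mean by imbalance, with made-up numbers. It is not the real planner code: sort_cost() is a plain N*log2(N) model, the hash path is a flat per-tuple charge, and the constants 0.005, 0.0025 and 0.04 are arbitrary assumptions chosen only to show the effect.

```python
import math


def sort_cost(n_tuples, cmp_cost):
    """Toy sort cost: one comparison charge per N * log2(N) comparisons."""
    return cmp_cost * n_tuples * math.log2(n_tuples)


def choose_path(n_tuples, cmp_cost, hash_cost_per_tuple):
    """Pick the cheaper of two hypothetical grouping paths."""
    paths = {
        "sort+group": sort_cost(n_tuples, cmp_cost),
        "hash": hash_cost_per_tuple * n_tuples,
    }
    return min(paths, key=paths.get)


n = 1024  # log2(1024) == 10, which keeps the arithmetic easy to follow

# Old sort model: 0.005 per comparison -> 51.2 vs 40.96 for hashing.
before = choose_path(n, cmp_cost=0.005, hash_cost_per_tuple=0.04)

# New sort model halves the comparison charge -> 25.6 vs 40.96 for hashing.
after = choose_path(n, cmp_cost=0.0025, hash_cost_per_tuple=0.04)

print(before, after)  # hash sort+group
```

Nothing about the hash path changed, yet the plan flips. Any re-scaling of the sort cost has this effect on every place where a sorted path competes with an unsorted one, not only on GROUP BY.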
--
regards,
Andrey Lepikhov
Postgres Professional