From: | Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> |
---|---|
To: | Richard Guo <guofenglinux(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru> |
Subject: | Re: POC: GROUP BY optimization |
Date: | 2024-02-22 04:18:42 |
Message-ID: | 890ed877-e2c0-448a-93b8-b07ccf3a2b37@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 22/2/2024 09:09, Richard Guo wrote:
>
> On Wed, Feb 21, 2024 at 6:20 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com
> <mailto:aekorotkov(at)gmail(dot)com>> wrote:
>
> Hi, Richard!
>
> > What do you think about the revisions for the test cases?
>
> I've rebased your patch upthread. Did some minor beautifications.
>
> > * The table 'btg' is inserted with 10000 tuples, which seems a bit
> > expensive for a test. I don't think we need such a big table to test
> > what we want.
>
> Your patch reduces the number of rows to 1000 tuples. I found it
> possible to further reduce it to 100 tuples. That also allowed me to
> save the plan in the test case introduced by e1b7fde418.
>
> Please check if you're OK with the patch attached.
>
>
> I looked through the v2 patch and have two comments.
>
> * The test case under "Check we don't pick aggregate path key instead of
> grouping path key" does not have EXPLAIN to show the plan. So how can
> we ensure we do not mistakenly select the aggregate pathkeys instead of
> the grouping pathkeys?
I confirm it works correctly. I am not sure about the stability of the
zeros number in the output on different platforms here:
avg
--------------------
4.0000000000000000
5.0000000000000000
It was why I'd used the format() function before. So, may we elaborate
on the test and restrict the output?
>
> * I don't think the test case introduced by e1b7fde418 is still needed,
> because we already have one under "Utilize the ordering of merge join to
> avoid a full Sort operation". This kind of test case is just to ensure
> that we are able to utilize the ordering of the subplans underneath. So
> it should be parallel to the test cases for utilize the ordering of
> index scan and subquery scan.
I confirm, this version also checks ec_sortref initialization in the
case when ec are contructed from WHERE clause. Generally, I like more
one test for one issue instead of one test for all at once. But it works
and I don't have any reason to dispute it.
Also, I'm unsure about removing the disabling of the
max_parallel_workers_per_gather parameter. Have you discovered the
domination of the current plan over the partial one? Do the cost
fluctuations across platforms not trigger a parallel plan?
What's more, I suggest to address here the complaint from [1]. As I see,
cost difference between Sort and IncrementalSort strategies in that case
is around 0.5. To make the test more stable I propose to change it a bit
and add a limit:
SELECT count(*) FROM btg GROUP BY z, y, w, x LIMIT 10;
It makes efficacy of IncrementalSort more obvious difference around 10
cost points.
--
regards,
Andrei Lepikhov
Postgres Professional
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2024-02-22 04:25:42 | Re: DSA_ALLOC_NO_OOM doesn't work |
Previous Message | Ashutosh Bapat | 2024-02-22 04:06:06 | Re: RFC: Logging plan of the running query |