Quick Links

Re: Expand applicability of aggregate's sortop optimization

From:	Andrei Lepikhov <lepihov(at)gmail(dot)com>
To:	Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, ilmari(at)ilmari(dot)org
Subject:	Re: Expand applicability of aggregate's sortop optimization
Date:	2024-07-17 03:28:58
Message-ID:	24c11397-d9a1-4aac-a999-a79352a6a35b@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 5/8/24 17:13, Matthias van de Meent wrote:
> As you may know, aggregates like SELECT MIN(unique1) FROM tenk1; are
> rewritten as SELECT unique1 FROM tenk1 ORDER BY unique1 USING < LIMIT
> 1; by using the optional sortop field in the aggregator.
> However, this optimization is disabled for clauses that in itself have
> an ORDER BY clause such as `MIN(unique1 ORDER BY <anything>), because
> <anything> can cause reordering of distinguisable values like 1.0 and
> 1.00, which then causes measurable differences in the output. In the
> general case, that's a good reason to not apply this optimization, but
> in some cases, we could still apply the index optimization.
Thanks for the job! I guess you could be more brave and push down also
FILTER statement.
>
> One of those cases is fixed in the attached patch: if we order by the
> same column that we're aggregating, using the same order class as the
> aggregate's sort operator (i.e. the aggregate's sortop is in the same
> btree opclass as the ORDER BY's sort operator), then we can still use
> the index operation: The sort behaviour isn't changed, thus we can
> apply the optimization.
>
> PFA the small patch that implements this.
>
> Note that we can't blindly accept just any ordering by the same
> column: If we had an opclass that sorted numeric values by the length
> of the significant/mantissa, then that'd provide a different (and
> distinct) ordering from that which is expected by the normal
> min()/max() aggregates for numeric, which could cause us to return
> arguably incorrect results for the aggregate expression.
As I see, the code:
aggsortop = fetch_agg_sort_op(aggref->aggfnoid);
if (!OidIsValid(aggsortop))
return false; /* not a MIN/MAX aggregate */

used twice and can be evaluated earlier to avoid duplicated code.

Also, I'm unsure about the necessity of looking through the btree
classes. Maybe just to check the commutator to the sortop, like in the
diff attached? Or could you provide an example to support your approach?

--
regards, Andrei Lepikhov

Attachment	Content-Type	Size
rewritten_patch.diff	text/x-patch	6.4 KB

In response to

Expand applicability of aggregate's sortop optimization at 2024-05-08 10:13:29 from Matthias van de Meent

Responses

Re: Expand applicability of aggregate's sortop optimization at 2024-07-17 09:33:05 from Matthias van de Meent

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Peter Smith	2024-07-17 03:31:14	Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Previous Message	Nathan Bossart	2024-07-17 03:23:08	Re: improve performance of pg_dump with many sequences