Re: Sort functions with specialized comparators

From: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Антуан Виолин <violin(dot)antuan(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sort functions with specialized comparators
Date: 2024-12-09 13:02:19
Message-ID: 64F121E6-CD30-4CD8-BBC3-9369C0F66B8C@yandex-team.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 6 Dec 2024, at 08:49, John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>
> https://www.postgresql.org/message-id/CA%2BhUKGKztHEWm676csTFjYzortziWmOcf8HDss2Zr0muZ2xfEg%40mail.gmail.com

Wow, what a thread!
"Simpsons Already Did It"

> On Fri, Dec 6, 2024 at 1:32 AM Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>>
>>> On 5 Dec 2024, at 15:16, John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>
>>> I believe src/port/qsort.c was meant to be just for the standard sort
>>> interface as found in a C library. We do have one globally visible
>>> special sort here:
>>> src/backend/utils/sort/qsort_interruptible.c
>>> ...so that directory seems a better fit.
>> OK. BTW do we need ST_CHECK_FOR_INTERRUPTS?
>
> That's a good thing to raise right now -- intarray currently doesn't
> have one, and we haven't gotten complaints from people trying to sort
> large arrays and cancel the query. This extension is not commonly
> used, so that's not surprising. It could be that large arrays are even
> less common, or no one bothered to report it. What's the largest size
> that your customers use?
>
> If we do need a check for interrupts, then this whole thing must
> remain private to intarray. From reading e64cdab0030 , it's not safe
> to interrupt in general.

I think commit message states that it's better to opt-in for interruptible sort. So I do not think making sort interruptible is a blocker for making global specialized sorting routines.

>
>>> And one more bikeshedding bit that might get noticed: tuplesorts
>>> express their boolean as "reversed". We don't necessarily need to
>>> follow that, but consistency is good for readability.
>>
>> I do not know if "reversed sorting order" is more idiomatic than "ascending sorting order". If you think it is - let's switch argument's name to "reversed".
>
> After sleeping on it, I actually think it's mildly ridiculous for this
> module to force the comparator to know about the sort direction.
> Tuplesorts must do that because each sort key could have a different
> sort order. There is only one place in intarray that wants reversed
> order -- maybe that place should reverse elements itself? It's fine to
> keep thing as they are if the sort function stays private to intarray,
> but this patch creates a global function, where the "ascending"
> parameter is just noise. And if we don't have large int32 sorts
> outside of intarray, then the path of least resistance may be to keep
> it private.

We could use global function for oid lists which may be arbitrary large.
But if you think that local intarray function is better - let's go that route.

> I had a look at the files touched by this patch and noticed that there
> is another sort used for making arrays unique. Were you going to look
> at that as well?

Done.

Best regards, Andrey Borodin.

Attachment Content-Type Size
v5-0001-Use-specialized-sort-facilities.patch application/octet-stream 5.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ilia Evdokimov 2024-12-09 13:10:01 Define STATS_MIN_ROWS for minimum rows of stats in ANALYZE
Previous Message Tomas Vondra 2024-12-09 12:47:49 Re: Refactoring postmaster's code to cleanup after child exit