Re: Sort functions with specialized comparators

From: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Антуан Виолин <violin(dot)antuan(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sort functions with specialized comparators
Date: 2024-12-15 17:58:09
Message-ID: B2A50C15-1B3B-4B53-99F7-E1FF88E58121@yandex-team.ru
Lists: pgsql-hackers

> On 11 Dec 2024, at 11:39, John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>
> On Mon, Dec 9, 2024 at 8:02 PM Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>>
>> I think commit message states that it's better to opt-in for interruptible sort. So I do not think making sort interruptible is a blocker for making global specialized sorting routines.
>
> There is a difference, though -- that commit had a number of uses for
> it immediately. In my view, there is no reason to have a global
> interruptible sort that's only used by one contrib module. YAGNI
>
> Also, I was hoping to get an answer for how this would actually affect
> intarray use you've seen in the wild. If the answer is "I don't know
> of anyone who uses this either", then I'm actually starting to wonder
> if the speed matters at all. Maybe all uses are for a few hundred or
> thousand integers, in which case the sort time is trivial anyway?

I do not have access to user data in most clusters... I remember only one particular case: tags and folders applied to mail messages are represented by an int array, mostly for GIN search. In that case the vast majority of these arrays have 0-10 elements, and a frequently accessed fraction has 10-1000. Only robots (service accounts) can have millions, and in their case latency does not matter at all.
But this particular case also does not trigger sorting much: arrays are stored sorted and modifications are infrequent. In most cases sorting is invoked on already sorted or almost sorted input.
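
To illustrate why already sorted input makes the sort cost mostly irrelevant here, below is a minimal sketch of a linear pre-check that skips the sort entirely. The helper names (is_sorted_int32, cmp_int32_asc, sort_int32_if_needed) are made up for this example and are not taken from intarray or from any patch in this thread; plain qsort() stands in for whatever sort routine is actually used.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: true if v[0..n-1] is in non-decreasing order. */
static bool
is_sorted_int32(const int32_t *v, size_t n)
{
    for (size_t i = 1; i < n; i++)
    {
        if (v[i - 1] > v[i])
            return false;
    }
    return true;
}

/* Ascending comparator for signed 32-bit integers, written to avoid overflow. */
static int
cmp_int32_asc(const void *a, const void *b)
{
    int32_t x = *(const int32_t *) a;
    int32_t y = *(const int32_t *) b;

    return (x > y) - (x < y);
}

/*
 * Hypothetical wrapper: sort only when the O(n) check finds disorder.
 * Returns true if a sort was actually performed.
 */
static bool
sort_int32_if_needed(int32_t *v, size_t n)
{
    if (is_sorted_int32(v, n))
        return false;           /* common case described above: nothing to do */

    qsort(v, n, sizeof(int32_t), cmp_int32_asc);
    return true;
}
```

With arrays that are stored sorted and rarely modified, nearly every call would take the early return, so the choice of sort algorithm would only matter for the rare unsorted input.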

So yeah, from a practical point of view, cosmetic reasons seem to be the most important :)

>> We could use a global function for oid lists, which may be arbitrarily large.
>
> BTW, oids are unsigned. (See the 0002 patch from Thomas M. I linked to earlier)

Seems like we cannot reuse the same function...
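
A small sketch of the signed versus unsigned problem, assuming plain qsort() and hypothetical helper names (cmp_int32, cmp_uint32, sort_as_int32, sort_as_uint32) that do not come from the patch: the same 32-bit pattern 0x80000000 is the smallest value when read as signed and one of the largest when read as unsigned, so a comparator specialized for int32 would misorder oids above 2^31 - 1.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Comparator for signed 32-bit values (the element type intarray works with). */
static int
cmp_int32(const void *a, const void *b)
{
    int32_t x = *(const int32_t *) a;
    int32_t y = *(const int32_t *) b;

    return (x > y) - (x < y);
}

/* Comparator for unsigned 32-bit values (oids are unsigned). */
static int
cmp_uint32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *) a;
    uint32_t y = *(const uint32_t *) b;

    return (x > y) - (x < y);
}

/* Hypothetical wrappers, one per element type. */
static void
sort_as_int32(int32_t *v, size_t n)
{
    qsort(v, n, sizeof(int32_t), cmp_int32);
}

static void
sort_as_uint32(uint32_t *v, size_t n)
{
    qsort(v, n, sizeof(uint32_t), cmp_uint32);
}
```

Sorting {1, 0x80000000, 2} with sort_as_uint32 leaves 0x80000000 last, while the same bits sorted with sort_as_int32 (where they mean INT32_MIN) would come first.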

So, let's make the function private to intarray and try to remove as much code as possible?

Best regards, Andrey Borodin.
