From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Антуан Виолин <violin(dot)antuan(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Sort functions with specialized comparators |
Date: | 2025-01-07 04:43:19 |
Message-ID: | CANWCAZa+O3QiNKW1YYTOusHL-vEFcSz2VH+-qPv1EfvCpc1b9w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jan 6, 2025 at 10:51 PM Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
> > On 6 Jan 2025, at 15:54, John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> > argument. Like some other patches in this series, this does have the
> > side effect of removing the ability to skip quinique(), so that should
> > be benchmarked (I can do that this week unless you beat me to it).
>
> With the same setup as in the first message of this thread we can do:
>
> postgres=# SELECT _int_contains(arr,ARRAY[1]) FROM arrays_to_sort;
>
> before patch patch
> Time: 567.928 ms
> after patch
> Time: 890.297 ms
> timing of this function is dominated by PREPAREARR(a);
>
> What bothers me is that PREPAREARR(a); is returning new array in case of empty input. That's why I considered little refactoring of resize_intArrayType(): reorder cases so that if (num == ARRNELEMS(a)) was first.
Hmm, I'm confused. First, none of the arrays are empty that I can see
-- am I missing something?
Then, the first message setup is
CREATE TABLE arrays_to_sort AS
SELECT array_shuffle(a) arr
FROM
(SELECT ARRAY(SELECT generate_series(1, 1000000)) a),
generate_series(1, 10);
...so most of the time is in sorting the big array, and I don't see a
regression, the opposite in fact:
SELECT _int_contains(arr,ARRAY[1]) FROM arrays_to_sort;
master:
1492.552 ms
v9:
873.697 ms
The case I was concerned about was if the big array was already sorted
and unique. Then, it's conceivable that unnecessarily running qunique
would make things slower, but I don't see that either:
--ordered
CREATE TABLE arrays_sorted AS
SELECT a arr
FROM
(SELECT ARRAY(SELECT generate_series(1, 1000000)) a),
generate_series(1, 10);
SELECT _int_contains(arr,ARRAY[1]) FROM arrays_sorted;
master:
31.388
v9:
28.247
--
John Naylor
Amazon Web Services
From | Date | Subject | |
---|---|---|---|
Next Message | Nisha Moond | 2025-01-07 04:55:50 | Re: Conflict detection for update_deleted in logical replication |
Previous Message | Tom Lane | 2025-01-07 03:29:07 | Re: allow changing autovacuum_max_workers without restarting |