Re: Sort functions with specialized comparators

From: Антуан Виолин <violin(dot)antuan(at)gmail(dot)com>
To: Stepan Neretin <sncfmgg(at)gmail(dot)com>
Cc: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sort functions with specialized comparators
Date: 2024-07-15 05:22:16
Message-ID: CAFjUV9zqpY-ssRV2C8YOjErw28xcPJhUSjmakxd28xrGyQmy1A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> Hello all.
>
> I have decided to explore more areas in which I can optimize and have added
> two new benchmarks. Do you have any thoughts on this?
>
> postgres=# select bench_int16_sort(1000000);
> bench_int16_sort
>
>
> -----------------------------------------------------------------------------------------------------------------
> Time taken by usual sort: 66354981 ns, Time taken by optimized sort:
> 52151523 ns, Percentage difference: 21.41%
> (1 row)
>
> postgres=# select bench_float8_sort(1000000);
> bench_float8_sort
>
>
> ------------------------------------------------------------------------------------------------------------------
> Time taken by usual sort: 121475231 ns, Time taken by optimized sort:
> 74458545 ns, Percentage difference: 38.70%
> (1 row)
>

Hello all
We would like to see the relationship between the length of the sorted array
and the performance gain, perhaps some graphs. We also want to see to set a
"worst case" test, sorting the array in ascending order when it is initially
descending

Best, regards, Antoine Violin

postgres=#
>

On Mon, Jul 15, 2024 at 10:32 AM Stepan Neretin <sncfmgg(at)gmail(dot)com> wrote:

>
>
> On Sat, Jun 8, 2024 at 1:50 AM Stepan Neretin <sncfmgg(at)gmail(dot)com> wrote:
>
>> Hello all.
>>
>> I am interested in the proposed patch and would like to propose some
>> additional changes that would complement it. My changes would introduce
>> similar optimizations when working with a list of integers or object
>> identifiers. Additionally, my patch includes an extension for
>> benchmarking, which shows an average speedup of 30-40%.
>>
>> postgres=# SELECT bench_oid_sort(1000000);
>> bench_oid_sort
>>
>>
>> ----------------------------------------------------------------------------------------------------------------
>> Time taken by list_sort: 116990848 ns, Time taken by list_oid_sort:
>> 80446640 ns, Percentage difference: 31.24%
>> (1 row)
>>
>> postgres=# SELECT bench_int_sort(1000000);
>> bench_int_sort
>>
>>
>> ----------------------------------------------------------------------------------------------------------------
>> Time taken by list_sort: 118168506 ns, Time taken by list_int_sort:
>> 80523373 ns, Percentage difference: 31.86%
>> (1 row)
>>
>> What do you think about these changes?
>>
>> Best regards, Stepan Neretin.
>>
>> On Fri, Jun 7, 2024 at 11:08 PM Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru>
>> wrote:
>>
>>> Hi!
>>>
>>> In a thread about sorting comparators[0] Andres noted that we have
>>> infrastructure to help compiler optimize sorting. PFA attached PoC
>>> implementation. I've checked that it indeed works on the benchmark from
>>> that thread.
>>>
>>> postgres=# CREATE TABLE arrays_to_sort AS
>>> SELECT array_shuffle(a) arr
>>> FROM
>>> (SELECT ARRAY(SELECT generate_series(1, 1000000)) a),
>>> generate_series(1, 10);
>>>
>>> postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- original
>>> Time: 990.199 ms
>>> postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- patched
>>> Time: 696.156 ms
>>>
>>> The benefit seems to be on the order of magnitude with 30% speedup.
>>>
>>> There's plenty of sorting by TransactionId, BlockNumber, OffsetNumber,
>>> Oid etc. But this sorting routines never show up in perf top or something
>>> like that.
>>>
>>> Seems like in most cases we do not spend much time in sorting. But
>>> specialization does not cost us much too, only some CPU cycles of a
>>> compiler. I think we can further improve speedup by converting inline
>>> comparator to value extractor: more compilers will see what is actually
>>> going on. But I have no proofs for this reasoning.
>>>
>>> What do you think?
>>>
>>>
>>> Best regards, Andrey Borodin.
>>>
>>> [0]
>>> https://www.postgresql.org/message-id/flat/20240209184014.sobshkcsfjix6u4r%40awork3.anarazel.de#fc23df2cf314bef35095b632380b4a59
>>
>>
> Hello all.
>
> I have decided to explore more areas in which I can optimize and have
> added two new benchmarks. Do you have any thoughts on this?
>
> postgres=# select bench_int16_sort(1000000);
> bench_int16_sort
>
>
> -----------------------------------------------------------------------------------------------------------------
> Time taken by usual sort: 66354981 ns, Time taken by optimized sort:
> 52151523 ns, Percentage difference: 21.41%
> (1 row)
>
> postgres=# select bench_float8_sort(1000000);
> bench_float8_sort
>
>
> ------------------------------------------------------------------------------------------------------------------
> Time taken by usual sort: 121475231 ns, Time taken by optimized sort:
> 74458545 ns, Percentage difference: 38.70%
> (1 row)
>
> postgres=#
>
> Best regards, Stepan Neretin.
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-07-15 05:31:33 Re: Removing unneeded self joins
Previous Message Fujii Masao 2024-07-15 05:00:29 Re: MERGE/SPLIT partition commands should create new partitions in the parent's tablespace?