Re: Sort functions with specialized comparators

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Stepan Neretin <sndcppg(at)gmail(dot)com>
Cc: Stepan Neretin <sncfmgg(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Антуан Виолин <violin(dot)antuan(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sort functions with specialized comparators
Date: 2024-09-08 21:31:34
Message-ID: CAApHDvq9hAVbG376QHdewaESzKAHo-VBvGs=-RQrDVcGzKunrA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 9 Sept 2024 at 01:00, Stepan Neretin <sndcppg(at)gmail(dot)com> wrote:
> Hi, why do you think that I rejected Andrey's offer? I included his patch first in my own. Yes, patch 2-0006 is the only patch to which I have not attached any statistics and it looks really dubious. But the rest seem useful. Above, I attached a speed graph for one of the patches and tps(pgbench)

The difference with your patches and Andrey's patch is that he
included a benchmark which is targeted to the code he changed and his
results show a speed-up.

What it appears that you've done is made an assortment of changes and
picked the least effort thing that tests performance of something. You
by chance saw a performance increase so assumed it was due to your
changes.

> What do you think is the format in which to make benchmarks for my patches?

You'll need a benchmark that exercises the code you've changed to some
degree where it has a positive impact on performance. As far as I can
see, you've not done that yet.

Just to give you the benefit of the doubt, I applied all 10 v2 patches
and adjusted the call sites to add a NOTICE to include the size of the
array being sorted. Here is the result of running your benchmark:

$ pgbench -t1000 -d postgres
pgbench (18devel)
NOTICE: RelationGetIndexList 1
NOTICE: RelationGetStatExtList 0
NOTICE: RelationGetIndexList 3
NOTICE: RelationGetStatExtList 0
NOTICE: RelationGetIndexList 2
NOTICE: RelationGetStatExtList 0
NOTICE: RelationGetIndexList 1
NOTICE: RelationGetStatExtList 0
NOTICE: RelationGetIndexList 2
NOTICE: RelationGetStatExtList 0
starting vacuum...NOTICE: RelationGetIndexList 1
NOTICE: RelationGetIndexList 0
end.
NOTICE: RelationGetIndexList 1
NOTICE: RelationGetStatExtList 0
NOTICE: RelationGetIndexList 1
NOTICE: RelationGetStatExtList 0
NOTICE: RelationGetIndexList 1
NOTICE: RelationGetStatExtList 0
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 1000
number of transactions actually processed: 1000/1000
number of failed transactions: 0 (0.000%)
latency average = 0.915 ms
initial connection time = 23.443 ms
tps = 1092.326732 (without initial connection time)

Note that -t1000 shows the same number of notices as -t1.

So, it seems everything you've changed that runs in your benchmark is
RelationGetIndexList() and RelationGetStatExtList(). In one of the
calls to RelationGetIndexList() we're sorting up to a maximum of 3
elements.

Just to be clear, I'm not stating that I think all of your changes are
useless. If you want these patches accepted, then you're going to need
to prove they're useful and you've not done that.

Also, unless Andrey is happy for you to tag onto the work he's doing,
I'd suggest another thread for that work. I don't think there's any
good reason for that work to delay Andrey's work.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-09-08 21:32:16 Test improvements and minor code fixes for formatting.c.
Previous Message Thomas Munro 2024-09-08 20:55:31 Re: CI, macports, darwin version problems