RE: SIMD optimization for list_sort

From: "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>, "R, Rakshit" <rakshit(dot)r(at)intel(dot)com>
Cc: "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com>, "Andrei Lepikhov" <lepihov(at)gmail(dot)com>, "Giacchino, Luca" <luca(dot)giacchino(at)intel(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Paul, Sourav Kumar" <sourav(dot)kumar(dot)paul(at)intel(dot)com>
Subject: RE: SIMD optimization for list_sort
Date: 2025-03-01 06:23:39
Message-ID: PH0PR11MB5000105EBB6BE87074B9A553F2CF2@PH0PR11MB5000.namprd11.prod.outlook.com
Lists: pgsql-hackers

>> > I don't think "another extension might use it someday" makes a very strong case,
>> > particularly for something that requires a new dependency.
>>
>> The x86-simdsort library is an optional dependency in Postgres. Also the new list sort implementation which uses the x86-simdsort library does not impact any of the existing workflows in Postgres.

>"Optional" and "Does not impact" are not great selling points to get
>us to take a 1500 line patch. As we told you in November, list_sort
>isn't critical for us. You need to start with the user and work
>backwards to the technology. Don't pick a technology and try to sell
>people on using it.

Agreed on starting from the user problem and working backwards to the technology. As stated upthread, the problem being addressed here is optimizing pg_vector list_sort (which relies on postgres list_sort) to speed up HNSW index construction. The results show 7-10% improvements on vector workloads using the library.

Given that the same x86-simdsort library is intended to optimize both 1) list_sort and 2) tuple sort, it didn't make sense to integrate it into pg_vector for list_sort and then again into postgres for tuple sort.
We're trying to tackle list_sort and tuple sort in separate patches using the same x86-simdsort library, with the goal of optimizing both. Please let us know whether this approach is acceptable, whether the list_sort patch could be reviewed, and whether there are other tests we should run, or whether you would rather see tuple sort benchmarks.

- Akash Shankaran
