Quick Links

RE: SIMD optimization for list_sort

From:	"R, Rakshit" <rakshit(dot)r(at)intel(dot)com>
To:	"Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>
Cc:	Andrei Lepikhov <lepihov(at)gmail(dot)com>, "Giacchino, Luca" <luca(dot)giacchino(at)intel(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>, "Paul, Sourav Kumar" <sourav(dot)kumar(dot)paul(at)intel(dot)com>
Subject:	RE: SIMD optimization for list_sort
Date:	2025-02-28 05:43:16
Message-ID:	CO1PR11MB5060E34D88194334B7F708E9EDCC2@CO1PR11MB5060.namprd11.prod.outlook.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi All,

Thank you so much for the feedback.

> I don't think "another extension might use it someday" makes a very strong case,
> particularly for something that requires a new dependency.

The x86-simdsort library is an optional dependency in Postgres. Also the new list sort implementation which uses the x86-simdsort library does not impact any of the existing workflows in Postgres. We have at least one use case where Pgvector gets a gain of 7 - 10 % in HNSW index build time using the SIMD sort implementation. Hence we feel that this patch could be up streamed further. Open to further discussion on the same.

> Note that our current implemention is highly optimized for low-cardinality inputs.
> This is needed for aggregate queries. I found this write-up of a
> couple scalar and vectorized sorts, and they show this library doing
> very poorly on very-low cardinality inputs. I would look into that
> before trying to get something in shape to share.
>
> https://github.com/Voultapher/sort-research-
> rs/blob/main/writeup/intel_avx512/text.md

We ran our extension to stress list sort with low cardinality inputs. For eg, for an array of size 100k having repeated values in the range 1-10 we still see a gain of around 20% in throughput.
We will collect more data for low cardinality inputs and with AVX2 too.

Thanks and regards
Rakshit Rajeev

In response to

RE: SIMD optimization for list_sort at 2025-02-20 17:25:49 from Devulapalli, Raghuveer

Responses

Re: SIMD optimization for list_sort at 2025-02-28 12:36:08 from John Naylor

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2025-02-28 05:43:17	Re: Remove extra Sort node above a btree-compatible Index Scan
Previous Message	Hayato Kuroda (Fujitsu)	2025-02-28 05:41:59	RE: ReplicationSlotRelease() crashes when the instance is in the single user mode