From: | "R, Rakshit" <rakshit(dot)r(at)intel(dot)com> |
---|---|
To: | "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com> |
Cc: | Andrei Lepikhov <lepihov(at)gmail(dot)com>, "Giacchino, Luca" <luca(dot)giacchino(at)intel(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>, "Paul, Sourav Kumar" <sourav(dot)kumar(dot)paul(at)intel(dot)com> |
Subject: | RE: SIMD optimization for list_sort |
Date: | 2025-02-28 05:43:16 |
Message-ID: | CO1PR11MB5060E34D88194334B7F708E9EDCC2@CO1PR11MB5060.namprd11.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi All,
Thank you so much for the feedback.
> I don't think "another extension might use it someday" makes a very strong case,
> particularly for something that requires a new dependency.
The x86-simdsort library is an optional dependency in Postgres. Also the new list sort implementation which uses the x86-simdsort library does not impact any of the existing workflows in Postgres. We have at least one use case where Pgvector gets a gain of 7 - 10 % in HNSW index build time using the SIMD sort implementation. Hence we feel that this patch could be up streamed further. Open to further discussion on the same.
> Note that our current implemention is highly optimized for low-cardinality inputs.
> This is needed for aggregate queries. I found this write-up of a
> couple scalar and vectorized sorts, and they show this library doing
> very poorly on very-low cardinality inputs. I would look into that
> before trying to get something in shape to share.
>
> https://github.com/Voultapher/sort-research-
> rs/blob/main/writeup/intel_avx512/text.md
We ran our extension to stress list sort with low cardinality inputs. For eg, for an array of size 100k having repeated values in the range 1-10 we still see a gain of around 20% in throughput.
We will collect more data for low cardinality inputs and with AVX2 too.
Thanks and regards
Rakshit Rajeev
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2025-02-28 05:43:17 | Re: Remove extra Sort node above a btree-compatible Index Scan |
Previous Message | Hayato Kuroda (Fujitsu) | 2025-02-28 05:41:59 | RE: ReplicationSlotRelease() crashes when the instance is in the single user mode |