From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | John Naylor <johncnaylorls(at)gmail(dot)com> |
Cc: | "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com> |
Subject: | Re: Improve CRC32C performance on SSE4.2 |
Date: | 2025-04-01 16:25:23 |
Message-ID: | Z-wTcyia0z7i2Qnb@nathan |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Apr 01, 2025 at 05:33:02PM +0700, John Naylor wrote:
> On Thu, Mar 27, 2025 at 2:55 AM Devulapalli, Raghuveer <raghuveer(dot)devulapalli(at)intel(dot)com> wrote:
>> (1) zmm_regs_available() in pg_crc32c_sse42_choose.c is a duplicate of
>> pg_popcount_avx512.c but perhaps that’s fine for now. I will propose a
>> consolidated SIMD runtime check in a follow up patch.
>
> Yeah, I was thinking a small amount of duplication is tolerable.
+1. This is easy enough to change in the future if/when we add some sort
of centralized CPU feature detection.
>> (2) Might be apt to rename pg_crc32c_sse42*.c to pg_crc32c_x86*.c since
>> they contain both sse42 and avx512 versions.
>
> The name is now not quite accurate, but it's not exactly misleading
> either. I'm leaning towards keeping it the same, so for now I've just
> updated the header comment.
I'm not too worried about this one either. FWIW I'm likely going to look
into moving all the x86_64 popcount stuff into pg_popcount_avx512.c and
renaming it to pg_popcount_x86_64.c for v19. This would parallel
pg_popcount_aarch64.c a bit better, and a file per architecture seems like
a logical way to neatly organize things.
> For v16, I made another pass through made some more mostly superficial
> adjustments:
> - copied rewritten Meson comment to configure.ac
> - added some more #include guards out of paranoia
> - added tests with longer lengths that exercise the new paths
> - adjusted configure / meson messages for consistency
> - changed not-quite-accurate wording about "AVX-512 CRC instructions"
> - used "PCLMUL" only when talking about specific intrinsics and prefer
> "AVX-512" elsewhere, to head off potential future confusion with Arm
> PMULL.
I read through the code a couple of times and nothing stood out to me.
--
nathan
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2025-04-01 16:51:53 | Re: AIO v2.5 |
Previous Message | Noah Misch | 2025-04-01 16:07:27 | Re: AIO v2.5 |