Re: Improve CRC32C performance on SSE4.2

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>
Subject: Re: Improve CRC32C performance on SSE4.2
Date: 2025-04-01 16:25:23
Message-ID: Z-wTcyia0z7i2Qnb@nathan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 01, 2025 at 05:33:02PM +0700, John Naylor wrote:
> On Thu, Mar 27, 2025 at 2:55 AM Devulapalli, Raghuveer <raghuveer(dot)devulapalli(at)intel(dot)com> wrote:
>> (1) zmm_regs_available() in pg_crc32c_sse42_choose.c is a duplicate of
>> pg_popcount_avx512.c but perhaps that’s fine for now. I will propose a
>> consolidated SIMD runtime check in a follow up patch.
>
> Yeah, I was thinking a small amount of duplication is tolerable.

+1. This is easy enough to change in the future if/when we add some sort
of centralized CPU feature detection.

>> (2) Might be apt to rename pg_crc32c_sse42*.c to pg_crc32c_x86*.c since
>> they contain both sse42 and avx512 versions.
>
> The name is now not quite accurate, but it's not exactly misleading
> either. I'm leaning towards keeping it the same, so for now I've just
> updated the header comment.

I'm not too worried about this one either. FWIW I'm likely going to look
into moving all the x86_64 popcount stuff into pg_popcount_avx512.c and
renaming it to pg_popcount_x86_64.c for v19. This would parallel
pg_popcount_aarch64.c a bit better, and a file per architecture seems like
a logical way to neatly organize things.

> For v16, I made another pass through made some more mostly superficial
> adjustments:
> - copied rewritten Meson comment to configure.ac
> - added some more #include guards out of paranoia
> - added tests with longer lengths that exercise the new paths
> - adjusted configure / meson messages for consistency
> - changed not-quite-accurate wording about "AVX-512 CRC instructions"
> - used "PCLMUL" only when talking about specific intrinsics and prefer
> "AVX-512" elsewhere, to head off potential future confusion with Arm
> PMULL.

I read through the code a couple of times and nothing stood out to me.

--
nathan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2025-04-01 16:51:53 Re: AIO v2.5
Previous Message Noah Misch 2025-04-01 16:07:27 Re: AIO v2.5