From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com> |
Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com> |
Subject: | Re: Improve CRC32C performance on SSE4.2 |
Date: | 2025-03-25 13:18:41 |
Message-ID: | CANWCAZY1Le1tpTZauY-JzbLpk=VSerP8=GZs36Cza9iJfRnn-A@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Mar 24, 2025 at 6:37 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>
> I'll take a look at the configure
> checks soon, since I had some questions there.
One other thing I forgot to mention: The previous test function had
local constants that the compiler was able to fold, resulting in no
actual vector instructions being emitted:
movabs rdx, 12884901891
xor eax, eax
crc32 rax, rdx
crc32 rax, rdx
ret
That may be okay for practical purposes, but in the spirit of commit
fdb5dd6331e30 I changed it in v15 to use global variables and made
sure it emits the vector instructions the function attributes are
intended to enable:
vmovdqu64 zmm3, ZMMWORD PTR x[rip]
xor eax, eax
vpclmulqdq zmm0, zmm3, ZMMWORD PTR y[rip], 0
vextracti32x4 xmm2, zmm0, 1
vmovdqa64 xmm1, xmm0
vmovdqu64 ZMMWORD PTR y[rip], zmm0
vextracti32x4 xmm0, zmm0, 2
vpternlogq xmm1, xmm2, xmm0, 150
vmovq rdx, xmm1
crc32 rax, rdx
vzeroupper
ret
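
To illustrate the difference between the two variants, here is a minimal
sketch. It is not the actual v15 configure test: it uses only the SSE 4.2
crc32 intrinsic rather than the AVX-512 ones, and every name and constant
in it (sketch_x, sketch_y, sketch_crc_from_locals, and so on) is
hypothetical. Whether the folding actually happens depends on the compiler
and optimization level.

```c
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <nmmintrin.h>

/*
 * Hypothetical inputs. They are non-static, non-const globals, so the
 * compiler is expected to load them at run time rather than assume
 * their values.
 */
uint64_t	sketch_x = 0x0000000300000003ULL;
uint64_t	sketch_y = 0x0000000100000002ULL;

/* Inputs are local constants: the XOR can be folded at compile time. */
__attribute__((target("sse4.2")))
uint32_t
sketch_crc_from_locals(void)
{
	const uint64_t x = 0x0000000300000003ULL;
	const uint64_t y = 0x0000000100000002ULL;

	return (uint32_t) _mm_crc32_u64(0, x ^ y);
}

/* Inputs are globals: the loads and the XOR should show up in the output. */
__attribute__((target("sse4.2")))
uint32_t
sketch_crc_from_globals(void)
{
	return (uint32_t) _mm_crc32_u64(0, sketch_x ^ sketch_y);
}

/* Runtime guard so callers never execute crc32 on a CPU without SSE 4.2. */
int
sketch_have_sse42(void)
{
	return __builtin_cpu_supports("sse4.2") != 0;
}

#else

/* Other platforms: nothing to demonstrate, so report "unsupported". */
uint32_t	sketch_crc_from_locals(void) { return 0; }
uint32_t	sketch_crc_from_globals(void) { return 0; }
int			sketch_have_sse42(void) { return 0; }

#endif
```

Both functions return the same value; the difference only shows in the
generated code, which can be inspected by compiling with something like
"gcc -O2 -S" and comparing the two function bodies.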
--
John Naylor
Amazon Web Services