From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com> |
Cc: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Xiang Gao <Xiang(dot)Gao(at)arm(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: CRC32C Parallel Computation Optimization on ARM |
Date: | 2025-03-18 11:50:59 |
Message-ID: | CANWCAZbdjPLkojSFo2kObBOsucvyExkAJ9rnTUneoAR=5mrQGQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Mar 13, 2025 at 12:50 AM Devulapalli, Raghuveer
<raghuveer(dot)devulapalli(at)intel(dot)com> wrote:
>
> > > Intel has contributed SSE4.2 CRC32C [1] and AVX-512 CRC32C [2] based on
> > similar techniques to postgres.
> >
> > ...this is a restatement of facts we already know. I'm guessing the intended
> > takeaway is "since Intel submitted an implementation to us based on paper A,
> > then we are free to separately also use a technique from paper B (which cites
> > patents)".
>
> Yes.
>
> > The original proposal that started this thread is below, and I'd like to give that
> > author credit for initiating that work
>
> Yup, that should be fine.
Thank you for confirming. I've attached v10, which has mostly
polishing and comment writing, and a draft commit message. The lookup
table and software carryless multiplication routine are still in
pg_crc32c_sb.c, which is now built unconditionally. That's good
foreshadowing of future pclmul/pmull support, as I've found building
that file everywhere makes some things simpler anyway. That file has
become a bit of a misnomer, and I've thought of renaming it to
*_common.c or perhaps *_fallback.c, since the addition from this
patch is still kind of a fallback where we won't have the hardware
needed for faster algorithms, as discussed elsewhere.
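
For readers unfamiliar with the term, a software carryless multiplication can be sketched roughly as below. This is only an illustration of the general technique (multiplication of polynomials over GF(2), where partial products are combined with XOR instead of addition), not the routine from the attached patch. The function name clmul32 and its 32-bit operand width are assumptions made for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: carryless multiply of two
 * 32-bit operands into a 64-bit product. For every set bit i in "b", a copy
 * of "a" shifted left by i is XORed into the result, so no carries ever
 * propagate between bit positions.
 */
static uint64_t
clmul32(uint32_t a, uint32_t b)
{
	uint64_t	result = 0;

	for (int i = 0; i < 32; i++)
	{
		if ((b >> i) & 1)
			result ^= (uint64_t) a << i;
	}

	return result;
}
```

Instructions such as pclmulqdq on x86 and pmull on Arm compute this kind of product in hardware, which is why a bitwise loop like this would only serve as a fallback.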
0002 and 0003 put the relevant parts into a header so that the hardware
details can be abstracted away. These would be squashed into 0001 before
commit, but I've kept them separate here for comparison.
--
John Naylor
Amazon Web Services
Attachment | Content-Type | Size |
---|---|---|
v10-0002-Use-template-file-for-parallel-CRC-computation.patch | text/x-patch | 8.2 KB |
v10-0001-Execute-hardware-CRC-computation-in-parallel.patch | text/x-patch | 19.8 KB |
v10-0003-Fix-headerscheck.patch | text/x-patch | 916 bytes |