Re: CRC32C Parallel Computation Optimization on ARM

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Xiang Gao <Xiang(dot)Gao(at)arm(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: CRC32C Parallel Computation Optimization on ARM
Date: 2023-11-10 16:36:08
Message-ID: 20231110163608.GA1225566@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 07, 2023 at 08:05:45AM +0000, Xiang Gao wrote:
> I think I understand what you mean, this is the latest patch. Thank you!

Thanks for the new patch.

+# PGAC_ARMV8_VMULL_INTRINSICS
+# ----------------------------
+# Check if the compiler supports the vmull_p64
+# intrinsic functions. These instructions
+# were first introduced in ARMv8 crypto Extension.

I wonder if it'd be better to call this PGAC_ARMV8_CRYPTO_INTRINSICS since
this check seems to indicate the presence of +crypto. Presumably there are
other instructions in this extension that could be used elsewhere, in which
case we could reuse this.

+# Use ARM VMULL if available and ARM CRC32C intrinsic is avaliable too.
+if test x"$USE_ARMV8_VMULL" = x"" && test x"$USE_ARMV8_VMULL_WITH_RUNTIME_CHECK" = x"" && (test x"$USE_ARMV8_CRC32C" = x"1" || test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"); then
+ if test x"$pgac_armv8_vmull_intrinsics" = x"yes" && test x"$CFLAGS_VMULL" = x""; then
+ USE_ARMV8_VMULL=1
+ else
+ if test x"$pgac_armv8_vmull_intrinsics" = x"yes"; then
+ USE_ARMV8_VMULL_WITH_RUNTIME_CHECK=1
+ fi
+ fi
+fi

I'm not sure I see the need to check USE_ARMV8_CRC32C* when setting these.
Couldn't we set them solely on the results of our
PGAC_ARMV8_VMULL_INTRINSICS check? It looks like this is what you are
doing in meson.build already.

+extern pg_crc32c pg_comp_crc32c_with_vmull_armv8(pg_crc32c crc, const void *data, size_t len);

nitpick: Maybe pg_comp_crc32_armv8_parallel()?

-# all versions of pg_crc32c_armv8.o need CFLAGS_CRC
-pg_crc32c_armv8.o: CFLAGS+=$(CFLAGS_CRC)
-pg_crc32c_armv8_shlib.o: CFLAGS+=$(CFLAGS_CRC)
-pg_crc32c_armv8_srv.o: CFLAGS+=$(CFLAGS_CRC)

Why are these lines deleted?

- ['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
+ ['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],

What is the purpose of this change?

+__attribute__((target("+crc+crypto")))

I'm not sure we can assume that all compilers will understand this, and I'm
not sure we need it.

+ if (use_vmull)
+ {
+/*
+ * Crc32c parallel computation Input data is divided into three
+ * equal-sized blocks. Block length : 42 words(42 * 8 bytes).
+ * CRC0: 0 ~ 41 * 8,
+ * CRC1: 42 * 8 ~ (42 * 2 - 1) * 8,
+ * CRC2: 42 * 2 * 8 ~ (42 * 3 - 1) * 8.
+ */

Shouldn't we surround this with #ifdefs for USE_ARMV8_VMULL*?

if (pg_crc32c_armv8_available())
+ {
+#if defined(USE_ARMV8_VMULL)
+ pg_comp_crc32c = pg_comp_crc32c_with_vmull_armv8;
+#elif defined(USE_ARMV8_VMULL_WITH_RUNTIME_CHECK)
+ if (pg_vmull_armv8_available())
+ {
+ pg_comp_crc32c = pg_comp_crc32c_with_vmull_armv8;
+ }
+ else
+ {
+ pg_comp_crc32c = pg_comp_crc32c_armv8;
+ }
+#else
pg_comp_crc32c = pg_comp_crc32c_armv8;
+#endif
+ }

IMO it'd be better to move the #ifdefs into the functions so that we can
simplify this to something like

if (pg_crc32c_armv8_available())
{
if (pg_crc32c_armv8_crypto_available())
pg_comp_crc32c = pg_comp_crc32c_armv8_parallel;
else
pg_comp_crc32c = pg_comp_crc32c_armv8;
else
pc_comp_crc32c = pg_comp_crc32c_sb8;

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-11-10 16:43:32 Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Previous Message Heikki Linnakangas 2023-11-10 15:16:35 Re: AdvanceXLInsertBuffers() vs wal_sync_method=open_datasync