From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Xiang Gao <Xiang(dot)Gao(at)arm(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: CRC32C Parallel Computation Optimization on ARM |
Date: | 2023-11-10 16:36:08 |
Message-ID: | 20231110163608.GA1225566@nathanxps13 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Nov 07, 2023 at 08:05:45AM +0000, Xiang Gao wrote:
> I think I understand what you mean, this is the latest patch. Thank you!
Thanks for the new patch.
+# PGAC_ARMV8_VMULL_INTRINSICS
+# ----------------------------
+# Check if the compiler supports the vmull_p64
+# intrinsic functions. These instructions
+# were first introduced in ARMv8 crypto Extension.
I wonder if it'd be better to call this PGAC_ARMV8_CRYPTO_INTRINSICS since
this check seems to indicate the presence of +crypto. Presumably there are
other instructions in this extension that could be used elsewhere, in which
case we could reuse this.
+# Use ARM VMULL if available and ARM CRC32C intrinsic is avaliable too.
+if test x"$USE_ARMV8_VMULL" = x"" && test x"$USE_ARMV8_VMULL_WITH_RUNTIME_CHECK" = x"" && (test x"$USE_ARMV8_CRC32C" = x"1" || test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"); then
+ if test x"$pgac_armv8_vmull_intrinsics" = x"yes" && test x"$CFLAGS_VMULL" = x""; then
+ USE_ARMV8_VMULL=1
+ else
+ if test x"$pgac_armv8_vmull_intrinsics" = x"yes"; then
+ USE_ARMV8_VMULL_WITH_RUNTIME_CHECK=1
+ fi
+ fi
+fi
I'm not sure I see the need to check USE_ARMV8_CRC32C* when setting these.
Couldn't we set them solely on the results of our
PGAC_ARMV8_VMULL_INTRINSICS check? It looks like this is what you are
doing in meson.build already.
+extern pg_crc32c pg_comp_crc32c_with_vmull_armv8(pg_crc32c crc, const void *data, size_t len);
nitpick: Maybe pg_comp_crc32_armv8_parallel()?
-# all versions of pg_crc32c_armv8.o need CFLAGS_CRC
-pg_crc32c_armv8.o: CFLAGS+=$(CFLAGS_CRC)
-pg_crc32c_armv8_shlib.o: CFLAGS+=$(CFLAGS_CRC)
-pg_crc32c_armv8_srv.o: CFLAGS+=$(CFLAGS_CRC)
Why are these lines deleted?
- ['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
+ ['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
What is the purpose of this change?
+__attribute__((target("+crc+crypto")))
I'm not sure we can assume that all compilers will understand this, and I'm
not sure we need it.
+ if (use_vmull)
+ {
+/*
+ * Crc32c parallel computation Input data is divided into three
+ * equal-sized blocks. Block length : 42 words(42 * 8 bytes).
+ * CRC0: 0 ~ 41 * 8,
+ * CRC1: 42 * 8 ~ (42 * 2 - 1) * 8,
+ * CRC2: 42 * 2 * 8 ~ (42 * 3 - 1) * 8.
+ */
Shouldn't we surround this with #ifdefs for USE_ARMV8_VMULL*?
if (pg_crc32c_armv8_available())
+ {
+#if defined(USE_ARMV8_VMULL)
+ pg_comp_crc32c = pg_comp_crc32c_with_vmull_armv8;
+#elif defined(USE_ARMV8_VMULL_WITH_RUNTIME_CHECK)
+ if (pg_vmull_armv8_available())
+ {
+ pg_comp_crc32c = pg_comp_crc32c_with_vmull_armv8;
+ }
+ else
+ {
+ pg_comp_crc32c = pg_comp_crc32c_armv8;
+ }
+#else
pg_comp_crc32c = pg_comp_crc32c_armv8;
+#endif
+ }
IMO it'd be better to move the #ifdefs into the functions so that we can
simplify this to something like
if (pg_crc32c_armv8_available())
{
if (pg_crc32c_armv8_crypto_available())
pg_comp_crc32c = pg_comp_crc32c_armv8_parallel;
else
pg_comp_crc32c = pg_comp_crc32c_armv8;
else
pc_comp_crc32c = pg_comp_crc32c_sb8;
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Nathan Bossart | 2023-11-10 16:43:32 | Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock |
Previous Message | Heikki Linnakangas | 2023-11-10 15:16:35 | Re: AdvanceXLInsertBuffers() vs wal_sync_method=open_datasync |