Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

From: Andres Freund <andres(at)anarazel(dot)de>
To: "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>
Subject: Re: Proposal for Updating CRC32C with AVX-512 Algorithm.
Date: 2024-11-07 16:05:14
Message-ID: rxchh5kqwt4j7bwwgpugx75zq7kmslluqsmaxecw5bksypp2es@2zvbgtcbkf3s
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-10-30 21:03:20 +0000, Devulapalli, Raghuveer wrote:
> v6: Fixing build failure on Windows/MSVC.
>
> Raghuveer

> From b601e7b4ee9f25fd32e9d8d056bb20a03d755a8a Mon Sep 17 00:00:00 2001
> From: Paul Amonson <paul(dot)d(dot)amonson(at)intel(dot)com>
> Date: Mon, 6 May 2024 08:34:17 -0700
> Subject: [PATCH v6 1/6] Add a Postgres SQL function for crc32c testing.
>
> Signed-off-by: Paul Amonson <paul(dot)d(dot)amonson(at)intel(dot)com>
> Signed-off-by: Raghuveer Devulapalli <raghuveer(dot)devulapalli(at)intel(dot)com>
> ---
> src/test/modules/test_crc32c/Makefile | 20 +++++++++
> .../modules/test_crc32c/test_crc32c--1.0.sql | 1 +
> src/test/modules/test_crc32c/test_crc32c.c | 41 +++++++++++++++++++
> .../modules/test_crc32c/test_crc32c.control | 4 ++
> 4 files changed, 66 insertions(+)
> create mode 100644 src/test/modules/test_crc32c/Makefile
> create mode 100644 src/test/modules/test_crc32c/test_crc32c--1.0.sql
> create mode 100644 src/test/modules/test_crc32c/test_crc32c.c
> create mode 100644 src/test/modules/test_crc32c/test_crc32c.control

Needs to be integrated with the meson based build as well.

> +/*
> + * drive_crc32c(count: int, num: int) returns bigint
> + *
> + * count is the nuimber of loops to perform
> + *
> + * num is the number byte in the buffer to calculate
> + * crc32c over.
> + */
> +PG_FUNCTION_INFO_V1(drive_crc32c);
> +Datum
> +drive_crc32c(PG_FUNCTION_ARGS)
> +{
> + int64 count = PG_GETARG_INT64(0);
> + int64 num = PG_GETARG_INT64(1);
> + pg_crc32c crc = 0xFFFFFFFF;
> + const char* data = malloc((size_t)num);

This is computing a crc of uninitialized data. That's
a) undefined behaviour
b) means the return value is basically random
c) often will just CRC a lot of zeroes

> From da26645ec8515e0e6d91e2311a83c3bb6649017e Mon Sep 17 00:00:00 2001
> From: Paul Amonson <paul(dot)d(dot)amonson(at)intel(dot)com>
> Date: Tue, 23 Jul 2024 11:23:23 -0700
> Subject: [PATCH v6 2/6] Move all HW checks to common file.

Would be good to actually include a justification here.

> --- /dev/null
> +++ b/src/port/pg_hw_feat_check.c
> @@ -0,0 +1,159 @@
> +/*-------------------------------------------------------------------------
> + *
> + * pg_hw_feat_check.c
> + * Test for hardware features at runtime on x86_64 platforms.
> + *
> + * Copyright (c) 2024, PostgreSQL Global Development Group
> + *
> + * IDENTIFICATION
> + * src/port/pg_hw_feat_check.c
> + *
> + *-------------------------------------------------------------------------
> + */
> +#include "c.h"
> +
> +#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
> +#include <cpuid.h>
> +#endif
> +
> +#include <immintrin.h>
> +
> +#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
> +#include <intrin.h>
> +#endif
> +
> +#include "port/pg_hw_feat_check.h"
> +
> +/* Define names for EXX registers to avoid hard to see bugs in code below. */
> +typedef unsigned int exx_t;
> +typedef enum
> +{
> + EAX = 0,
> + EBX = 1,
> + ECX = 2,
> + EDX = 3
> +} reg_name;

Shouldn't this be in some x86 sepcific ifdef?

> +# PGAC_AVX512_CRC32_INTRINSICS
> +# ---------------------------
> +# Check if the compiler supports the x86 CRC instructions added in AVX-512,
> +# using the intrinsic functions:
> +
> +# (We don't test the 8-byte variant, _mm_crc32_u64, but it is assumed to
> +# be present if the other ones are, on x86-64 platforms)
> +#
> +# An optional compiler flag can be passed as arguments (e.g. -msse4.2
> +# -mavx512vl -mvpclmulqdq). If the intrinsics are supported, sets
> +# pgac_avx512_crc32_intrinsics, and CFLAGS_CRC.
> +AC_DEFUN([PGAC_AVX512_CRC32_INTRINSICS],
> +[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_crc32_intrinsics_$1])])dnl
> +AC_CACHE_CHECK([for _mm512_clmulepi64_epi128, _mm512_clmulepi64_epi128... with CFLAGS=$1], [Ac_cachevar],
> +[pgac_save_CFLAGS=$CFLAGS
> +CFLAGS="$pgac_save_CFLAGS $1"
> +AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>],
> + [const unsigned long k1k2[[8]] = {
> + 0xdcb17aa4, 0xb9e02b86, 0xdcb17aa4, 0xb9e02b86,
> + 0xdcb17aa4, 0xb9e02b86, 0xdcb17aa4, 0xb9e02b86};
> + unsigned char buffer[[512]];
> + unsigned char *aligned = (unsigned char*)(((size_t)buffer + 64L) & 0xffffffffffc0L);
> + unsigned long val;
> + __m512i x0, x1, x2, x3, x4, x5, x6, x7, x8, y5, y6, y7, y8;
> + __m128i a1, a2;
> + unsigned int crc = 0xffffffff;
> + y8 = _mm512_load_si512((__m512i *)aligned);
> + x0 = _mm512_loadu_si512((__m512i *)k1k2);
> + x1 = _mm512_loadu_si512((__m512i *)(buffer + 0x00));
> + x1 = _mm512_xor_si512(x1, _mm512_castsi128_si512(_mm_cvtsi32_si128(crc)));
> + x5 = _mm512_clmulepi64_epi128(x1, x0, 0x00);
> + x1 = _mm512_ternarylogic_epi64(x1, x5, y5, 0x96);
> + a1 = _mm512_extracti32x4_epi32(x1, 3);
> + a1 = _mm_xor_epi64(a1, _mm512_castsi512_si128(x0));
> + x0 = _mm512_shuffle_i64x2(x1, x1, 0x4E);
> + val = _mm_crc32_u64(0, _mm_extract_epi64(a1, 0));
> + crc = (unsigned int)_mm_crc32_u64(val, _mm_extract_epi64(a1, 1));
> + return crc != 0;])],
> + [Ac_cachevar=yes],
> + [Ac_cachevar=no])
> +CFLAGS="$pgac_save_CFLAGS"])
> +if test x"$Ac_cachevar" = x"yes"; then
> + CFLAGS_CRC="$1"
> + pgac_avx512_crc32_intrinsics=yes
> +fi
> +undefine([Ac_cachevar])dnl
> +])# PGAC_AVX512_CRC32_INTRINSICS
> +

Why is all this stuff needed inside a configure check? We don't need to check
entire algorithms to check if we can build and link sepcific instructions, no?

> From a495124ee42cb8f9f206f719b9f2235aff715963 Mon Sep 17 00:00:00 2001
> From: Nathan Bossart <nathan(at)postgresql(dot)org>
> Date: Wed, 16 Oct 2024 15:57:55 -0500
> Subject: [PATCH v6 5/6] use __attribute__((target(...))) for AVX-512 stuff

Huh, so now we're undoing a bunch of stuff done earlier. Makes this series
pretty hard to review.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Wieck 2024-11-07 16:09:47 Re: Commit Timestamp and LSN Inversion issue
Previous Message Fujii Masao 2024-11-07 15:54:58 Re: Fix for Extra Parenthesis in pgbench progress message