From: | "Amonson, Paul D" <paul(dot)d(dot)amonson(at)intel(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Matthias van de Meent" <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: Popcount optimization using AVX512 |
Date: | 2024-03-04 21:39:36 |
Message-ID: | BL1PR11MB5304BFE26BAC25624508CFD2DC232@BL1PR11MB5304.namprd11.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
First, apologies on the patch. Find re-attached updated version.
Now I have some questions....
#1
> -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
> +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31)
> +<< 31))
>
> IME this means that the autoconf you are using has been patched. A quick search on the mailing lists seems to indicate that it might be specific to Debian [1].
I am not sure what the ask is here? I made changes to the configure.ac and ran autoconf2.69 to get builds to succeed. Do you have a separate feedback here?
#2
As for the refactoring, this was done to satisfy previous review feedback about applying the AVX512 CFLAGS to the entire pg_bitutils.c file. Mainly to avoid segfault due to the AVX512 flags. If its ok, I would prefer to make a single commit as the change is pretty small and straight forward.
#3
I am not sure I understand the comment about the SIZE_VOID_P checks. Aren't they necessary to choose which functions to call based on 32 or 64 bit architectures?
#4
Would this change qualify for Workflow A as described in [0] and can be picked up by a committer, given it has been reviewed by multiple committers so far? The scope of the change is pretty contained as well.
[0] https://wiki.postgresql.org/wiki/Submitting_a_Patch
Thanks,
Paul
-----Original Message-----
From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Sent: Friday, March 1, 2024 1:45 PM
To: Amonson, Paul D <paul(dot)d(dot)amonson(at)intel(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>; Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>; Shankaran, Akash <akash(dot)shankaran(at)intel(dot)com>; Noah Misch <noah(at)leadboat(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>; pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Popcount optimization using AVX512
Thanks for the new version of the patch. I didn't see a commitfest entry for this one, and unfortunately I think it's too late to add it for the March commitfest. I would encourage you to add it to July's commitfest [0] so that we can get some routine cfbot coverage.
On Tue, Feb 27, 2024 at 08:46:06PM +0000, Amonson, Paul D wrote:
> After consulting some Intel internal experts on MSVC the linking issue
> as it stood was not resolved. Instead, I created a MSVC ONLY work-around.
> This adds one extra functional call on the Windows builds (The linker
> resolves a real function just fine but not a function pointer of the
> same name). This extra latency does not exist on any of the other
> platforms. I also believe I addressed all issues raised in the
> previous reviews. The new pg_popcnt_x86_64_accel.c file is now the
> ONLY file compiled with the
> AVX512 compiler flags. I added support for the MSVC compiler flag as
> well. Both meson and autoconf are updated with the new refactor.
>
> I am attaching the new patch.
I think this patch might be missing the new files.
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31)
+<< 31))
IME this means that the autoconf you are using has been patched. A quick search on the mailing lists seems to indicate that it might be specific to Debian [1].
-static int pg_popcount32_slow(uint32 word);
-static int pg_popcount64_slow(uint64 word);
+int pg_popcount32_slow(uint32 word);
+int pg_popcount64_slow(uint64 word);
+uint64 pg_popcount_slow(const char *buf, int bytes);
This patch appears to do a lot of refactoring. Would it be possible to break out the refactoring parts into a prerequisite patch that could be reviewed and committed independently from the AVX512 stuff?
-#if SIZEOF_VOID_P >= 8
+#if SIZEOF_VOID_P == 8
/* Process in 64-bit chunks if the buffer is aligned. */
- if (buf == (const char *) TYPEALIGN(8, buf))
+ if (buf == (const char *)TYPEALIGN(8, buf))
{
- const uint64 *words = (const uint64 *) buf;
+ const uint64 *words = (const uint64 *)buf;
while (bytes >= 8)
{
@@ -309,9 +213,9 @@ pg_popcount(const char *buf, int bytes)
bytes -= 8;
}
- buf = (const char *) words;
+ buf = (const char *)words;
}
-#else
+#elif SIZEOF_VOID_P == 4
/* Process in 32-bit chunks if the buffer is aligned. */
if (buf == (const char *) TYPEALIGN(4, buf))
{
Most, if not all, of these changes seem extraneous. Do we actually need to more strictly check SIZEOF_VOID_P?
[0] https://commitfest.postgresql.org/48/
[1] https://postgr.es/m/20230211020042.uthdgj72kp3xlqam%40awork3.anarazel.de
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Add-support-for-AVX512-implemented-POPCNT.patch | application/octet-stream | 31.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Nathan Bossart | 2024-03-04 21:50:48 | Re: Improve the log message output of basic_archive when basic_archive.archive_directory parameter is not set |
Previous Message | Daniel Gustafsson | 2024-03-04 21:03:13 | Re: Adding deprecation notices to pgcrypto documentation |