From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | "Amonson, Paul D" <paul(dot)d(dot)amonson(at)intel(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Popcount optimization using AVX512 |
Date: | 2024-03-01 21:44:57 |
Message-ID: | 20240301214457.GA3024365@nathanxps13 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thanks for the new version of the patch. I didn't see a commitfest entry
for this one, and unfortunately I think it's too late to add it for the
March commitfest. I would encourage you to add it to July's commitfest [0]
so that we can get some routine cfbot coverage.
On Tue, Feb 27, 2024 at 08:46:06PM +0000, Amonson, Paul D wrote:
> After consulting some Intel internal experts on MSVC the linking issue as
> it stood was not resolved. Instead, I created a MSVC ONLY work-around.
> This adds one extra functional call on the Windows builds (The linker
> resolves a real function just fine but not a function pointer of the same
> name). This extra latency does not exist on any of the other platforms. I
> also believe I addressed all issues raised in the previous reviews. The
> new pg_popcnt_x86_64_accel.c file is now the ONLY file compiled with the
> AVX512 compiler flags. I added support for the MSVC compiler flag as
> well. Both meson and autoconf are updated with the new refactor.
>
> I am attaching the new patch.
I think this patch might be missing the new files.
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
IME this means that the autoconf you are using has been patched. A quick
search on the mailing lists seems to indicate that it might be specific to
Debian [1].
-static int pg_popcount32_slow(uint32 word);
-static int pg_popcount64_slow(uint64 word);
+int pg_popcount32_slow(uint32 word);
+int pg_popcount64_slow(uint64 word);
+uint64 pg_popcount_slow(const char *buf, int bytes);
This patch appears to do a lot of refactoring. Would it be possible to
break out the refactoring parts into a prerequisite patch that could be
reviewed and committed independently from the AVX512 stuff?
-#if SIZEOF_VOID_P >= 8
+#if SIZEOF_VOID_P == 8
/* Process in 64-bit chunks if the buffer is aligned. */
- if (buf == (const char *) TYPEALIGN(8, buf))
+ if (buf == (const char *)TYPEALIGN(8, buf))
{
- const uint64 *words = (const uint64 *) buf;
+ const uint64 *words = (const uint64 *)buf;
while (bytes >= 8)
{
@@ -309,9 +213,9 @@ pg_popcount(const char *buf, int bytes)
bytes -= 8;
}
- buf = (const char *) words;
+ buf = (const char *)words;
}
-#else
+#elif SIZEOF_VOID_P == 4
/* Process in 32-bit chunks if the buffer is aligned. */
if (buf == (const char *) TYPEALIGN(4, buf))
{
Most, if not all, of these changes seem extraneous. Do we actually need to
more strictly check SIZEOF_VOID_P?
[0] https://commitfest.postgresql.org/48/
[1] https://postgr.es/m/20230211020042.uthdgj72kp3xlqam%40awork3.anarazel.de
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Jacob Champion | 2024-03-01 21:49:12 | Re: Experiments with Postgres and SSL |
Previous Message | Thomas Munro | 2024-03-01 21:23:37 | Re: pread, pwrite, etc return ssize_t not int |