From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Using POPCNT and other advanced bit manipulation instructions |
Date: | 2019-02-14 22:57:54 |
Message-ID: | 4022.1550185074@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> That leads me to the attached patch. It creates a new file
> pg_popcount.c which is the only one compiled with -mpopcnt (if
> available); if there's no compiler switch to enable POPCNT, we just
> don't compile the file. I'm not sure that's kosher -- in particular I'm
> not sure if it can fail when POPCNT is enabled by other flags and
> -mpopcnt is not needed at all. I think our c-compiler.m4 stuff is a bit
> too simplistic there: it just assumes that -mpopcnt is always required.
Yes, the configure test for this stuff is really pretty broken.
It's conflating two nearly independent questions: (1) does the compiler
have __builtin_popcount(), and (2) does the compiler accept -mpopcnt.
It is certainly the case that (1) may hold without (2); in fact, every
recent non-x86_64 gcc is a counterexample to how that's done in HEAD.
I think we need a clean test for __builtin_popcount(), and to be willing
to use it if available, independently of -mpopcnt. Then separately we
should test to see if -mpopcnt works, probably with the same
infrastructure we use for checking for other compiler flags, viz
# Optimization flags for specific files that benefit from vectorization
PGAC_PROG_CC_VAR_OPT(CFLAGS_VECTOR, [-funroll-loops])
PGAC_PROG_CC_VAR_OPT(CFLAGS_VECTOR, [-ftree-vectorize])
+ # Optimization flags for bit-twiddling
+ PGAC_PROG_CC_VAR_OPT(CFLAGS_POPCNT, [-mpopcnt])
# We want to suppress clang's unhelpful unused-command-line-argument warnings
Then the correct test to see if we want to build pg_popcount.c (BTW,
please pick a less generic name for that) and the choose function
is whether we have *both* HAVE__BUILTIN_POPCOUNT and nonempty
CFLAGS_POPCNT.
I don't think this'd be fooled by user-specified CFLAGS. The worst
possible outcome is that it builds a function that we intended would
use POPCNT but it's falling back to some other implementation, in
case the compiler has a switch named -mpopcnt but it doesn't do what
we think it does, or the user overrode things by adding -mno-popcnt.
That would really be nearly cost-free, other than the overhead of
the choose function the first time through: both of the execution
functions would be reducing to __builtin_popcount(), for whatever
version of that the compiler is giving us, so the choice wouldn't
matter.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2019-02-14 23:06:16 | Re: [PATCH] xlogreader: do not read a file block twice |
Previous Message | Alvaro Herrera | 2019-02-14 22:43:35 | Re: Using POPCNT and other advanced bit manipulation instructions |