From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | pgsql-committers(at)lists(dot)postgresql(dot)org |
Subject: | Re: pgsql: Fix compiler builtin usage in new pg_bitutils.c |
Date: | 2019-02-16 04:30:33 |
Message-ID: | 31413.1550291433@sss.pgh.pa.us |
Lists: | pgsql-committers |
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> (BTW, my reading of the articles I cited, as well as my own runs of the
> test programs therein, suggest that in order to get a really good
> performance improvement you need to hand-code calls to the POPCNT
> instruction in assembly rather than rely on the compiler intrinsics.
That observation led me to think about using asm() instead of
__builtin_popcount + -mpopcnt, and I realized there are several
fewer moving parts if we do it that way: we don't need to worry
about the compiler switch, and we don't need to rely on faith that
it actually changes the emitted code, and we don't need a separate
source file to limit the scope of the switch. And really, requiring
__builtin_popcount + -mpopcnt is pretty much restricting the
optimization to GCC-alikes anyway, so requiring asm() probably
doesn't eliminate any toolchains that would've handled the other way.

Hence, I made it work like that. Committed with that and some cosmetic
cleanups.

regards, tom lane
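
For readers following along, the asm() approach described above can be sketched roughly as follows. This is an illustration only, not the committed pg_bitutils.c code: the names popcount32_sketch, popcount32_asm, popcount32_slow, cpu_has_popcnt and HAVE_POPCNT_ASM_SKETCH are hypothetical, and the sketch assumes a GCC-compatible compiler targeting x86-64 (anything else takes the portable loop). Because the instruction is written out in the asm() statement, no -mpopcnt switch and no separate source file are needed; a CPUID check decides at run time whether the instruction may be used.

```c
#include <stdint.h>

/* Assumption: GCC-style inline asm and <cpuid.h> are available on x86-64. */
#if defined(__GNUC__) && defined(__x86_64__)
#include <cpuid.h>
#define HAVE_POPCNT_ASM_SKETCH 1
#endif

/* Portable fallback: clear the lowest set bit until none remain. */
static int
popcount32_slow(uint32_t word)
{
    int count = 0;

    while (word != 0)
    {
        word &= word - 1;
        count++;
    }
    return count;
}

#ifdef HAVE_POPCNT_ASM_SKETCH
/* Emit POPCNT directly; the compiler does not need -mpopcnt for this. */
static int
popcount32_asm(uint32_t word)
{
    uint32_t res;

    __asm__ __volatile__("popcntl %1, %0" : "=r"(res) : "rm"(word) : "cc");
    return (int) res;
}

/* CPUID leaf 1 reports POPCNT support in bit 23 of ECX. */
static int
cpu_has_popcnt(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (ecx & (1u << 23)) != 0;
}
#endif

/* Hypothetical entry point: use the instruction when the CPU has it. */
static int
popcount32_sketch(uint32_t word)
{
#ifdef HAVE_POPCNT_ASM_SKETCH
    static int have_popcnt = -1;    /* -1 means "not checked yet" */

    if (have_popcnt < 0)
        have_popcnt = cpu_has_popcnt();
    if (have_popcnt)
        return popcount32_asm(word);
#endif
    return popcount32_slow(word);
}
```

Calling popcount32_sketch(0xF0F0) counts the eight set bits by either path. Caching the CPUID result in a static variable keeps the check off the hot path; a function pointer chosen on first call would be another way to arrange the same dispatch.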