From: | Jose Luis Tallon <jltallon(at)adv-solutions(dot)net> |
---|---|
To: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Subject: | Re: Using POPCNT and other advanced bit manipulation instructions |
Date: | 2018-12-20 10:58:43 |
Message-ID: | df95d1e8-8e63-256a-2d3e-ed2f8cff2f37@adv-solutions.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 20/12/18 6:53, David Rowley wrote:
> Back in 2016 [1] there was some discussion about using the POPCNT
> instruction to improve the performance of counting the number of bits
> set in a word. Improving this helps various cases, such as
> bms_num_members and also things like counting the allvisible and
> frozen pages in the visibility map.
>
> [snip]
>
> I've put together a very rough patch to implement using POPCNT and the
> leading and trailing 0-bit instructions to improve the performance of
> bms_next_member() and bms_prev_member(). The correct function should
> be determined on the first call to each function by way of setting a
> function pointer variable to the most suitable supported
> implementation. I've not yet gone through and removed all the
> number_of_ones[] arrays to replace with a pg_popcount*() call.
IMVHO: Please do not disregard potential optimization by the compiler
around those calls.. o_0 That might explain the reduced performance
improvement observed.
Not that I can see any obvious alternative to your implementation right
away ...
> That
> seems to have mostly been done in Thomas' patch [3], part of which
> I've used for the visibilitymap.c code changes. If this patch proves
> to be possible, then I'll look at including the other changes Thomas
> made in his patch too.
>
> What I'm really looking for by posting now are reasons why we can't do
> this. I'm also interested in getting some testing done on older
> machines, particularly machines with processors that are from before
> 2007, both AMD and Intel.
I can offer a 2005-vintage Opteron 2216 rev3 (bought late 2007) to test
on. Feel free to toss me some test code.
cpuinfo flags: fpu de tsc msr pae mce cx8 apic mca cmov pat clflush
mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow
rep_good nopl extd_apicid eagerfpu pni cx16 hypervisor lahf_lm
cmp_legacy 3dnowprefetch vmmcall
> 2007-2008 seems to be around the time both
> AMD and Intel added support for POPCNT and LZCNT, going by [4].
Thanks
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2018-12-20 11:49:58 | Re: Tid scan improvements |
Previous Message | Dmitry Dolgov | 2018-12-20 10:57:14 | Re: Using POPCNT and other advanced bit manipulation instructions |