From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: A population of population counts |
Date: | 2016-05-07 01:38:27 |
Message-ID: | CAKJS1f9yQ-V1U3=RoQk-cHqy670wGyTZFu82nm67S7wZ7SZtXw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 7 May 2016 at 12:41, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> Hi
>
> I noticed that we have three "number_of_ones" tables under contrib and
> two under src, and some new specially masked variants for visibility
> maps.
>
> Would it be an improvement if we just defined one table with external
> linkage, and accessed it via a macros/functions popcount_uint8, and
> wider versions _uint32, popcount_array(data, len) that sum the
> popcounts of their component bytes?
>
> Then there would be less duplication, and future opportunities to use
> fancy built-ins/assembler instructions/vectorisation in one central
> place, and to work in larger sizes than bytes.
>
> Perhaps we could also get rid of the new special masked popcount
> tables by masking the value we look up instead, eg walk through the
> page calling popcount_uint64(value & FROZEN_BITMASK_64).
>
> As for the rightmost_one_pos table in bitmapset.c, couldn't the
> various bms_XXX functions just use ffs(n) - 1 and work word-at-a-time?
> That generates a bsf instruction at -O2 on this machine.
>
> The micro-optimisation opportunities probably don't matter, but I
> wondered if it might at least be interesting to delete a bunch of
> code, and re-use a standard interface for something that apparently
> several modules need to do.
I remember looking at GCC's __builtin_popcount() about 6 months ago
with thoughts about using it for bms_num_members(). I did see
performance improvements from micro-benchmarks to compare it to the
number_of_ones[] array, and saw a small improvement. Likely the
improvements I did see with those would actually have been better in a
real world case, since not having a number_of_ones[] array in a CPU
cache might be more useful for a workload with a more contended cache.
I'd like to see us using those functions, when they're available and
falling back on the array when they're not. Sounds like that would
just be a new configure test. Perhaps a good home for some shared code
would be numutils.c. I see the functions there declared in builtins.h
which I see used in contrib/spi. I think it could all be made to work
quite nicely with types like bitmapword just with some preprocessor
trickery.
I'd say go for it. Sounds like a like these would be some nice
reusable functions that we already have a suitable home for.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2016-05-07 03:08:34 | Re: pgsql: Add TAP tests for pg_dump |
Previous Message | Kevin Grittner | 2016-05-07 01:28:27 | Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < |