Quick Links

Re: Macro customizable hashtable / bitmapscan & aggregation perf

From:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Macro customizable hashtable / bitmapscan & aggregation perf
Date:	2016-10-01 18:19:21
Message-ID:	41a178d4-a599-60e6-1657-e2489508858a@2ndquadrant.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 10/01/2016 02:44 AM, Andres Freund wrote:
> Hi,
>
> On 2016-07-26 17:43:33 -0700, Andres Freund wrote:
>> In the attached patch I've attached simplehash.h, which can be
>> customized by a bunch of macros, before being inlined. There's also a
>> patch using this for tidbitmap.c and nodeAgg/nodeSubplan/... via
>> execGrouping.c.
>
> Attached is a significantly improved version of this series. The main
> changes are:
>
> - The hash table now uses robin-hood style hash collision handling. See the
> commit message of 0002 (or simplehash.h) for an introduction. That
> significantly reduces the impact of "clustering" due to linear
> addressing.

Interesting. Have you looked at cuckoo hashing? That seems to be the
go-to hashing in several database-related papers I've read recently, so
I guess there's a reason for that (IIRC it has constant worst case for
lookups, constant amortized time for inserts/deletes). Not sure how it
compares to robin-hood hashing regarding cache-friendliness though.

> - Significant comment and correctness fixes, both in simplehash.h
> - itself, and 0003/0004.
> - a lot of other random performance improvements for the hashing code.
>
>
> Unfortunately I'm running out battery right now, so I don't want to
> re-run the benchmarks posted upthread (loading takes a while). But the
> last time I ran them all the results after the patches were better than
> before.
>

Well, I have rather bad experience with running benchmarks on laptops
anyway - a lot of noise due to power management, etc. What about running
a bigger benchmark - say, TPC-DS - and evaluating the impact?

I think a crucial part of the benchmarking will be identifying and
measuring corner cases, e.g. a lot of rows with the same key, etc.
Although that probably is not a major issue for the two places switched
to the new implementation (e.g. execGrouping merges the duplicates to a
single group, by definition).

>
> This patch series currently consists out of four patches:
> - 0001 - add unlikely/likely() macros. The use here is not entirely
> mandatory, but they do provide a quite consistent improvement. There
> are other threads where they proved to be beneficial, so I see little
> reason not to introduce them.
> - 0002 - the customizable hashtable itself. It now contains documentation.
> - 0003 - convert tidbitmap.c to use the new hashmap. This includes a fix
> for the issue mentioned in [1], to improve peformance in heavily lossy
> scenarios (otherwise we could regress in some extreme cases)
> - 0004 - convert execGrouping.c, e.g. used by hash aggregates
>
>
> While not quite perfect yet, I do think this is at a state where input
> is needed. I don't want to go a lot deeper into this rabbit hole,
> before we have some agreement that this is a sensible course of action.
>

So, is it the right time to do some benchmarking?

>
> I think there's several possible additional users of simplehash.h,
> e.g. catcache.c - which has an open coded, not particularly fast hash
> table and frequently shows up in profiles - but I think the above two
> conversions are plenty to start with.
>

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Re: Macro customizable hashtable / bitmapscan & aggregation perf at 2016-10-01 00:44:54 from Andres Freund

Responses

Re: Macro customizable hashtable / bitmapscan & aggregation perf at 2016-10-01 19:59:56 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Greg Stark	2016-10-01 18:31:54	Re: Hash Indexes
Previous Message	Tom Lane	2016-10-01 18:01:31	Re: [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"