Re: WIP: dynahash replacement for buffer table

From: Ants Aasma <ants(at)cybertec(dot)at>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: WIP: dynahash replacement for buffer table
Date: 2014-10-16 16:59:25
Message-ID: CA+CSw_uRPJY5WReBtrS3oiTX0CYqOPbJh1SUNgZdKrytA0OxMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Oct 15, 2014 7:32 PM, "Ants Aasma" <ants(at)cybertec(dot)at> wrote:
> I'm imagining a bucketized cuckoo hash with 5 item buckets (5-way
> associativity). This allows us to fit the bucket onto 2 regular sized
> cache lines and have 8 bytes left over. Buckets would be protected by
> seqlocks stored in the extra space. On the read side we would only
> need 2 read barriers (basically free on x86), and we are guaranteed to
> have an answer by fetching two pairs of cache lines. We can trade
> memory bandwidth for latency by issuing prefetches (once we add
> primitives for this). Alternatively we can trade insert speed for
> lookup speed by using asymmetrically sized tables.

I was thinking about this. It came to me that we might not even need
BufferTag's to be present in the hash table. In the hash table itself
we store just a combination of 4 byte buffer tag hash-buffer id. This
way we can store 8 pairs in one cacheline. Lookup must account for the
fact that due to hash collisions we may find multiple matches one of
which may be the buffer we are looking for. We can make condition
exceedingly unlikely by taking advantage of the fact that we have two
hashes anyway, in each table we use the lookup hash of the other table
for tagging. (32bit hash collision within 8 items). By having a
reasonably high load factor (75% should work) and 8 bytes per key we
even have a pretty good chance of fitting the hotter parts of the
buffer lookup table in CPU caches.

We use a separate array for the locks (spinlocks should be ok here)
for fine grained locking, this may be 1:1 with the buckets or we can
map many buckets to a single lock. Otherwise the scheme should work
mostly the same. Using this scheme we can get by on the lookup side
using 64 bit atomic reads with no barriers, we only need atomic ops
for pinning the found buffer.

I haven't had the chance to check out how second-chance hashing works
and if this scheme would be applicable to it.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2014-10-16 18:01:57 Re: UPSERT wiki page, and SQL MERGE syntax
Previous Message Ryan Johnson 2014-10-16 16:26:10 Re: WIP: dynahash replacement for buffer table