Re: allowing broader use of simplehash

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: allowing broader use of simplehash
Date:	2019-12-11 15:50:16
Message-ID:	CA+TgmoZLOE_hJ+OHWmQ906xuUFCF4+tc74-W1qDtrO0mJv=-Yg@mail.gmail.com
Lists:	pgsql-hackers

On Tue, Dec 10, 2019 at 4:59 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> 3) For lots of one-off uses of hashtables that aren't performance
> critical, we want a *simple* API. That IMO would mean that key/value
> end up being separately allocated pointers, and that just a
> comparator is provided when creating the hashtable.

I think the simplicity of the API is a key point. Some things that are
bothersome about dynahash:

- It knows about memory contexts and insists on having its own.
- You can't just use a hash table in shared memory; you have to
"attach" to it first and have an object in backend-private memory.
- The usual way of getting a shared hash table is ShmemInitHash(), but
that means that the hash table has its own named chunk and that it's
in the main shared memory segment. If you want to put it inside
another chunk or put it in DSM or whatever, it doesn't work.
- It knows about LWLocks and if it's a shared table it needs its own
tranche of them.
- hash_search() is hard to wrap your head around.

One thing I dislike about simplehash is that the #define-based
interface is somewhat hard to use. It's not that it's a bad design.
It's just that you have to sit down and think for a while to figure out
which things you need to #define in order to get it to do what you
want. I'm not sure that's something that can or needs to be fixed, but
it's something to consider. Even dynahash, as annoying as it is, is in
some ways easier to get up and running.

Probably the two most common use cases are: (1) a fixed-size shared
memory hash table of fixed-size entries where the key is the first N
bytes of the entry and it never grows, or (2) a backend-private or
perhaps frontend hash table of fixed-size entries where the key is the
first N bytes of the entry, and it grows without limit. I think we
should consider having specialized APIs for those two cases and then
more general APIs that you can use when that's not enough.
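
To make case (1) concrete, here is a minimal sketch of what such a
specialized API could look like. It is not taken from the PostgreSQL
tree: fixed_hash, fixed_hash_size(), fixed_hash_init(),
fixed_hash_insert() and fixed_hash_lookup() are hypothetical names, and
the FNV-1a hash and linear probing are just convenient stand-ins. The
table lives entirely inside a caller-supplied, 8-byte-aligned buffer
and stores no pointers, so it could sit inside another chunk or a DSM
segment. It knows nothing about memory contexts or locks, and it does
not support deletion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round up to a multiple of 8 so entries stay aligned. */
#define FH_ALIGN(x) (((size_t) (x) + 7) & ~(size_t) 7)

/* Hypothetical header at the start of the caller's buffer. No pointers. */
typedef struct fixed_hash
{
    uint32_t nslots;    /* number of slots, fixed at init time */
    uint32_t nused;     /* slots currently occupied */
    uint32_t keysize;   /* key is the first keysize bytes of an entry */
    uint32_t stride;    /* aligned size of one entry */
    /* followed by nslots entries, then nslots one-byte "used" flags */
} fixed_hash;

/* Bytes the caller must provide for a table of this shape. */
static size_t
fixed_hash_size(uint32_t nslots, uint32_t entrysize)
{
    return sizeof(fixed_hash) + (size_t) nslots * FH_ALIGN(entrysize) + nslots;
}

/* Initialize a table in caller-supplied (8-byte-aligned) memory. */
static fixed_hash *
fixed_hash_init(void *mem, uint32_t nslots, uint32_t keysize,
                uint32_t entrysize)
{
    fixed_hash *h = mem;

    assert(nslots > 0 && keysize > 0 && keysize <= entrysize);
    memset(mem, 0, fixed_hash_size(nslots, entrysize));
    h->nslots = nslots;
    h->keysize = keysize;
    h->stride = (uint32_t) FH_ALIGN(entrysize);
    return h;
}

static char *
fh_entry(fixed_hash *h, uint32_t i)
{
    return (char *) (h + 1) + (size_t) i * h->stride;
}

static unsigned char *
fh_flags(fixed_hash *h)
{
    return (unsigned char *) (h + 1) + (size_t) h->nslots * h->stride;
}

/* FNV-1a over the key bytes; any reasonable hash would do here. */
static uint32_t
fh_hash(const void *key, uint32_t len)
{
    const unsigned char *p = key;
    uint32_t hash = 2166136261u;

    while (len-- > 0)
    {
        hash ^= *p++;
        hash *= 16777619u;
    }
    return hash;
}

/*
 * Linear probe for key. If absent and insert is set, claim the first free
 * slot and copy the key into it. Returns NULL if the key is absent and
 * either insert is not set or the table is full.
 */
static void *
fh_probe(fixed_hash *h, const void *key, int insert, int *found)
{
    uint32_t i = fh_hash(key, h->keysize) % h->nslots;

    *found = 0;
    for (uint32_t n = 0; n < h->nslots; n++, i = (i + 1) % h->nslots)
    {
        char *e = fh_entry(h, i);

        if (!fh_flags(h)[i])
        {
            if (!insert)
                return NULL;
            fh_flags(h)[i] = 1;
            h->nused++;
            memcpy(e, key, h->keysize);
            return e;
        }
        if (memcmp(e, key, h->keysize) == 0)
        {
            *found = 1;
            return e;
        }
    }
    return NULL;
}

static void *
fixed_hash_insert(fixed_hash *h, const void *key, int *found)
{
    return fh_probe(h, key, 1, found);
}

static void *
fixed_hash_lookup(fixed_hash *h, const void *key)
{
    int found;

    return fh_probe(h, key, 0, &found);
}
```

A caller would size a buffer with fixed_hash_size(), hand it to
fixed_hash_init(), and then use the returned entry pointer to fill in
whatever follows the key. Locking is left entirely to the caller, and
case (2) would need a variant that reallocates and rehashes when the
table fills up.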

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: allowing broader use of simplehash at 2019-12-10 21:59:54 from Andres Freund

Responses

Re: allowing broader use of simplehash at 2019-12-12 19:51:40 from Andres Freund
