From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: allowing broader use of simplehash |
Date: | 2019-12-12 19:51:40 |
Message-ID: | 20191212195140.xmfdweada7nxj6uq@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2019-12-11 10:50:16 -0500, Robert Haas wrote:
> On Tue, Dec 10, 2019 at 4:59 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > 3) For lots of one-off uses of hashtables that aren't performance
> > critical, we want a *simple* API. That IMO would mean that key/value
> > end up being separately allocated pointers, and that just a
> > comparator is provided when creating the hashtable.
>
> I think the simplicity of the API is a key point. Some things that are
> bothersome about dynahash:
>
> - It knows about memory contexts and insists on having its own.
Which is a waste, in a good number of cases.
> - You can't just use a hash table in shared memory; you have to
> "attach" to it first and have an object in backend-private memory.
I'm not quite sure there's all that good an alternative to this,
tbh. For efficiency it's useful to have backend-local state, I
think. And I don't really see how to have that without needing to attach.
> - The usual way of getting a shared hash table is ShmemInitHash(), but
> that means that the hash table has its own named chunk and that it's
> in the main shared memory segment. If you want to put it inside
> another chunk or put it in DSM or whatever, it doesn't work.
I don't think it's quite realistic for the same implementation - although
the code could partially be shared and just specialized for both cases -
to be used for DSM and "normal" shared memory. That is, however, not an
excuse to have drastically different interfaces for both.
> - It knows about LWLocks and if it's a shared table it needs its own
> tranche of them.
> - hash_search() is hard to wrap your head around.
>
> One thing I dislike about simplehash is that the #define-based
> interface is somewhat hard to use. It's not that it's a bad design.
I agree. It's the best I could come up with, taking the limitations of C into
account, when focusing on speed and type safety. I really think this
type of hack is a stopgap measure, and we ought to upgrade to a subset
of C++.
> It's just you have to sit down and think for a while to figure out
> which things you need to #define in order to get it to do what you
> want. I'm not sure that's something that can or needs to be fixed, but
> it's something to consider. Even dynahash, as annoying as it is, is in
> some ways easier to get up and running.
I have been wondering about providing one simplehash wrapper in a
central place that uses simplehash to store a {key*, value*}, and has a
creation interface that just accepts a comparator. Plus a few wrapper
creation functions for specific types (e.g. string, oid, int64). While
we'd not want to use that for really performance critical paths, for 80%
of the cases it'd be sufficient.
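
To make the wrapper idea above a bit more concrete, here is a rough, standalone sketch of what such an interface might look like. It does not use simplehash.h and allocates with malloc() rather than memory contexts. All names (ptrhash_create, ptrhash_insert, ptrhash_lookup, ptrhash_create_string) are hypothetical, and none of this is existing PostgreSQL API. The sketch also assumes the generic creation function takes a hash callback in addition to the comparator, with the per-type constructors hiding both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback types: hash a key, compare two keys for equality. */
typedef uint32_t (*ptrhash_hash_fn) (const void *key);
typedef bool (*ptrhash_equal_fn) (const void *a, const void *b);

typedef struct ptrhash_entry
{
    const void *key;            /* NULL marks an empty slot */
    void       *value;
} ptrhash_entry;

typedef struct ptrhash
{
    ptrhash_entry *slots;
    size_t      nslots;         /* always a power of two */
    size_t      nused;
    ptrhash_hash_fn hash;
    ptrhash_equal_fn equal;
} ptrhash;

/* Generic creation: the caller supplies only the two callbacks. */
ptrhash *
ptrhash_create(ptrhash_hash_fn hash, ptrhash_equal_fn equal)
{
    ptrhash    *h = malloc(sizeof(*h));

    assert(h != NULL);
    h->nslots = 16;
    h->nused = 0;
    h->slots = calloc(h->nslots, sizeof(ptrhash_entry));
    assert(h->slots != NULL);
    h->hash = hash;
    h->equal = equal;
    return h;
}

/* Linear probing: return the slot holding key, or the empty slot for it. */
ptrhash_entry *
ptrhash_slot(ptrhash *h, const void *key)
{
    size_t      i = h->hash(key) & (h->nslots - 1);

    while (h->slots[i].key != NULL && !h->equal(h->slots[i].key, key))
        i = (i + 1) & (h->nslots - 1);
    return &h->slots[i];
}

/* Insert key/value, or overwrite the value if the key is already present. */
void
ptrhash_insert(ptrhash *h, const void *key, void *value)
{
    ptrhash_entry *e;

    /* Double the table once it would become more than 3/4 full. */
    if ((h->nused + 1) * 4 > h->nslots * 3)
    {
        ptrhash_entry *old = h->slots;
        size_t      oldn = h->nslots;

        h->nslots *= 2;
        h->slots = calloc(h->nslots, sizeof(ptrhash_entry));
        assert(h->slots != NULL);
        for (size_t i = 0; i < oldn; i++)
            if (old[i].key != NULL)
                *ptrhash_slot(h, old[i].key) = old[i];
        free(old);
    }

    e = ptrhash_slot(h, key);
    if (e->key == NULL)
    {
        e->key = key;
        h->nused++;
    }
    e->value = value;
}

/* Return the stored value pointer, or NULL if the key is absent. */
void *
ptrhash_lookup(ptrhash *h, const void *key)
{
    ptrhash_entry *e = ptrhash_slot(h, key);

    return e->key != NULL ? e->value : NULL;
}

/* Per-type wrapper for NUL-terminated strings, using 32-bit FNV-1a. */
static uint32_t
str_hash(const void *key)
{
    uint32_t    hv = 2166136261u;

    for (const unsigned char *p = key; *p != '\0'; p++)
        hv = (hv ^ *p) * 16777619u;
    return hv;
}

static bool
str_equal(const void *a, const void *b)
{
    return strcmp(a, b) == 0;
}

ptrhash *
ptrhash_create_string(void)
{
    return ptrhash_create(str_hash, str_equal);
}
```

A caller would then just do ptrhash_create_string(), ptrhash_insert(h, "name", ptr) and ptrhash_lookup(h, "name"), without defining any macros. The cost is an indirect call per hash and comparison plus a pointer chase per entry, which is the tradeoff that keeps this out of really performance critical paths.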
Greetings,
Andres Freund