Re: Change GUC hashtable to use simplehash?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gurjeet Singh <gurjeet(at)singh(dot)im>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Change GUC hashtable to use simplehash?
Date: 2023-11-18 00:10:03
Message-ID: 20231118001003.n2d7tvdgeckun2y6@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-11-17 16:01:31 -0800, Jeff Davis wrote:
> On Fri, 2023-11-17 at 15:27 -0800, Andres Freund wrote:
> > At
> > first I thought that that's largely because you aren't using
> > SH_STORE_HASH.
>
> I might want to use that in the search_path cache, then. The lookup
> wasn't showing up much in the profile the last I checked, but I'll take
> a second look.

It also matters for insertions, fwiw.

> > Then I noticed that memory usage was too large when creating many
> > GUCs - a bit
> > of debugging later, I figured out that that's due to guc_name_hash()
> > being
> > terrifyingly bad. There's no bit mixing whatsoever!
>
> Wow.
>
> It seems like hash_combine() could be more widely used in other places,
> too?

I don't think hash_combine() alone helps that much - you need to actually use
a hash function for the values you are combining. Using a character value
alone as a 32bit hash value unsurprisingly leads to very distribution of bits
set in hashvalues.

> Here it seems like a worse problem because strings really need
> mixing, and maybe ExecHashGetHashValue doesn't. But it seems easier to
> use hash_combine() everywhere so that we don't have to think about
> strange cases.

Yea.

> > I think, independent of this patch, it might be worth requiring that
> > hash
> > table lookups applied the transformation before the lookup. A
> > comparison
> > function this expensive is not great...
>
> The requested name is already case-folded in most contexts. We can do a
> lookup first, and if that fails, case-fold and try again. I'll hack up
> a patch -- I believe that would be measurable for the proconfigs.

I'd just always case fold before lookups. The expensive bit of the case
folding imo is that you need to do awkward things during hash lookups.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-11-18 01:54:43 Re: long-standing data loss bug in initial sync of logical replication
Previous Message Andres Freund 2023-11-18 00:03:23 Re: On non-Windows, hard depend on uselocale(3)