Re: Faster "SET search_path"

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Faster "SET search_path"
Date: 2023-08-07 23:24:16
Message-ID: 1c422b8a7f893550144a7aa4f4989ba1209ccba4.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2023-08-07 at 15:39 -0700, Nathan Bossart wrote:
> 0001 and 0002 LGTM.

I'll probably go ahead with 0001 soon. Simple and effective.

But for 0002, I was thinking about trying to optimize so
check_search_path() only gets called once at the beginning, rather than
for each function invocation. It's a bit of a hack right now, but it
should be correct because it's just doing a syntactic validity check
and that doesn't depend on catalog contents. If I can find a
reasonably-nice way to do this, then there's no reason to optimize the
check itself, and we wouldn't have to maintain that separate parsing
code.

I might just avoid guc.c entirely and directly set
namespace_search_path and baseSearchPathValid=false. The main thing we
lose there is the GUC stack (AtEOXact_GUC()), but there's already a
PG_TRY/PG_FINALLY block in fmgr_security_definer, so we can use that to
change it back safely. (I think? I need to convince myself that it's
safe to skip the work in guc.c, at least for the special case of
search_path, and that it won't be too fragile.)

I started down this road and it gets even closer in performance to the
plain function call. I don't know if we'll ever call it zero cost, but
safety has a cost sometimes.

>   0003 is looking pretty good, too, but I think we
> should get some more eyes on it, given the complexity.

I'm happy with 0003 and I'm fairly sure we want it, but I agree that
another set of eyes would be nice so I'll give it a bit more time. I
could also improve the testing; any suggestions here are welcome.

> I wonder if it'd be faster to downcase first and then pass it to
> hash_bytes().  This would require modifying the key, which might be a
> deal-breaker, but maybe there's some way to improve find_option() for
> all
> GUCs.

I thought about that and I don't think modifying the argument is a good
API.

For the match function, I added a fast path for strcmp()==0 with a
fallback to case folding (on the assumption that the case is consistent
in most cases, especially built-in ones). For the hash function, I used
a stack buffer and downcased the first K bytes into that, then did
hash_bytes(buff, K). Those improved things a bit but it wasn't an
exciting amount so I didn't post it. I made a patch to port guc_hashtab
to simplehash because it seems like a better fit than hsearch.h; but
again, I didn't see exciting numbers from doing so, so I didn't post
it.

>
Maybe we could use perfect hashing for the built-in GUCs, and fall back
to simplehash for the custom GUCs?

Regardless, I'm mostly interested in search_path because I want to get
to the point where we can recommend that users specify it for every
function that depends on it. A special case to facilitate that without
compromising (much) on performance seems reasonable to me.

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-08-07 23:25:49 Re: Fix badly generated entries in wait_event_names.txt
Previous Message Andres Freund 2023-08-07 23:08:21 Re: A failure in 031_recovery_conflict.pl on Debian/s390x