Quick Links

Re: Faster "SET search_path"

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc:	Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Faster "SET search_path"
Date:	2023-08-07 23:24:16
Message-ID:	1c422b8a7f893550144a7aa4f4989ba1209ccba4.camel@j-davis.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, 2023-08-07 at 15:39 -0700, Nathan Bossart wrote:
> 0001 and 0002 LGTM.

I'll probably go ahead with 0001 soon. Simple and effective.

But for 0002, I was thinking about trying to optimize so
check_search_path() only gets called once at the beginning, rather than
for each function invocation. It's a bit of a hack right now, but it
should be correct because it's just doing a syntactic validity check
and that doesn't depend on catalog contents. If I can find a
reasonably-nice way to do this, then there's no reason to optimize the
check itself, and we wouldn't have to maintain that separate parsing
code.

I might just avoid guc.c entirely and directly set
namespace_search_path and baseSearchPathValid=false. The main thing we
lose there is the GUC stack (AtEOXact_GUC()), but there's already a
PG_TRY/PG_FINALLY block in fmgr_security_definer, so we can use that to
change it back safely. (I think? I need to convince myself that it's
safe to skip the work in guc.c, at least for the special case of
search_path, and that it won't be too fragile.)

I started down this road and it gets even closer in performance to the
plain function call. I don't know if we'll ever call it zero cost, but
safety has a cost sometimes.

> 0003 is looking pretty good, too, but I think we
> should get some more eyes on it, given the complexity.

I'm happy with 0003 and I'm fairly sure we want it, but I agree that
another set of eyes would be nice so I'll give it a bit more time. I
could also improve the testing; any suggestions here are welcome.

> I wonder if it'd be faster to downcase first and then pass it to
> hash_bytes(). This would require modifying the key, which might be a
> deal-breaker, but maybe there's some way to improve find_option() for
> all
> GUCs.

I thought about that and I don't think modifying the argument is a good
API.

For the match function, I added a fast path for strcmp()==0 with a
fallback to case folding (on the assumption that the case is consistent
in most cases, especially built-in ones). For the hash function, I used
a stack buffer and downcased the first K bytes into that, then did
hash_bytes(buff, K). Those improved things a bit but it wasn't an
exciting amount so I didn't post it. I made a patch to port guc_hashtab
to simplehash because it seems like a better fit than hsearch.h; but
again, I didn't see exciting numbers from doing so, so I didn't post
it.

>
Maybe we could use perfect hashing for the built-in GUCs, and fall back
to simplehash for the custom GUCs?

Regardless, I'm mostly interested in search_path because I want to get
to the point where we can recommend that users specify it for every
function that depends on it. A special case to facilitate that without
compromising (much) on performance seems reasonable to me.

Regards,
Jeff Davis

In response to

Re: Faster "SET search_path" at 2023-08-07 22:39:08 from Nathan Bossart

Responses

Re: Faster "SET search_path" at 2023-08-15 17:04:45 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2023-08-07 23:25:49	Re: Fix badly generated entries in wait_event_names.txt
Previous Message	Andres Freund	2023-08-07 23:08:21	Re: A failure in 031_recovery_conflict.pl on Debian/s390x