From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Faster "SET search_path" |
Date: | 2023-08-07 23:24:16 |
Message-ID: | 1c422b8a7f893550144a7aa4f4989ba1209ccba4.camel@j-davis.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, 2023-08-07 at 15:39 -0700, Nathan Bossart wrote:
> 0001 and 0002 LGTM.
I'll probably go ahead with 0001 soon. Simple and effective.
But for 0002, I was thinking about trying to optimize so
check_search_path() only gets called once at the beginning, rather than
for each function invocation. It's a bit of a hack right now, but it
should be correct because it's just doing a syntactic validity check
and that doesn't depend on catalog contents. If I can find a
reasonably-nice way to do this, then there's no reason to optimize the
check itself, and we wouldn't have to maintain that separate parsing
code.
I might just avoid guc.c entirely and directly set
namespace_search_path and baseSearchPathValid=false. The main thing we
lose there is the GUC stack (AtEOXact_GUC()), but there's already a
PG_TRY/PG_FINALLY block in fmgr_security_definer, so we can use that to
change it back safely. (I think? I need to convince myself that it's
safe to skip the work in guc.c, at least for the special case of
search_path, and that it won't be too fragile.)
I started down this road and it gets even closer in performance to the
plain function call. I don't know if we'll ever call it zero cost, but
safety has a cost sometimes.
> 0003 is looking pretty good, too, but I think we
> should get some more eyes on it, given the complexity.
I'm happy with 0003 and I'm fairly sure we want it, but I agree that
another set of eyes would be nice so I'll give it a bit more time. I
could also improve the testing; any suggestions here are welcome.
> I wonder if it'd be faster to downcase first and then pass it to
> hash_bytes(). This would require modifying the key, which might be a
> deal-breaker, but maybe there's some way to improve find_option() for
> all
> GUCs.
I thought about that and I don't think modifying the argument is a good
API.
For the match function, I added a fast path for strcmp()==0 with a
fallback to case folding (on the assumption that the case is consistent
in most cases, especially built-in ones). For the hash function, I used
a stack buffer and downcased the first K bytes into that, then did
hash_bytes(buff, K). Those improved things a bit but it wasn't an
exciting amount so I didn't post it. I made a patch to port guc_hashtab
to simplehash because it seems like a better fit than hsearch.h; but
again, I didn't see exciting numbers from doing so, so I didn't post
it.
>
Maybe we could use perfect hashing for the built-in GUCs, and fall back
to simplehash for the custom GUCs?
Regardless, I'm mostly interested in search_path because I want to get
to the point where we can recommend that users specify it for every
function that depends on it. A special case to facilitate that without
compromising (much) on performance seems reasonable to me.
Regards,
Jeff Davis
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2023-08-07 23:25:49 | Re: Fix badly generated entries in wait_event_names.txt |
Previous Message | Andres Freund | 2023-08-07 23:08:21 | Re: A failure in 031_recovery_conflict.pl on Debian/s390x |