From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Speed up collation cache |
Date: | 2024-06-14 23:46:39 |
Message-ID: | 7bb9f018d20a7b30b9a7f6231efab1b5e50c7720.camel@j-davis.com |
Lists: | pgsql-hackers |
The blog post here (thank you depesz!):
showed an interesting result where the builtin provider is not quite as
fast as "C" for queries like:
SELECT * FROM a WHERE t = '...';
The reason is that it's calling varstr_cmp() many times, which does a
lookup in the collation cache for each call. For sorts, it only does a
lookup in the collation cache once, so the effect is not significant.
Looking up "C" is faster because there's a special check
for C_COLLATION_OID, so it doesn't even need to do the hash lookup. If
you create an equivalent collation like:
CREATE COLLATION libc_c(PROVIDER = libc, LOCALE = 'C');
it will perform the same as a collation with the builtin provider.
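To illustrate the fast path described above, here is a minimal standalone sketch, not the actual PostgreSQL code. The `CacheEntry` struct, `cache_lookup()`, `collation_is_c()`, the `LIBC_C_OID` and `OTHER_OID` values, and the lookup counter are all hypothetical stand-ins; the numeric value given for C_COLLATION_OID is illustrative only, and a linear scan stands in for the hash lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;             /* stand-in for PostgreSQL's Oid */

#define C_COLLATION_OID 950       /* illustrative value for this sketch */
#define LIBC_C_OID      16384     /* hypothetical OID of a user-created libc_c */
#define OTHER_OID       16385     /* hypothetical non-C collation */

typedef struct
{
    Oid  collid;
    bool collate_is_c;            /* hypothetical cached property */
} CacheEntry;

static const CacheEntry cache[] = {
    {C_COLLATION_OID, true},
    {LIBC_C_OID, true},
    {OTHER_OID, false},
};

static long cache_lookups = 0;    /* counts trips through the slow path */

/* Stand-in for the collation cache lookup (a hash lookup in the real code). */
static const CacheEntry *
cache_lookup(Oid collid)
{
    cache_lookups++;
    for (size_t i = 0; i < sizeof(cache) / sizeof(cache[0]); i++)
    {
        if (cache[i].collid == collid)
            return &cache[i];
    }
    return NULL;
}

/* Special check first; only other collations pay for the cache lookup. */
static bool
collation_is_c(Oid collid)
{
    if (collid == C_COLLATION_OID)
        return true;              /* fast path: no lookup at all */

    const CacheEntry *entry = cache_lookup(collid);

    return entry != NULL && entry->collate_is_c;
}
```

In this sketch, calling collation_is_c(C_COLLATION_OID) never touches the cache, while collation_is_c(LIBC_C_OID) gives the same answer but pays for one lookup per call, which is the per-comparison cost that shows up in the equality query.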
Attached is a patch to use simplehash.h instead, which speeds things up
enough to make them fairly close (from around 15% slower than "C" to around 8% slower).
The patch is based on the series here:
https://postgr.es/m/f1935bc481438c9d86c2e0ac537b1c110d41a00a.camel@j-davis.com
which does some refactoring in a related area, but I can make them
independent.
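For readers unfamiliar with the approach: simplehash.h generates a hash table specialized to one element type, using open addressing. The following standalone sketch shows only the general open-addressing idea with linear probing, keyed by collation OID. It is not the patch and does not use simplehash.h; `Slot`, `table_insert()`, `table_lookup()`, the fixed table size, and the `collate_is_c` payload are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;             /* stand-in for PostgreSQL's Oid */

#define TABLE_SIZE 64             /* power of two; fixed, no resizing here */

typedef struct
{
    Oid  collid;                  /* key */
    bool in_use;                  /* is this slot occupied? */
    bool collate_is_c;            /* hypothetical cached payload */
} Slot;

static Slot table[TABLE_SIZE];    /* zero-initialized: all slots free */

/* Cheap 32-bit integer mixer (the MurmurHash3 finalizer). */
static uint32_t
hash_oid(Oid k)
{
    k ^= k >> 16;
    k *= 0x85ebca6bU;
    k ^= k >> 13;
    k *= 0xc2b2ae35U;
    k ^= k >> 16;
    return k;
}

/* Find the entry for collid, or claim a free slot; NULL if the table is full. */
static Slot *
table_insert(Oid collid, bool *found)
{
    uint32_t i = hash_oid(collid) & (TABLE_SIZE - 1);

    for (int probes = 0; probes < TABLE_SIZE; probes++)
    {
        Slot *slot = &table[i];

        if (!slot->in_use)
        {
            slot->in_use = true;
            slot->collid = collid;
            *found = false;
            return slot;
        }
        if (slot->collid == collid)
        {
            *found = true;
            return slot;
        }
        i = (i + 1) & (TABLE_SIZE - 1);   /* linear probe to the next slot */
    }
    return NULL;
}

/* Return the entry for collid, or NULL if it is not in the table. */
static Slot *
table_lookup(Oid collid)
{
    uint32_t i = hash_oid(collid) & (TABLE_SIZE - 1);

    for (int probes = 0; probes < TABLE_SIZE; probes++)
    {
        Slot *slot = &table[i];

        if (!slot->in_use)
            return NULL;          /* hit a free slot: key is absent */
        if (slot->collid == collid)
            return slot;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return NULL;
}
```

A lookup here is a hash computation, a mask, and usually one or two comparisons on a flat array, with no function pointers involved. That is the kind of per-call overhead that matters when the lookup happens once per comparison.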
We can also consider what to do about those special cases:
* add a special case for PG_C_UTF8?
* instead of a hardwired set of special collation IDs, have a single-
element "last collation ID" to check before doing the hash lookup?
* remove the special cases entirely if we can close the performance
gap enough that it's not important?
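The "last collation ID" idea from the list above could look roughly like the following standalone sketch. Everything in it is hypothetical: `CacheEntry`, `slow_lookup()` (a linear scan standing in for the hash lookup), `lookup_with_last()`, and the OID values are illustrative names, not existing PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;             /* stand-in for PostgreSQL's Oid */

typedef struct
{
    Oid  collid;
    bool collate_is_c;            /* hypothetical cached property */
} CacheEntry;

static const CacheEntry cache[] = {
    {950, true},                  /* illustrative OIDs only */
    {16384, true},
    {16385, false},
};

static long slow_lookups = 0;     /* counts trips through the full lookup */

/* Stand-in for the hash lookup in the collation cache. */
static const CacheEntry *
slow_lookup(Oid collid)
{
    slow_lookups++;
    for (size_t i = 0; i < sizeof(cache) / sizeof(cache[0]); i++)
    {
        if (cache[i].collid == collid)
            return &cache[i];
    }
    return NULL;
}

/* Single-element "last collation ID" cache in front of the full lookup. */
static const CacheEntry *last_entry = NULL;

static const CacheEntry *
lookup_with_last(Oid collid)
{
    /* Same collation as the previous call: skip the hash lookup. */
    if (last_entry != NULL && last_entry->collid == collid)
        return last_entry;

    const CacheEntry *entry = slow_lookup(collid);

    /* Remember only successful lookups. */
    if (entry != NULL)
        last_entry = entry;
    return entry;
}
```

Unlike a hardwired set of special collation IDs, this treats every collation the same: a query that compares repeatedly under one collation pays for the full lookup once, and only queries that alternate between collations fall back to the hash lookup each time.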
(Note: the special case in lc_ctype_is_c() is currently required for
correctness because hba.c uses C_COLLATION_OID for regexes before the
syscache is initialized. That can be fixed pretty easily in a couple of
different ways, though.)
--
Jeff Davis
PostgreSQL Contributor Team - AWS
Attachment | Content-Type | Size |
---|---|---|
v2-0007-Change-collation-cache-to-use-simplehash.h.patch | text/x-patch | 2.7 KB |