From: | Aleksander Alekseev <aleksander(at)timescale(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | [PATCH] Refactor bytea_sortsupport() |
Date: | 2024-10-09 14:39:22 |
Message-ID: | CAJ7c6TO3X88dGd8C4Tb-Eq2ZDPz+9mP+KOwdzK_82BEz_cMPZg@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
Tom (cc:'ed) recently pointed out [1] that adt/varlena.c uses common
logic for sorting string types and bytea. There are several problems
with this.
Firstly, this is difficult to reason about. For instance, you have to
keep in mind that when the "C" locale is used you might be sorting not
strings but bytea values, which may legally contain internal NUL
bytes.
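To illustrate the NUL-byte point, here is a minimal standalone sketch.
It is not code from the attached patch, and bytes_cmp(), cstring_cmp()
and demo() are hypothetical names used only for this example. It
contrasts a length-aware memcmp() comparison, which is what arbitrary
binary data needs, with a NUL-terminated string comparison that stops
at the first zero byte:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: three-way comparison of two byte strings with
 * explicit lengths. Internal NUL bytes are ordinary data here.
 */
int
bytes_cmp(const unsigned char *a, size_t alen,
          const unsigned char *b, size_t blen)
{
    size_t      minlen = (alen < blen) ? alen : blen;
    int         r = (minlen > 0) ? memcmp(a, b, minlen) : 0;

    if (r != 0)
        return (r < 0) ? -1 : 1;
    /* Common prefix is equal: the shorter value sorts first. */
    if (alen != blen)
        return (alen < blen) ? -1 : 1;
    return 0;
}

/*
 * Hypothetical helper: what happens if the same data is treated as a
 * NUL-terminated C string. Comparison stops at the first zero byte.
 */
int
cstring_cmp(const unsigned char *a, const unsigned char *b)
{
    int         r = strcmp((const char *) a, (const char *) b);

    return (r > 0) - (r < 0);
}

/* Small self-check showing the difference between the two. */
void
demo(void)
{
    /* Three payload bytes each, plus a trailing 0 so strcmp() is safe. */
    const unsigned char x[] = {'a', 0, 'b', 0};
    const unsigned char y[] = {'a', 0, 'c', 0};

    assert(bytes_cmp(x, 3, y, 3) < 0);  /* differs at the third byte */
    assert(cstring_cmp(x, y) == 0);     /* both look like "a" */
}
```

Any code path shared between string types and bytea has to avoid the
second kind of comparison for bytea, which is one reason a separate
bytea path is easier to reason about.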
Secondly, this is error-prone. Changing logic for string types may
affect bytea logic and vice versa.
Lastly, performance and memory consumption could be optimized for
the bytea case. The win is arguably small, if noticeable at all, but
still.
Attached is a PoC patch that fixes this. There are some TODOs and
FIXMEs but all in all it works and passes the tests.
The code becomes longer, but the new code is simple and easier to
understand. If we agree on this refactoring, we could split
adt/varlena.c into varlena.c and bytea.c, which is also something
Tom proposed. IMO it would be a good move, but it is not implemented
in the patch.
Thoughts?
[1]: https://postgr.es/m/1502394.1725398354%40sss.pgh.pa.us
--
Best regards,
Aleksander Alekseev
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Refactor-bytea_sortsupport.patch | application/octet-stream | 14.5 KB |