[PATCH] Refactor bytea_sortsupport()

From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: [PATCH] Refactor bytea_sortsupport()
Date: 2024-10-09 14:39:22
Message-ID: CAJ7c6TO3X88dGd8C4Tb-Eq2ZDPz+9mP+KOwdzK_82BEz_cMPZg@mail.gmail.com
Lists: pgsql-hackers

Hi,

Tom (cc:'ed) recently pointed out [1] that adt/varlena.c uses common
logic for sorting string types and bytea. There are several problems
with this.

Firstly, this is difficult to reason about. For instance, you have to
keep in mind that when the "C" locale is used you might be sorting not
strings but bytea values, which may legally contain internal NUL
bytes.
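
To illustrate the NUL-byte point, here is a minimal standalone sketch
of a length-aware comparison for binary data. It is not code from the
patch or from varlena.c; bytes_cmp() is a hypothetical helper, and it
assumes plain memcmp() ordering with the shorter value sorting first
when one input is a prefix of the other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): compare two length-counted
 * byte buffers. Returns -1, 0 or 1. Internal NUL bytes are treated as
 * ordinary data because the lengths are passed explicitly.
 */
static int
bytes_cmp(const unsigned char *a, size_t alen,
          const unsigned char *b, size_t blen)
{
    size_t      minlen = (alen < blen) ? alen : blen;
    int         r;

    assert(a != NULL && b != NULL);

    /* Compare the common prefix byte by byte, as unsigned chars. */
    r = memcmp(a, b, minlen);
    if (r != 0)
        return (r < 0) ? -1 : 1;

    /* Prefixes are equal: the shorter buffer sorts first. */
    if (alen != blen)
        return (alen < blen) ? -1 : 1;

    return 0;
}
```

For the 3-byte inputs {'a', 0, 'b'} and {'a', 0, 'c'}, strcmp() stops
at the NUL and reports equality, while bytes_cmp() orders the first
before the second. Any NUL-terminated shortcut on the string path has
to be ruled out for bytea in the same way.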

Secondly, this is error-prone. Changing logic for string types may
affect bytea logic and vice versa.

Lastly, performance and memory consumption could be optimized for
the bytea case. The win is arguably small, if noticeable at all, but
still.

Attached is a PoC patch that fixes this. There are some TODOs and
FIXMEs but all in all it works and passes the tests.

The code becomes longer, but the new code is simpler and easier to
understand. If we agree on this refactoring, we could split
adt/varlena.c into varlena.c and bytea.c, which Tom also proposed.
IMO it would be a good move, but it is not implemented in the
patch.

Thoughts?

[1]: https://postgr.es/m/1502394.1725398354%40sss.pgh.pa.us

--
Best regards,
Aleksander Alekseev

Attachment Content-Type Size
v1-0001-Refactor-bytea_sortsupport.patch application/octet-stream 14.5 KB
