Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: adam(at)labkey(dot)com, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails
Date: 2024-11-20 15:35:33
Message-ID: Zz4BxVqAZnWHFXFB@nathan
Lists: pgsql-bugs

On Tue, Nov 19, 2024 at 11:23:13PM -0500, Tom Lane wrote:
> Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
>> I'm admittedly not an expert in the multi-byte code, but since there are
>> encodings like LATIN1 that use a byte per character, don't we need to do
>> multiple lookups any time the NAMEDATALEN-1'th byte is non-ASCII?
>
> I don't think so, but maybe I'm missing something. An important
> property of backend-legal encodings is that all bytes of a multibyte
> character have their high bits set. Thus if the NAMEDATALEN-2'th
> byte does not have that, it is not part of a multibyte character.
> That's also the reason we can stop if we reach a high-bit-clear
> byte while backing up to earlier bytes.

That's good to know. If we can assume that 1) all bytes of a multibyte
character have the high bit set and 2) all multibyte characters actually
require multiple bytes, then there are just a handful of cases that require
multiple lookups, and we can restrict even those to some extent, too.
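
To make the reasoning above concrete, here is a minimal sketch of how the
candidate truncation points could be enumerated under those two assumptions.
It is not taken from any patch in this thread: candidate_lengths(),
high_bit_set(), and the max_char_len parameter are hypothetical names used
only for illustration. The idea is that blind truncation at maxlen bytes is
always tried, and shorter lengths are added only while the byte just before
the cut has its high bit set.

```c
#include <stddef.h>

/* Hypothetical helper: true if the byte's high bit is set (non-ASCII). */
static int
high_bit_set(char ch)
{
    return ((unsigned char) ch & 0x80) != 0;
}

/*
 * Hypothetical sketch: fill out[] with the byte lengths worth looking up
 * for a name of len bytes when at most maxlen bytes can be stored.
 *
 * Assumptions (from the discussion, not verified here):
 *   1) every byte of a multibyte character has its high bit set;
 *   2) no character is longer than max_char_len bytes.
 *
 * out[] must have room for max_char_len entries.  Returns the number of
 * candidates written, in the order they should be tried.
 */
size_t
candidate_lengths(const char *name, size_t len, size_t maxlen,
                  size_t max_char_len, size_t *out)
{
    size_t n = 0;
    size_t cut;

    /* Short names need no truncation: one lookup with the full length. */
    if (len <= maxlen)
    {
        out[n++] = len;
        return n;
    }

    /* Blind truncation at the byte limit is always a candidate. */
    out[n++] = maxlen;

    /*
     * Back up while the byte just before the cut is non-ASCII, since the
     * cut may have split a multibyte character.  A high-bit-clear byte
     * cannot belong to a multibyte character, so we stop there.
     */
    cut = maxlen;
    while (cut > 0 && n < max_char_len && high_bit_set(name[cut - 1]))
    {
        cut--;
        out[n++] = cut;
    }

    return n;
}
```

With maxlen = NAMEDATALEN - 1 = 63, a caller would presumably try each
returned length in order and stop at the first lookup that succeeds. A name
whose byte 63 is plain ASCII yields exactly one candidate, so the extra
lookups are confined to names with non-ASCII bytes at the cut.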

--
nathan
