Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: adam(at)labkey(dot)com, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails
Date: 2024-11-20 16:50:57
Message-ID: Zz4TcYSNWW1_Vj--@nathan
Lists: pgsql-bugs

On Wed, Nov 20, 2024 at 11:29:56AM -0500, Tom Lane wrote:
> Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
>> Upthread, you mentioned that we could bypass multiple lookups unless both
>> the NAMEDATALEN-1'th and NAMEDATALEN-2'th bytes are non-ASCII. But if
>> there are encodings with the high bit set that don't require multiple bytes
>> per character, then how can we do that?
>
> Well, we don't know the length of the hypothetically-truncated
> character, but if there was one then all its bytes must have had their
> high bits set. Suppose that the untruncated name has a 4-byte
> multibyte character extending from the NAMEDATALEN-3 byte through the
> NAMEDATALEN'th byte (counting in origin zero here):
>
> [...]
>
> Now as for the shortcut cases: if C3 does not have the high bit set,
> it cannot be part of a multibyte character. Therefore the original
> encoding-aware truncation would have removed C3 and following bytes,
> but no more. The character immediately before might have been one
> byte or several, but it doesn't matter. Similarly, if C2 does not
> have the high bit set, it cannot be part of a multibyte character.
> The original truncation would have removed C3 and following bytes,
> but no more.

Oh, I think I had an off-by-one error in my mental model and was thinking
of the NAMEDATALEN-1'th byte as the last possible byte in the identifier
(i.e., name[NAMEDATALEN - 2]), whereas you meant the location where the
trailing zero would go for the largest possible all-ASCII identifier (i.e.,
name[NAMEDATALEN - 1]). Thank you for elaborating.
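
For concreteness, here is a rough sketch of the shortcut test as I now read it, using origin-zero indexes. This is only an illustration, not the actual patch: single_lookup_suffices() and high_bit_set() are made-up names, and NAMEDATALEN is hard-coded to the default of 64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the default build-time value; real code gets it from the headers. */
#define NAMEDATALEN 64

/* Hypothetical helper: true if the byte has its high bit set (non-ASCII). */
static bool
high_bit_set(unsigned char c)
{
    return (c & 0x80) != 0;
}

/*
 * Hypothetical helper: returns true if truncating "name" to NAMEDATALEN - 1
 * bytes is certain to match what encoding-aware truncation would have
 * produced, so one lookup is enough.  Returns false if both
 * name[NAMEDATALEN - 2] and name[NAMEDATALEN - 1] have the high bit set,
 * in which case a multibyte character might straddle the cutoff and
 * shorter truncations would also need to be considered.
 */
static bool
single_lookup_suffices(const char *name)
{
    const unsigned char *p = (const unsigned char *) name;

    assert(name != NULL);

    /* Names that already fit are never truncated. */
    if (strlen(name) < NAMEDATALEN)
        return true;

    /*
     * p[NAMEDATALEN - 1] is where the trailing zero goes for the longest
     * possible all-ASCII identifier; p[NAMEDATALEN - 2] is the last byte
     * that identifier can hold.
     */
    return !(high_bit_set(p[NAMEDATALEN - 2]) &&
             high_bit_set(p[NAMEDATALEN - 1]));
}
```

If either of those two bytes is plain ASCII, it cannot be part of a multibyte character, so the cutoff falls on a character boundary and the check returns true.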

--
nathan
