Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc:	Nathan Bossart <nathandbossart(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, adam(at)labkey(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject:	Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails
Date:	2024-11-21 14:47:56
Message-ID:	Zz9IHPBf-z8MsLdw@momjian.us
Lists:	pgsql-bugs

On Thu, Nov 21, 2024 at 02:35:50PM +0000, Bertrand Drouvot wrote:
> On Thu, Nov 21, 2024 at 09:21:16AM -0500, Bruce Momjian wrote:
> > I don't understand this logic. Why are two bytes important? If we knew
> > it was UTF8 we could check for non-first bytes always starting with
> > bits 10, but we can't know that.
>
> I think this is because this is a reliable way to detect if the truncation happened
> in the middle of a character, without needing to know the specifics of the encoding.
>
> My understanding is that the key insight is that in any multibyte encoding, all
> bytes within a multibyte character will have their high bits set.
>
> That's just my understanding from the code and Tom's previous explanations: I
> might be wrong as not an expert in this area.

But the logic doesn't make sense. Why would two bytes be any different
than one? I assumed you would just remove all trailing high-bit bytes
and stop at the first non-high-bit byte. Also, do we really expect
there to be trailing multi-byte characters with some ASCII before
them? Isn't it likely it will be all ASCII or all multi-byte characters?
I guess for Latin1, it would work fine, but I assume for Asian
languages, it will be almost all multi-byte characters. I guess digits
would be ASCII. This all just seems very unfocused.
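
To make the "remove all trailing high-bit bytes" idea above concrete, here is a minimal Python sketch. It is an illustration only, not the PostgreSQL code under discussion: the function names are hypothetical, and the 63-byte limit is taken from the bug title (assumed here to be NAMEDATALEN - 1 with the default build setting).

```python
# Assumption: default PostgreSQL build, where names hold at most 63 bytes.
NAMEDATALEN = 64


def is_high_bit(byte: int) -> bool:
    """True for any non-ASCII byte (top bit set), in any encoding."""
    return byte & 0x80 != 0


def is_utf8_continuation(byte: int) -> bool:
    """True only for UTF-8 non-first bytes, which start with bits 10.

    This test is valid only when the encoding is known to be UTF-8.
    """
    return byte & 0xC0 == 0x80


def truncate_strip_high_bit(name: bytes, limit: int = NAMEDATALEN - 1) -> bytes:
    """Hypothetical helper: cut at `limit` bytes, then drop every trailing
    high-bit byte, stopping at the first non-high-bit (ASCII) byte."""
    if len(name) <= limit:
        return name
    end = limit
    while end > 0 and is_high_bit(name[end - 1]):
        end -= 1
    return name[:end]


if __name__ == "__main__":
    # Mostly ASCII: the cut lands inside the 2-byte "é", so one byte is dropped.
    mostly_ascii = ("a" * 62 + "é").encode("utf-8")
    print(len(truncate_strip_high_bit(mostly_ascii)))

    # All multi-byte: every byte has its high bit set, so nothing survives.
    all_multibyte = ("日本語" * 30).encode("utf-8")
    print(len(truncate_strip_high_bit(all_multibyte)))
```

With a mostly-ASCII name the helper only drops the partial character at the cut. With a name made entirely of multi-byte characters it strips everything, which is one reason the all-multi-byte case matters when weighing this approach.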

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

When a patient asks the doctor, "Am I going to die?", he means
"Am I going to die soon?"

In response to

Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails at 2024-11-21 14:35:50 from Bertrand Drouvot

Responses

Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails at 2024-11-21 15:14:23 from Nathan Bossart
Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails at 2024-11-22 00:11:09 from Tom Lane
