Re: Pre-proposal: unicode normalized text

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Nico Williams <nico(at)cryptonector(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Chapman Flack <chap(at)anastigmatix(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Pre-proposal: unicode normalized text
Date: 2023-10-06 17:42:09
Message-ID: 8bdd84f0666a5ad588776375b199fbe04387a905.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 2023-10-05 at 14:52 -0500, Nico Williams wrote:
> This is just how you encode the type of the string.  You have any
> number
> of options.  The point is that already PG can encode binary data, so
> if
> how to encode text of disparate encodings on the wire, building on
> top
> of the encoding of bytea is an option.

There's another significant discussion going on here:

https://www.postgresql.org/message-id/CA+TgmoZ8r8xb_73WzKHGb00cV3tpHV_U0RHuzzMFKvLepdu2Jw@mail.gmail.com

about how to handle binary formats better, so it's not clear to me that
it's a great precedent to expand upon. At least not yet.

I think it would be interesting to think more generally about these
representational issues in a way that accounds for binary formats,
extra_float_digits, client_encoding, etc. But I see that as more of an
issue with how the client expects to receive the data -- nobody has a
presented a reason in this thread that we need per-column encodings on
the server.

Regards,
Jeff Davis

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2023-10-06 17:44:55 Re: New WAL record to detect the checkpoint redo location
Previous Message Nico Williams 2023-10-06 17:38:45 Re: Pre-proposal: unicode normalized text