From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Sutou Kouhei <kou(at)clear-code(dot)com>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp> |
Subject: | Re: confusing / inefficient "need_transcoding" handling in copy |
Date: | 2024-02-09 00:36:28 |
Message-ID: | ZcVzjGWFobGpNrxs@paquier.xyz |
Lists: | pgsql-hackers |
On Thu, Feb 08, 2024 at 10:25:07AM +0200, Heikki Linnakangas wrote:
> There's no validation, just conversion. I'd suggest:
>
> "Set up encoding conversion info if the file and server encodings differ
> (see also pg_server_to_any)."
>
> Other than that, +1

Cool.  I've used your wording and applied that on HEAD.

> BTW, I can see an optimization opportunity even if the encodings differ:
> Currently, CopyAttributeOutText() calls pg_server_to_any(), and then grovels
> through the string to find any characters that need to be quoted. You could
> do it the other way round and handle quoting before the conversion. That has
> two benefits:
>
> 1. You don't need the strlen() call, because you just scanned through the
> string so you already know its length.
> 2. You don't need to worry about 'encoding_embeds_ascii' when you operate on
> the server encoding.

That sounds right.  Still, it looks like there would be cases where
you'd need the strlen() call if !encoding_embeds_ascii.
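
To make the quote-before-convert idea concrete, here is a rough,
standalone sketch.  It is not the actual CopyAttributeOutText() code:
escape_copy_text() is a hypothetical helper, and the caller-provided
output buffer is an assumption made for illustration.  The point is only
that a single pass over the server-encoded string both escapes it and
yields the length that could then be handed to pg_server_to_any(),
without a separate strlen().

```c
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL code): escape a NUL-terminated,
 * server-encoded string for COPY text format in one pass.
 *
 * Server encodings never embed ASCII bytes inside multibyte characters,
 * so a plain byte-wise scan is safe here and no equivalent of the
 * 'encoding_embeds_ascii' check is needed.
 *
 * dst must have room for 2 * strlen(src) + 1 bytes.  Returns the number
 * of bytes written, not counting the terminating NUL, so the caller
 * already knows the length to pass to the conversion step.
 */
static size_t
escape_copy_text(const char *src, char delim, char *dst)
{
    char       *out = dst;
    const char *p;

    for (p = src; *p != '\0'; p++)
    {
        char        c = *p;

        switch (c)
        {
            case '\b': *out++ = '\\'; *out++ = 'b'; break;
            case '\f': *out++ = '\\'; *out++ = 'f'; break;
            case '\n': *out++ = '\\'; *out++ = 'n'; break;
            case '\r': *out++ = '\\'; *out++ = 'r'; break;
            case '\t': *out++ = '\\'; *out++ = 't'; break;
            case '\v': *out++ = '\\'; *out++ = 'v'; break;
            default:
                /* A backslash or the delimiter gets a backslash prefix. */
                if (c == '\\' || c == delim)
                    *out++ = '\\';
                *out++ = c;
                break;
        }
    }
    *out = '\0';

    return (size_t) (out - dst);
}
```

The returned length would be used in place of strlen() on the escaped
string when the encodings differ; when they are the same, the buffer can
be sent as-is.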
--
Michael