From: | "Daniel Verite" <daniel(at)manitou-mail(dot)org> |
---|---|
To: | "Michael Paquier" <michael(at)paquier(dot)xyz> |
Cc: | "Fabien COELHO" <coelho(at)cri(dot)ensmp(dot)fr>,"PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: csv format for psql |
Date: | 2018-11-09 16:28:07 |
Message-ID: | ea4145e4-8c7c-4541-af24-500f1e47de88@manitou-mail.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Michael Paquier wrote:
> Still what's the point except complicating the code? We don't care
> about anything fancy in the backend-side ProcessCopyOptions() when
> checking cstate->delim, and having some consistency looks like a good
> thing to me.
The backend has its reasons that don't apply to the psql output
format, mostly import performance according to [1]
It's not that nobody wants delimiter outside of US-ASCII,
as people do ask for that sometimes:
https://www.postgresql.org/message-id/f02ulk%242r3u%241%40news.hub.org
https://github.com/greenplum-db/gpdb/issues/1246
> However there is no option to specify
> an escape character, no option to specify a quote character, and it is
> not possible to force quotes for all values. Those are huge advantages
> as any output can be made compatible with other CSV variants. Isn't
> what is presented too limited?
The guidelines that the patch has been following are those of RFC 4180 [2]
with two exceptions on the field separator that we can define
and the end of lines that are OS-dependant instead of the fixed CRLF
that IETF seems to see as the norm.
The only reference to escaping in the RFC is:
"If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote"
The problem with using non-standard QUOTE or ESCAPE is that it's a
violation of the format that goes further than choosing a separator
different than comma, which is already a pain point.
We can always add these options later if there is demand. I suspect it
will never happen.
I looked at the 2004 archives when CSV was added to COPY, that's
around commit 862b20b38 in case anyone cares to look, but
I couldn't find a discussion on these options, all I could find is they were
present from the start.
But again COPY is concerned with importing the data that preexists,
even if it's weird, whereas a psql output formats are not.
[1] https://www.postgresql.org/message-id/4C9D2BC5.1080006%40optonline.net
[2] https://tools.ietf.org/html/rfc4180
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Verite | 2018-11-09 16:35:00 | Re: Alternative to \copy in psql modelled after \g |
Previous Message | Alvaro Herrera | 2018-11-09 16:16:24 | Re: notice processors for isolationtester |