Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines

From: Francisco Olarte <folarte(at)peoplecall(dot)com>
To: Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: Dominique Devienne <ddevienne(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines
Date: 2022-03-13 08:41:09
Message-ID: CA+bJJbywV2J0sSFxoVhnAb59oask-=RbOctf9BvepROw0PNsjg@mail.gmail.com
Lists: pgsql-general

Hi Daniel:

On Fri, 11 Mar 2022 at 19:38, Daniel Verite <daniel(at)manitou-mail(dot)org> wrote:
> > These values are 'normal'. I'm not use to CSV, but I suppose
> > such newlines
> > must be encoded, perhaps as \n, since AFAIK CSV needs to be 1 line per row,
> > no?
> No, but such fields must be enclosed by double quotes, as documented
> in RFC 4180 https://datatracker.ietf.org/doc/html/rfc4180

CSV is really poisonous. In the Multiplan days, when it was nearly
RFC4180, it was tolerable, but these days, when everybody uses Excel to
spit out "localized CSV", it is hell (in Spain it uses ; as the delimiter
because it localizes numbers with , as the decimal separator; you may have
similar problems).

Anyway, I was going to point out that RFC4180 is a bit misleading. In 2.1
(section 2, item 1) it states:
>>>
1. Each record is located on a separate line, delimited by a line
break (CRLF). For example:

aaa,bbb,ccc CRLF
zzz,yyy,xxx CRLF
<<<

Which may lead you to believe you can read it line by line, but a few
lines later, in 2.6, it says:

>>>
6. Fields containing line breaks (CRLF), double quotes, and commas
should be enclosed in double-quotes. For example:

"aaa","b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
<<<

Which somehow contradicts 2.1.
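
To make the mismatch concrete, here is a small sketch in Python (my choice
for illustration, nothing in the thread depends on it). It feeds the 2.6
example to a naive line splitter and to the standard csv module; the
variable names are made up for the example.

```python
import csv
import io

# The example from RFC 4180 section 2, item 6, with CRLF written as \r\n.
data = '"aaa","b\r\nbb","ccc"\r\nzzz,yyy,xxx'

# Naive approach: assume one record per line.
# This cuts the quoted field "b CRLF bb" in half.
naive_lines = data.split("\r\n")

# CSV-aware approach: newline="" stops StringIO from translating line
# breaks, so the parser sees the embedded CRLF exactly as written.
records = list(csv.reader(io.StringIO(data, newline="")))

print(len(naive_lines))  # number of physical lines
print(len(records))      # number of logical records
print(records[0])
```

The splitter sees three physical lines while the parser sees two records,
so counting or processing "rows" by line only works when no field
contains a line break.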

In C/C++ it's easily parsed with a simple state machine reading char
by char, which is one of the strong points of those languages, but
reading lines as strings usually leads to complex logic.
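
A minimal sketch of such a state machine follows, written in Python rather
than C only to keep it short; the structure carries over directly. The
function name parse_csv, the state names, and the lenient handling of
stray characters are my assumptions, not anything mandated by RFC4180 or
used by COPY.

```python
def parse_csv(text, delimiter=",", quote='"'):
    """Parse CSV text one character at a time; returns a list of records."""
    FIELD_START, UNQUOTED, QUOTED, QUOTE_IN_QUOTED = range(4)
    state = FIELD_START
    records, record, field = [], [], []

    def end_field():
        record.append("".join(field))
        field.clear()

    def end_record():
        end_field()
        records.append(list(record))
        record.clear()

    for ch in text:
        if state == QUOTED:
            if ch == quote:
                state = QUOTE_IN_QUOTED
            else:
                field.append(ch)  # delimiters and line breaks are data here
            continue
        if state == QUOTE_IN_QUOTED:
            if ch == quote:  # a doubled quote is an escaped quote
                field.append(quote)
                state = QUOTED
                continue
            state = UNQUOTED  # it was the closing quote; handle ch below
        if ch == quote and state == FIELD_START:
            state = QUOTED
        elif ch == delimiter:
            end_field()
            state = FIELD_START
        elif ch == "\n":
            end_record()
            state = FIELD_START
        elif ch == "\r":
            pass  # assumption: a CR outside quotes belongs to a CRLF
        else:
            field.append(ch)
            state = UNQUOTED

    # Flush a final record that has no trailing line break.
    if state != FIELD_START or field or record:
        end_record()
    return records


rows = parse_csv('"aaa","b\r\nbb","ccc"\r\nzzz,yyy,xxx')
print(rows)
```

The point is that the line break is just another input character whose
meaning depends on the current state, so there is no need to glue
partial lines back together.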

Francisco Olarte.
