Re: multiline CSV fields

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Patrick B Kelly <pbk(at)patrickbkelly(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: multiline CSV fields
Date: 2004-11-11 19:56:35
Message-ID: 4193C3F3.9090009@dunslane.net
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:

>Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>
>>Patrick B Kelly wrote:
>>
>>
>>>Actually, when I try to export a sheet with multi-line cells from
>>>excel, it tells me that this feature is incompatible with the CSV
>>>format and will not include them in the CSV file.
>>>
>>>
>
>
>
>>It probably depends on the version. I have just tested with Excel 2000
>>on a WinXP machine and it both read and wrote these files.
>>
>>
>
>I'd be inclined to define Excel 2000 as broken, honestly, if it's
>writing unescaped newlines as data. To support this would mean throwing
>away most of our ability to detect incorrectly formatted CSV files.
>A simple error like a missing close quote would look to the machine like
>the rest of the file is a single long data line where all the newlines
>are embedded in data fields. How likely is it that you'll get a useful
>error message out of that? Most likely the error message would point to
>the end of the file, or at least someplace well removed from the actual
>mistake.
>
>I would vote in favor of removing the current code that attempts to
>support unquoted newlines, and waiting to see if there are complaints.
>
>
>
>
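
The failure mode described in the quoted text can be reproduced with any CSV parser that accepts newlines inside quoted fields. The sketch below uses Python's standard `csv` module purely as a stand-in; it is not the COPY code, and the sample data and the helper names `parse_lenient` and `parse_strict` are made up for illustration. The closing quote is missing on line 2. A lenient parse folds the rest of the file into one field, and a strict parse only notices the problem at end of input.

```python
import csv
import io

# Hypothetical data: the closing quote is missing after "first" on line 2.
BAD = 'id,note\n1,"first\n2,second\n3,third\n'


def parse_lenient(text):
    """Parse with the csv module's default (non-strict) settings."""
    return list(csv.reader(io.StringIO(text, newline="")))


def parse_strict(text):
    """Parse in strict mode.

    Returns (rows_read_so_far, error_or_None, last_physical_line_read).
    """
    reader = csv.reader(io.StringIO(text, newline=""), strict=True)
    rows = []
    try:
        for row in reader:
            rows.append(row)
    except csv.Error as exc:
        return rows, exc, reader.line_num
    return rows, None, reader.line_num


lenient_rows = parse_lenient(BAD)
strict_rows, strict_error, error_line = parse_strict(BAD)

print(len(lenient_rows), "rows from the lenient parse")
print("strict parse failed after reading line", error_line, "-", strict_error)
```

The mistake is on line 2, but the strict parser only reports it once the input runs out, which is the "points to the end of the file" behaviour the quoted text worries about.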

This feature was specifically requested when we discussed what sort of
CSVs we would handle.

And it does in fact work, as long as newlines embedded in fields use the
same newline style as the line endings of the file itself.

I just had an idea. How about if we add a new CSV option, MULTILINE? If
it is absent, then on output we would not emit unescaped LF/CR characters,
and on input we would not allow fields with embedded unescaped LF/CR
characters. In both cases we could error out for now, with perhaps an
8.1 TODO to provide some other behaviour.
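
To make the proposed semantics concrete, here is a rough sketch in Python, again using the standard `csv` module as a stand-in rather than the COPY code. The names `read_csv`, `write_csv`, `MultilineFieldError` and the `multiline` flag are hypothetical; they only model the idea of "error out unless MULTILINE was given" and assume every field is a string.

```python
import csv
import io


class MultilineFieldError(ValueError):
    """A field contains CR or LF but the (hypothetical) MULTILINE option is off."""


def _check_fields(fields, multiline, where):
    # With MULTILINE on, embedded newlines are accepted as data.
    if multiline:
        return
    for pos, value in enumerate(fields, start=1):
        if "\n" in value or "\r" in value:
            raise MultilineFieldError(
                f"{where}: field {pos} contains an embedded newline "
                "and MULTILINE was not specified"
            )


def read_csv(text, multiline=False):
    """Parse CSV text, rejecting multi-line fields unless multiline=True."""
    reader = csv.reader(io.StringIO(text, newline=""))
    rows = []
    for row in reader:
        _check_fields(row, multiline, f"input, near line {reader.line_num}")
        rows.append(row)
    return rows


def write_csv(rows, multiline=False):
    """Serialize rows of strings, rejecting multi-line fields unless multiline=True."""
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for number, row in enumerate(rows, start=1):
        _check_fields(row, multiline, f"output row {number}")
        writer.writerow(row)
    return out.getvalue()
```

Under this sketch, `read_csv('1,"two\nlines"\n')` raises `MultilineFieldError`, while the same call with `multiline=True` returns the field with its newline intact. The check runs after the parser has assembled a record, so on its own it does not fix the unhelpful error position for a missing close quote; that would need the parser itself to stop at the first newline inside a quoted field.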

Or we could drop the whole multiline "feature" for now and make the
whole thing an 8.1 item, although it would be a bit of a pity when it
does work in what will surely be the most common case.

cheers

andrew
