From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org> |
Subject: | fix CSV multiline parsing - proof of concept |
Date: | 2005-02-06 16:15:56 |
Message-ID: | 420642BC.4000806@dunslane.net |
Lists: | pgsql-patches |
Attached is a proof-of-concept patch (i.e. not intended for application
just yet) to fix the problem of parsing CSV multiline fields.
Originally I indicated that the way to solve this IMHO was to combine
the reading and parsing phases of COPY for CSV. However, there's a
lot going on there and I adopted a somewhat less invasive approach,
which detects if a CR and/or NL should be part of a data value and if so
treats it as just another character. Also, it removes the escaping
nature of backslash for NL and CR in CSV, which is clearly a bug.
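To illustrate the idea (this is not the patch itself, which is C code inside COPY), here is a minimal Python sketch of line splitting that tracks whether we are inside a quoted field. The function name `split_csv_records` and the assumption that the escape character equals the quote character (so a doubled quote leaves the state unchanged) are mine, chosen for illustration only. Backslash gets no special treatment here.

```python
def split_csv_records(data: str, quote: str = '"') -> list[str]:
    """Split raw CSV text into records, keeping CR/NL that sit inside quotes.

    Hypothetical helper: assumes the escape character is the quote character,
    so a doubled quote ("") toggles the state twice and leaves it unchanged.
    """
    records = []
    current = []
    in_quote = False
    i = 0
    n = len(data)
    while i < n:
        ch = data[i]
        if ch == quote:
            in_quote = not in_quote
            current.append(ch)
        elif ch in "\r\n" and not in_quote:
            # An unquoted line ending terminates the record; CRLF counts once.
            if ch == "\r" and i + 1 < n and data[i + 1] == "\n":
                i += 1
            records.append("".join(current))
            current = []
        else:
            # Includes CR/NL inside quotes: just another data character.
            current.append(ch)
        i += 1
    if current:
        records.append("".join(current))
    return records
```

For example, `split_csv_records('a,"line1\nline2",b\nc,d\n')` yields two records, the first of which still contains the embedded newline.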
One thing I noticed is that (unless I misread the code) our standard
detection of the end marker \.<EOL> doesn't seem to require that it be
at the beginning of a line, as the docs say it should. I didn't change
that but did build a test for it into the special CSV code.
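The difference between the two readings of the end-marker rule can be sketched as follows. Both function names are hypothetical, and the "loose" version only models the behaviour I suspect from reading the code; it is not a confirmed description of what COPY does.

```python
END_MARKER = "\\."  # the two characters backslash and period


def is_end_marker_strict(line: str) -> bool:
    """Documented rule: the marker must be the whole line."""
    return line.rstrip("\r\n") == END_MARKER


def is_end_marker_loose(line: str) -> bool:
    """Suspected behaviour: the marker only has to precede the EOL."""
    return line.rstrip("\r\n").endswith(END_MARKER)
```

A line such as `abc\.` followed by EOL is rejected by the strict check but accepted by the loose one, which is the discrepancy described above.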
Comments welcome.
cheers
andrew
Attachment | Content-Type | Size |
---|---|---|
copy-csv-multiline.patch | text/x-patch | 8.8 KB |