From: | Patrick B Kelly <pbk(at)patrickbkelly(dot)org> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: multiline CSV fields |
Date: | 2004-11-12 03:35:07 |
Message-ID: | D97EBB68-345B-11D9-B14C-000A958A3956@patrickbkelly.org |
Lists: | pgsql-hackers pgsql-patches |
On Nov 11, 2004, at 10:07 PM, Andrew Dunstan wrote:
>
>
> Patrick B Kelly wrote:
>
>>
>>
>>
>> My suggestion is to simply have CopyReadLine recognize these two
>> states (in-field and out-of-field) and execute the current logic only
>> while in the second state. It would not be too hard but as you
>> mentioned it is non-trivial.
>>
>>
>>
>
> We don't know what state we expect the end of line to be in until
> after we have actually read the line. To know how to treat the end of
> line on your scheme we would have to parse as we go rather than after
> reading the line as now. Changing this would not only be
> non-trivial but significantly invasive to the code.
>
>
Perhaps I am misunderstanding the code. As I read it, the code currently
goes through the input character by character looking for NL and EOF
characters. It appears to be very well structured for what I am
proposing. The section in question is a small and clearly defined loop
that reads the input one character at a time and decides when it has
reached the end of the line or file. Each call of CopyReadLine attempts
to get one more line. I would propose that each call start out in
the out-of-field state and that the state be toggled by each unescaped
quote it encounters in the stream. While in the in-field state, it
would look only for the next unescaped quote; while in the
out-of-field state, it would execute the existing logic as well as
look for the next unescaped quote.
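
The two-state scan described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not PostgreSQL's actual CopyReadLine: the function name read_csv_record, its signature, and the choice to read from an in-memory NUL-terminated string are all hypothetical, chosen to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL code): copy one logical CSV record
 * from input, starting at *pos, into out. A newline ends the record only
 * while the scanner is in the out-of-field state. Output that does not fit
 * in out is silently truncated. Returns false once the input is exhausted.
 */
bool
read_csv_record(const char *input, size_t *pos, char *out, size_t outsize,
                char quote, char escape)
{
    bool   in_field = false;    /* every call starts out-of-field */
    size_t n = 0;
    size_t start;

    assert(input != NULL && pos != NULL && out != NULL && outsize > 0);
    start = *pos;

    while (input[*pos] != '\0')
    {
        char c = input[(*pos)++];

        if (in_field && escape != quote && c == escape && input[*pos] != '\0')
        {
            /* Escaped character inside a field: keep both bytes, no toggle. */
            if (n + 1 < outsize)
                out[n++] = c;
            c = input[(*pos)++];
        }
        else if (c == quote)
        {
            /* A doubled quote ("") toggles twice, so the state is unchanged. */
            in_field = !in_field;
        }
        else if (!in_field && c == '\n')
        {
            break;              /* real end of record; newline is dropped */
        }

        if (n + 1 < outsize)
            out[n++] = c;
    }

    out[n] = '\0';
    return *pos > start;
}
```

Called repeatedly on the input `a,"x<NL>y",b<NL>c,d<NL>`, this sketch would yield the record `a,"x<NL>y",b` first and `c,d` second, leaving the embedded newline in place for the later field-parsing step. Carriage returns and buffer refills, which the real loop has to deal with, are deliberately left out.
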
I may not be explaining myself well or I may fundamentally
misunderstand how copy works. I would be happy to code the change and
send it to you for review, if you would be interested in looking it
over and it is felt to be a worthwhile capability.

Patrick B. Kelly
------------------------------------------------------
http://patrickbkelly.org