From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Darcy Buskermolen <darcy(at)wavefire(dot)com> |
Cc: | pgsql-patches(at)postgresql(dot)org |
Subject: | Re: Note that spaces between QUOTE and DELIMITER are included |
Date: | 2005-09-02 22:44:07 |
Message-ID: | 4318D5B7.7040706@dunslane.net |
Lists: | pgsql-patches |
I wrote:
> Darcy Buskermolen wrote:
>
>> + CSV mode will include all characters between <literal>QUOTE</> and
>> + <literal>DELIMITER</> in the value for the field, this is of special
>> + attention to those who use CSV mode to import data from other RDBMS
>> + systems that create fixed width CSV files.
>>
>
>
> First, this needs some grammar cleanup. But more importantly, it's not
> quite a correct formulation. CSV mode splits a line on (unquoted)
> delimiters. Within each chunk dequoting is done, and within quoted
> sections de-escaping is done. But nothing is discarded.
>
> i.e. with the quote char as '"', 'foo"bar"baz' becomes 'foobarbaz' and
> ' "x" ' becomes ' x '.
>
> I understand Darcy's problem has been that Oracle pads CSV lines with
> spaces. Perhaps we need to warn specifically about that - I suspect
> most people for whom it might be important will miss the significance
> otherwise.
>
> I'll work on some better wording.
>
>
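To make the rule quoted above concrete, here is a rough Python sketch of it. This is only a toy model and not the actual COPY code: the function name split_csv_line and its parameters are made up for illustration, and it assumes a single-line record with '"' as both the quote and the escape character.

```python
def split_csv_line(line, delimiter=",", quote='"', escape='"'):
    """Toy model of the rule described above (hypothetical helper)."""
    fields = []
    current = []
    in_quote = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quote:
            # De-escaping is done only inside quoted sections.
            if ch == escape and i + 1 < len(line) and line[i + 1] in (quote, escape):
                current.append(line[i + 1])
                i += 2
                continue
            if ch == quote:
                in_quote = False  # closing quote is dropped, nothing else is
            else:
                current.append(ch)
        elif ch == quote:
            in_quote = True  # opening quote is dropped
        elif ch == delimiter:
            # Only unquoted delimiters split the line.
            fields.append("".join(current))
            current = []
        else:
            # Everything outside quotes, including white space, is kept.
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields
```

With this model, 'foo"bar"baz' yields the single field 'foobarbaz', and ' "x" ' yields ' x ', so padding outside the quotes ends up in the value.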
How about this?
In CSV mode all characters are significant. A quoted value surrounded by
white space, or any characters other than <literal>DELIMITER</>, will
include those characters. This can cause errors if you import data from
a system that pads CSV lines with white space out to some fixed width.
If such a situation arises you might need to preprocess the CSV file to
remove the trailing white space, before importing the data into Postgres.
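
For what it's worth, the preprocessing step could be as small as the Python sketch below. The function name strip_trailing_padding is made up for illustration, and the sketch assumes that no quoted value contains an embedded newline; if one does, white space at the end of a physical line might be part of the data and should not be stripped.

```python
def strip_trailing_padding(lines):
    """Yield each line with trailing spaces and tabs removed.

    Hypothetical helper; assumes one CSV record per physical line.
    """
    for line in lines:
        body = line.rstrip("\r\n")
        newline = line[len(body):]  # keep the original line ending as-is
        yield body.rstrip(" \t") + newline
```

Something like sys.stdout.writelines(strip_trailing_padding(sys.stdin)) would turn this into a filter to run over the file before COPY. Note that it only removes padding at the end of each line, not white space between a closing quote and a delimiter in the middle of a line.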
cheers
andrew