From: | "Andrew Dunstan" <andrew(at)dunslane(dot)net> |
---|---|
To: | <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: COPY for CSV documentation |
Date: | 2004-04-11 13:12:56 |
Message-ID: | 2365.24.211.141.25.1081689176.squirrel@www.dunslane.net |
Lists: | pgsql-patches |
Bruce Momjian said:
>> >Yes, my worry is that someone will use a multibyte character that the
>> >system sees as several bytes and enters CSV mode.
>> >
>>
>>
>> How about if we specify it explicitly, like BINARY, instead of it
>> being implied by the length of DELIMITER?
>>
>> COPY a FROM stdin CSV DELIMITER ',"';
>>
>> That would make the patch somewhat more extensive, but maybe not
>> hugely more invasive (I tried to keep it as uninvasive as possible).
>> I could do that, I think.
>
> That's what I was wondering. Is triggering CSV for multi-character
> delimiters a little too clever? This reminds me of the use of LIMIT
> X,Y with no indication which is limit and which is offset.
>
> We certainly could code to prevent the multibyte problem I mentioned,
> but should we?
I confess that in my anglocentric world I have remained lamentably
ignorant of how MBCS works. Having read up a little and looked over
some of our code (e.g. the scanner), it looks like the simple solution
would be to check that the delimiter contains only 7-bit ASCII characters,
i.e. that no byte has the high bit set. (I assume that ASCII
is a subset of every MBCS we support - is that correct?)
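A minimal sketch of that check, purely for illustration: the function name delimiter_is_ascii is hypothetical and not part of the patch, and it relies on the assumption above that ASCII is a subset of every encoding we support, so that a byte with the high bit clear is always a complete single-byte character.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): returns true only if every
 * byte of the delimiter string has the high bit clear, i.e. the string
 * consists solely of 7-bit ASCII characters.
 */
static bool
delimiter_is_ascii(const char *delim)
{
    size_t len = strlen(delim);

    for (size_t i = 0; i < len; i++)
    {
        /* Cast to unsigned char so the test does not depend on char signedness. */
        if ((unsigned char) delim[i] & 0x80)
            return false;
    }
    return true;
}
```

With a check like this, a two-character delimiter such as ',"' would pass, while a single multibyte character that merely occupies several bytes would be rejected instead of silently switching on CSV mode.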
However ...
>
> I am thinking just:
>
>> COPY a FROM stdin WITH CSV ',"';
>
> or
>
>> COPY a FROM stdin WITH DELIMITER ',' QUOTE '"' EQUOTE '"';
>
> EQUOTE for embedded quote. These are used in very limited situations
> and don't have to be reserved words or anything.
>
> I can help with these changes if folks like them.
>
I prefer the first, because it ensures things are specified
together.
If you want to do that I will work on some regression tests.
cheers
andrew