From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Daniel Verite <daniel(at)manitou-mail(dot)org> |
Cc: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, hlinnaka <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavel Golub <pavel(at)microolap(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com> |
Subject: | Re: raw output from copy |
Date: | 2016-04-02 17:52:52 |
Message-ID: | 570006F4.9090309@dunslane.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 04/01/2016 11:42 AM, Daniel Verite wrote:
> Andrew Dunstan wrote:
>
>> If someone can make a good case that this is going to be of
>> general use I'll happily go along, but I haven't seen one so far.
> About COPY FROM with a raw format, for instance just yesterday
> there was this user question on stackoverflow:
> http://stackoverflow.com/questions/36317237
>
> which essentially is: how to import contents from a file without any
> particular interpretation of any character?\
There is so much wrong with this it's hard to know where to start.
Inserting the whole contents of a text file unchanged is insanely easy
in psql.
\set file `cat /path/to/file`
insert into mytable(contents) values(:'file');
What is more everyone on SO missed the fact that CSV mode gives you very
considerable control over the quote, delimiter and null settings.
See for example
<http://adpgtech.blogspot.com/2014/09/importing-json-data.html> which
has this example for handling files consisting of 1 json document per line:
copy the_table(jsonfield)
from '/path/to/jsondata'
csv quote e'\x01' delimiter e'\x02';
psql's \copy will work just the same way
(I noticed with amusement this week that CitusData is using pretty much
exactly this in one of their examples.)
>
> With the patch discussed in this thread, a user can do
> \copy table(textcol) from /path/to/file (format raw)
> or the equivalent COPY.
> If it's a binary column, that works just the same.
It would be fairly simple to invent a binary mechanism that did the
equivalent of the above insert. All without any change to SQL or the
backend at all.
>
> Without this, it's not obvious at all how this result can be
> achieved without resorting to external preprocessing,
> and assuming the availability of such preprocessing tools
> in the environment. Notwithstanding the fact that the
> solution proposed on SO (doubling backslashes with sed)
> doesn't even work if the file contains tabs, as they would be
> interpreted as field separators, even if the copy target has only
> one column. You can change the delimiter with COPY but AFAIK
> you can't tell that there is none.
There is arguably a good case for allowing a null delimiter. But that SO
page is just a terrible piece of misinformation, as far too often
happens in my experience.
And I am still waiting for a non-psql use case. But I don't expect to
see one, precisely because most clients have no difficulty at all in
handling binary data.
cheers
andrew
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2016-04-02 18:57:21 | Re: Batch update of indexes |
Previous Message | Tom Lane | 2016-04-02 17:20:59 | Re: Transactional enum additions - was Re: Alter or rename enum value |