From: | Chris Browne <cbbrowne(at)acm(dot)org> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: COPY formatting |
Date: | 2004-03-17 19:03:10 |
Message-ID: | 60u10nz47l.fsf@dev6.int.libertyrms.info |
Lists: | pgsql-hackers |
andrew(at)dunslane(dot)net (Andrew Dunstan) writes:
> Karel Zak wrote:
>
>> Hi,
>>
>> The TODO has the item "* Allow dump/load of CSV format". I don't think
>> it's a clean idea. Why CSV, and why not some other format? :-)
>>
>> And why not give users full control of the format through their own
>> function? That means something like:
>>
>>    COPY tablename [ ( column [, ...] ) ]
>>        TO { 'filename' | STDOUT }
>>        [ [ WITH ]
>>              [ BINARY ]
>>              [ OIDS ]
>>              [ DELIMITER [ AS ] 'delimiter' ]
>>              [ NULL [ AS ] 'null string' ]
>>              [ FORMAT funcname ] ]
>>                ^^^^^^^^^^^^^^^^
>>
>> The formatting function API can be pretty simple:
>>
>> text *my_copy_format(text *attrdata, int direction, int nattrs,
>>                      int attr, oid attrtype, oid relation)
>>
>> -- it's pseudocode of course; it should use the standard fmgr
>> interface.
>>
>> It's probably interesting for the non-binary version of COPY.
>
> Interesting ... The alternative might be an external program to munge
> CSVs and whatever other format people want to support and then call
> the existing COPY, either in bin or contrib. I have seen lots of
> people wanting to import CSVs, and that's even before we get a Windows
> port.
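
As a rough illustration of the "external program" route suggested in the quote above, here is a minimal Python sketch that rewrites CSV into the tab-delimited text that plain COPY reads by default. The function names and the `empty_is_null` switch are invented for this example, and treating empty CSV fields as NULL is an assumption, not something the thread specifies.

```python
import csv
import io


def copy_escape(value):
    """Escape one field for COPY's default text format (hypothetical helper)."""
    if value is None:
        return r"\N"  # COPY's default NULL marker in text format
    # Backslash must be escaped first so later escapes are not doubled.
    return (value.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_copy_text(csv_text, empty_is_null=True):
    """Convert CSV text to tab-delimited lines suitable for COPY ... FROM STDIN.

    Assumption: an empty CSV field means NULL when empty_is_null is True.
    """
    lines = []
    for row in csv.reader(io.StringIO(csv_text, newline="")):
        fields = [None if (empty_is_null and f == "") else f for f in row]
        lines.append("\t".join(copy_escape(f) for f in fields))
    return "\n".join(lines) + "\n" if lines else ""
```

The output could then be piped to something like `psql -c "COPY tablename FROM STDIN"`, leaving the backend's COPY code untouched.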
I know Jan Wieck has been working on something like this, with a bit
of further smarts...
- Given a table definition alongside the data, the table can be
created at the same time as the load;
- A set of mapping functions can be used, so that if, for instance,
the program generating the data was Excel, and you have a field with
values like 37985, 38045, or 38061, they can respectively be mapped
to '2004-01-01', '2004-03-01', and '2004-03-17';
- It can load whatever data is loadable, and use Ethernet-like
backoffs when it encounters bad records, so that it loads all the
data that is good and leaves behind a bundle of 'crud'.
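
The mapping-function idea in the list above might look roughly like the following Python sketch. It is not Jan's code; the names `serial_to_iso`, `COLUMN_MAPPERS`, `map_row` and the `order_date` column are hypothetical. It assumes the serial is a plain day count from 1900-01-01, which is what reproduces the three values quoted above. Excel's own 1900 date system is usually handled with an 1899-12-30 epoch, so serials taken straight from a spreadsheet may need that epoch instead.

```python
from datetime import date, timedelta


def serial_to_iso(serial, epoch=date(1900, 1, 1)):
    """Map a day-count serial to an ISO date string.

    Assumption: the default epoch treats the serial as days since
    1900-01-01; pass epoch=date(1899, 12, 30) for Excel's 1900 system.
    """
    return (epoch + timedelta(days=int(serial))).isoformat()


# Hypothetical per-column mapping table: column name -> conversion function.
COLUMN_MAPPERS = {"order_date": serial_to_iso}


def map_row(row, mappers=COLUMN_MAPPERS):
    """Apply the mapper for each column, passing unmapped columns through."""
    return {col: mappers.get(col, lambda v: v)(val) for col, val in row.items()}
```

For example, `map_row({"id": "7", "order_date": "38061"})` leaves `id` alone and turns the date field into `'2004-03-17'`.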
He had been prototyping it in Tcl; I'm not sure how far a port to C
has gotten. It looked pretty neat; it sure seems better to put the
"cleverness" in userspace than to try to increase the complexity of
the postmaster...
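
One possible reading of the "Ethernet-like backoff" behaviour, sketched in Python under stated assumptions: the loader sends rows in batches, halves the batch size when a batch is rejected, doubles it again after a success, and sets aside any single row that still fails. The `insert_batch` callback and its all-or-nothing failure behaviour are assumptions made for this example; the prototype may well work differently.

```python
def load_with_backoff(rows, insert_batch, max_batch=1000):
    """Load rows in batches, shrinking the batch when a bad record is hit.

    insert_batch(batch) is a hypothetical callback that either loads the
    whole batch or raises an exception and loads nothing.
    Returns (number_of_rows_loaded, list_of_rejected_rows).
    """
    loaded, rejected = 0, []
    pos, size = 0, max_batch
    while pos < len(rows):
        batch = rows[pos:pos + size]
        try:
            insert_batch(batch)
        except Exception:
            if len(batch) == 1:
                rejected.append(batch[0])  # isolated a bad record: set it aside
                pos += 1
            else:
                size = max(1, len(batch) // 2)  # back off: retry with a smaller batch
            continue
        loaded += len(batch)
        pos += len(batch)
        size = min(max_batch, size * 2)  # recover: grow the batch again
    return loaded, rejected
```

The rejected list plays the role of the leftover bundle of 'crud', which can be inspected and fixed by hand after the good data is in.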
--
output = ("cbbrowne" "@" "cbbrowne.com")
http://cbbrowne.com/info/linuxxian.html
Have you heard of the new Macsyma processor? It has three instructions --
LOAD, STORE, and SKIP IF INTEGRABLE.