From: | Emmanuel Cecchet <manu(at)frogthinker(dot)org> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Emmanuel Cecchet <manu(at)asterdata(dot)com>, Emmanuel Cecchet <Emmanuel(dot)Cecchet(at)asterdata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: COPY enhancements |
Date: | 2009-10-13 13:57:44 |
Message-ID: | 4AD48758.7090502@frogthinker.org |
Lists: | pgsql-hackers |
Tom Lane wrote:
> Ultimately, there's always going to be a tradeoff between speed and
> flexibility. It may be that we should just say "if you want to import
> dirty data, it's gonna cost ya" and not worry about the speed penalty
> of subtransaction-per-row. But that still leaves us with the 2^32
> limit. I wonder whether we could break down COPY into sub-sub
> transactions to work around that...
>
Regarding that tradeoff between speed and flexibility, I think we could
propose multiple options:
- maximum speed: the current implementation, which fails on the first error
- speed with error logging: the COPY command still fails if there is an
error, but continues through the input so that all errors are logged
- speed with best-effort error logging: no sub-transactions, but errors
that can safely be trapped with PG_TRY/PG_CATCH (no index violation, no
BEFORE INSERT trigger, etc.) are logged and the command can complete
- pre-loading (2-phase copy): phase 1 copies good tuples into a [temp]
table and bad tuples into an error table; phase 2 pushes the good tuples
to the destination table. Note that if phase 2 fails, it could be retried,
since the temp table would be dropped only once phase 2 succeeds.
- slow but flexible: run every row in its own sub-transaction -> are there
any real benefits compared to pgloader?
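To make the pre-loading (2-phase copy) option a bit more concrete, here is a rough sketch of the control flow. It is only an illustration, not a proposed implementation: it uses Python's built-in sqlite3 module as a stand-in for the backend, and the names phase1_stage, phase2_publish, staging, load_errors and dest, as well as the two-column "id,val" input format, are all assumptions made up for the example.

```python
import sqlite3


def phase1_stage(conn, rows):
    """Phase 1 (hypothetical helper): split input into good and bad tuples.

    Good tuples go to a temp staging table, bad ones to an error table.
    The destination table is not touched in this phase.
    """
    cur = conn.cursor()
    cur.execute("CREATE TEMP TABLE IF NOT EXISTS staging (id INTEGER, val TEXT)")
    cur.execute("CREATE TABLE IF NOT EXISTS load_errors (raw TEXT, reason TEXT)")
    for raw in rows:
        try:
            # Assumed input format: "id,val" with an integer id.
            id_text, val = raw.split(",")
            cur.execute("INSERT INTO staging VALUES (?, ?)", (int(id_text), val))
        except ValueError as exc:
            # Malformed line: log it and keep going.
            cur.execute("INSERT INTO load_errors VALUES (?, ?)", (raw, str(exc)))
    conn.commit()


def phase2_publish(conn):
    """Phase 2 (hypothetical helper): push staged tuples to the destination.

    The staging table is dropped only if the insert succeeds, so a failed
    phase 2 leaves it in place and can simply be retried.
    """
    cur = conn.cursor()
    try:
        cur.execute("INSERT INTO dest SELECT id, val FROM staging")
        cur.execute("DROP TABLE staging")
        conn.commit()
    except sqlite3.Error:
        conn.rollback()
        raise
```

In this sketch, errors that only show up against the destination table (a unique-key conflict, for instance) surface in phase 2 rather than phase 1. The caller can fix the conflict and call phase2_publish again without re-reading the input, because the staged tuples are still there.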
Tom also suggested 'refactoring COPY into a series of steps that
the user can control'. What would these steps be? Would they operate per
row and allow a bad tuple to be discarded?
Emmanuel
--
Emmanuel Cecchet
FTO @ Frog Thinker
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu(at)frogthinker(dot)org
Skype: emmanuel_cecchet