| From: | Craig Ringer <ringerc(at)ringerc(dot)id(dot)au> |
|---|---|
| To: | David Johnston <polobo(at)yahoo(dot)com> |
| Cc: | 'Rich Shepard' <rshepard(at)appl-ecosys(dot)com>, pgsql-general(at)postgresql(dot)org |
| Subject: | Re: COPY from .csv File and Remove Duplicates |
| Date: | 2011-08-12 03:04:20 |
| Message-ID: | 4E449834.1020405@ringerc.id.au |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-general |
On 12/08/2011 10:32 AM, David Johnston wrote:
> The general structure for the insert would be:
>
> INSERT INTO maintable (cols)
> SELECT cols FROM staging WHERE staging.idcols NOT IN (SELECT
> maintable.idcols FROM maintable);
>
> There may be more efficient ways to write the query but the idea is the
> same.
Yeah... I'd favour an EXISTS test or a join.
INSERT INTO maintable (cols)
SELECT cols FROM staging WHERE NOT EXISTS (SELECT
1 FROM maintable WHERE maintable.idcol = staging.idcol);
... as the NOT IN(...) test can have less than lovely behavior for large
key sets.
--
Craig Ringer
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Greg Smith | 2011-08-12 04:57:15 | Re: Postgres on SSD |
| Previous Message | David Johnston | 2011-08-12 02:32:37 | Re: COPY from .csv File and Remove Duplicates |