From: | François Beausoleil <francois(at)teksol(dot)info> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | Forums postgresql <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Deduplication and transaction isolation level |
Date: | 2013-09-25 15:25:22 |
Message-ID: | 0AB54E69-0F62-422C-982D-24A09973C107@teksol.info |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Le 2013-09-25 à 09:04, Merlin Moncure a écrit :
> On Tue, Sep 24, 2013 at 10:19 PM, François Beausoleil
> <francois(at)teksol(dot)info> wrote:
>> Hi all!
>>
>> I import many, many rows of data into a table, from three or more computers, 4 times per hour. I have a primary key, and the query I use to import the data is supposed to dedup before inserting, but I still get primary key violations.
>>
>> The import process is:
>>
>> * Load CSV data into temp table
>> * INSERT INTO dest SELECT DISTINCT (pkey) FROM temp WHERE NOT EXISTS(temp.pkey = dest.pkey)
>>
>> I assumed (erroneously) that this would guarantee no duplicate data could make it into the database. The primary key violations are proving me wrong.
>
> Right. Transaction A and B are interleaved: they both run the same
> check against the same id at the same time. Both checks pass because
> neither transaction is committed. This problem is not solvable by
> adjusting the isolation level.
>
> Typical solutions might be to:
> A. Lock the table while inserting
> B. Retry the transaction following an error.
> C. Import the records to a staging table, then copy the do the
> deduplication check when moving from the staging table
You mean:
COPY TO temp FROM stdin;
INSERT INTO staging SELECT DISTINCT FROM temp;
INSERT INTO production SELECT DISTINCT FROM staging;
DELETE FROM staging WHERE inserted into production;
Right? I assume I would not have the primary key constraint on the staging table.
And steps 3 and 4 can be done in a single statement using UPDATE/DELETE CTE.
Thanks!
François
From | Date | Subject | |
---|---|---|---|
Next Message | Murat Kabilov | 2013-09-25 15:29:30 | Problem with updating hstore extension from 9.2 to 9.3 |
Previous Message | rob stone | 2013-09-25 15:17:07 | Re: Incorrect password when restarting a cluster |