From: Ivan Voras <ivoras(at)freebsd(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Bulk processing & deletion
Date: 2011-10-13 12:20:45
Message-ID: j76l2u$d7h$1@dough.gmane.org
Lists: pgsql-general
Hello,
I have a table with a large number of records (millions) on which the
following steps should be performed:
1. Retrieve a set of records by a SELECT query with a WHERE condition
2. Process these in the application
3. Delete them from the table
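Concretely, here is a minimal sketch of the naive version of this loop,
assuming psycopg2 and a hypothetical "events" table with a timestamp
column; process() stands in for the real application logic:

import psycopg2
from datetime import datetime, timedelta

cutoff = datetime.now() - timedelta(days=7)  # hypothetical condition

def process(payload):
    pass  # application-specific processing goes here

conn = psycopg2.connect("dbname=mydb")
cur = conn.cursor()

# Step 1: retrieve the matching records
cur.execute("SELECT id, payload FROM events WHERE created < %s",
            (cutoff,))
rows = cur.fetchall()

# Step 2: process them in the application
for row_id, payload in rows:
    process(payload)

# Step 3: delete with the same WHERE condition -- the unsafe part
cur.execute("DELETE FROM events WHERE created < %s", (cutoff,))
conn.commit()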
Now, under the default Read Committed transaction isolation, I can't
just reuse the same WHERE condition in the DELETE in step 3, as it
might delete more records than were processed in step 1 (i.e. a
phantom read). I've thought of several ways around this and would like
some feedback on which would be the most efficient:
#1: Create a giant DELETE ... WHERE ... IN (...) statement for step 3,
listing the primary keys of the records from step 1 - but will it hit a
statement length limit in the database? Is there such a limit, and if
so, what is it?
#2: Same as #1, but batching the keys, e.g. 1000 at a time, all in one
transaction (sketched below)
#3: Use a higher isolation level, probably Repeatable Read (this is PG
9.0) - but then the question is: will this block other clients from
inserting new data into the table? Also, is Repeatable Read enough?
(also sketched below)
Any other ideas?