From: | Steve Crawford <scrawford(at)pinpointresearch(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org, ivoras(at)freebsd(dot)org |
Subject: | Re: Bulk processing & deletion |
Date: | 2011-10-13 18:08:17 |
Message-ID: | 4E972911.6000101@pinpointresearch.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 10/13/2011 05:20 AM, Ivan Voras wrote:
> Hello,
>
> I have a table with a large number of records (millions), on which the
> following should be performed:
>
> 1. Retrieve a set of records by a SELECT query with a WHERE condition
> 2. Process these in the application
> 3. Delete them from the table
>
Without knowing a bit more, it is difficult to say. A couple questions:
1. Are there conflicting processes - i.e. could multiple applications be
in contention to process the same set of records?
2. Is the processing "all or none" or could individual records fail? If
so, how do you deal with reprocessing or returning those to the main table.
Depending on the nature of your app, it might be feasible to reorder the
actions to move the records to be processed into a "processing" table
and delete them from that table as the records are processed by the
application.
You could move the records into the processing table with:
with foo as (delete from main_table where your_where_clause returning
a,b,c) insert into processing_table (a,b,c) select a,b,c from foo;
In this case I would not recommend temporary or unlogged tables for the
processing table as that becomes the only source of the data once
deleted from the master table.
Cheers,
Steve
From | Date | Subject | |
---|---|---|---|
Next Message | Steve Clark | 2011-10-13 18:23:36 | Re: Ideas for query |
Previous Message | Evan Walter | 2011-10-13 17:22:43 | Re: Trying to use binary replication - from tutorial |