Re: Search then Delete Performance

From: Dann Corbit <DCorbit(at)connx(dot)com>
To: 'John R Pierce' <pierce(at)hogranch(dot)com>, Michael Hull <mikehulluk(at)googlemail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Search then Delete Performance
Date: 2010-09-15 04:15:16
Message-ID: 87F42982BF2B434F831FCEF4C45FC33E345037F2@EXCHANGE.corporate.connx.com
Lists: pgsql-general

> -----Original Message-----
> From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-
> owner(at)postgresql(dot)org] On Behalf Of John R Pierce
> Sent: Tuesday, September 14, 2010 8:41 PM
> To: Michael Hull
> Cc: pgsql-general(at)postgresql(dot)org
> Subject: Re: [GENERAL] Search then Delete Performance
>
> On 09/14/10 5:55 PM, Michael Hull wrote:
> > So fairly simply, I have a daemon running on a machine, which accesses
> > this DB. Clients connect and request the details for say 1000
> > simulations, at which point the daemon takes 1000 entries from the
> > unassigned table and moves them to the assigned table. Then once the
> > client is finished with those jobs, it signals this to the daemon,
> > which then moves those jobs from 'assigned' to 'complete'.
> >
> > So this is fairly simple to implement, but my problem is that it is
> > very slow.
> >
> >
>
> Instead of moving data from one table to another, it might be better to
> just have a table of simulations, then another table which just contains
> the PK of each simulation, and a flag that says it's assigned or
> unassigned (and maybe the client it's assigned to? and anything else
> that's related to this assignment?)... so instead of moving your big
> table rows, which involves deleting them from one table and inserting
> them into another, you just update the row of this small table. If you
> create this small table with a fillfactor like 75%, the updates likely
> will easily be handled by HOT.

Or just a status integer in the main table along the lines of:
1 = unassigned
2 = assigned
3 = running
4 = completed
Etc.

And then update the status as appropriate and check the status as needed.

If you wait until a batch is done, you would also be able to update the whole batch with a single statement, like this:

UPDATE jobs SET status = 4 WHERE status = 3;
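
To make the status-column idea concrete, here is a minimal sketch of the claim/complete cycle. It is only an illustration under assumptions: it uses Python's built-in sqlite3 module rather than PostgreSQL so it can run anywhere, and the column names (id, status, client), the index, and the helper functions are all made up for the example. It also assumes a single daemon connection does the assigning; with several concurrent writers you would need row locking on top of this.

```python
import sqlite3

# Status codes follow the numbering suggested above.
UNASSIGNED, ASSIGNED, RUNNING, COMPLETED = 1, 2, 3, 4


def create_jobs(conn, n):
    """Create a hypothetical jobs table holding n unassigned jobs."""
    conn.execute(
        "CREATE TABLE jobs ("
        " id INTEGER PRIMARY KEY,"
        " status INTEGER NOT NULL DEFAULT 1,"
        " client TEXT)"
    )
    # An index on status keeps the lookup of unassigned jobs cheap.
    conn.execute("CREATE INDEX jobs_status_idx ON jobs (status)")
    conn.executemany(
        "INSERT INTO jobs (id) VALUES (?)", [(i,) for i in range(1, n + 1)]
    )
    conn.commit()


def claim_batch(conn, client, batch_size):
    """Mark up to batch_size unassigned jobs as assigned; return their ids."""
    with conn:  # commits on success, rolls back on error
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM jobs WHERE status = ? ORDER BY id LIMIT ?",
                (UNASSIGNED, batch_size),
            )
        ]
        # No rows are moved between tables; only the small columns change.
        conn.executemany(
            "UPDATE jobs SET status = ?, client = ? WHERE id = ?",
            [(ASSIGNED, client, job_id) for job_id in ids],
        )
    return ids


def complete_batch(conn, client):
    """Mark every job assigned to this client as completed; return the count."""
    with conn:
        cur = conn.execute(
            "UPDATE jobs SET status = ? WHERE status = ? AND client = ?",
            (COMPLETED, ASSIGNED, client),
        )
    return cur.rowcount
```

A daemon would call claim_batch(conn, "client-a", 1000) when a client connects and complete_batch(conn, "client-a") when it reports back. The SQL itself should carry over to PostgreSQL with only the placeholder syntax of the driver changing.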

There are also lots of job schedulers on SourceForge:
http://sourceforge.net/search/?words=scheduler+workflow&type_of_search=soft&sort=latest_file_date&sortdir=desc&limit=100
