Re: maintaining a reference to a fetched row

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Brian Karlak <zenkat(at)metaweb(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: maintaining a reference to a fetched row
Date: 2009-11-05 04:27:27
Message-ID: f67928030911042027h4b613fe5nd2c12809f19d615c@mail.gmail.com
Lists: pgsql-performance

On Wed, Nov 4, 2009 at 9:41 AM, Brian Karlak <zenkat(at)metaweb(dot)com> wrote:
>
> I do a select for update in a stored proc:
>
> FOR queue_item IN
>
>   SELECT *  FROM queue
>    WHERE status IS NULL AND id >= low_bound_id
>    ORDER BY id LIMIT batch_size
>      FOR UPDATE
>
> LOOP
>
>   UPDATE queue_proc set status = 'proc' where id = queue_item.id ;
>
> The daemons keep track of their last position in the queue with
> low_bound_id.  Also, as you probably notice, I also fetch a batch of (100)
> items at a time.  In practice, it's pretty fast.  The job I'm running now is
> showing an average fetch time of 30ms per 100 actions, which ain't bad.
>
> However, the writing of results back to the row takes ~5ms, which is slower
> than I'd like.

5 ms for each of the 100 actions? With one commit per action?

> > It seems you have an select, and update, and another update.  Where in
> > this process do you commit?  Are you using fsync=off or
> > synchronous_commit=off?
>
> First commit occurs after the stored proc to select/update a batch of items
> is complete.

So one commit per 100 items?

> Second commit occurs on the writing of results back for each
> particular action.

So one commit per 1 item?
If so, this completely explains the difference in speed, I think.
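
A minimal sketch of why the commit boundary matters; the table, ids,
and status value here are illustrative, not your schema. With the
default synchronous_commit = on, every stand-alone COMMIT waits for
its own WAL flush, while a batch of updates shares a single flush:

  -- one commit per item: each COMMIT pays its own WAL flush
  BEGIN;
  UPDATE queue_proc SET status = 'done' WHERE id = 42;
  COMMIT;

  -- one commit per batch: 100 updates amortize a single flush
  BEGIN;
  UPDATE queue_proc SET status = 'done'
   WHERE id IN (42, 43, 44);  -- ... rest of the batch
  COMMIT;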

> Two commits are required because the time it takes to
> complete the intervening action can vary wildly: anywhere between 20ms and
> 45min.

Is there any way of knowing/approximating ahead of time how long it will take?

The 45 min monsters must be exceedingly rare, or else the average
could not be ~50ms.
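(Back of the envelope: 45 min is ~2,700,000 ms, so even one such job
per ~54,000 would contribute ~50 ms to the mean all by itself.)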

>> Why would the index scan take 1 ms two of the times it is done but 5ms
>> the third time?  Isn't it the same index scan each time?  Or does the
>> change in queue.status change the plan?
>
> The final update is a different query -- just a plain old update by ID:
>
> UPDATE queue_proc set status = 'proc' where id = %s ;

That looks very much like the other UPDATE you showed. The difference,
it seems, is that you commit after every one rather than after every
100. Right?

> This update by ID takes ~2.5ms, which means it's where the framework is
> spending most of its overhead.

You said the computation task can take anywhere from 20ms to 45min, so
it seems that this update overhead is at most 1/8 of the irreducible
time. That doesn't seem like it is enough to worry about, to me.
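
If that 2.5 ms ever did become worth shaving, the synchronous_commit
setting I asked about above is the cheap lever. A sketch, assuming you
can tolerate losing the last few status writes if the server crashes
(unlike fsync = off, this cannot corrupt the database):

  -- session-level: COMMIT returns without waiting for the WAL flush
  SET synchronous_commit = off;
  UPDATE queue_proc SET status = 'proc' WHERE id = %s ;
  COMMIT;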

Jeff
