Re: maintaining a reference to a fetched row

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Brian Karlak <zenkat(at)metaweb(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: maintaining a reference to a fetched row
Date: 2009-11-05 04:27:27
Message-ID: f67928030911042027h4b613fe5nd2c12809f19d615c@mail.gmail.com
Lists: pgsql-performance

On Wed, Nov 4, 2009 at 9:41 AM, Brian Karlak <zenkat(at)metaweb(dot)com> wrote:
>
> I do a select for update in a stored proc:
>
> FOR queue_item IN
>
>   SELECT *  FROM queue
>    WHERE status IS NULL AND id >= low_bound_id
>    ORDER BY id LIMIT batch_size
>      FOR UPDATE
>
> LOOP
>
>   UPDATE queue_proc set status = 'proc' where id = queue_item.id ;
>
> The daemons keep track of their last position in the queue with
> low_bound_id.  Also, as you probably notice, I also fetch a batch of (100)
> items at a time.  In practice, it's pretty fast.  The job I'm running now is
> showing an average fetch time of 30ms per 100 actions, which ain't bad.
>
> However, the writing of results back to the row takes ~5ms, which is slower
> than I'd like.

5 ms for each of the 100 actions? With one commit per action?

> > It seems you have an select, and update, and another update.  Where in
> > this process do you commit?  Are you using fsync=off or
> > synchronous_commit=off?
>
> First commit occurs after the stored proc to select/update a batch of items
> is complete.

So one commit per 100 items?

> Second commit occurs on the writing of results back for each
> particular action.

So one commit per 1 item?
If so, this completely explains the difference in speed, I think.
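
A minimal sketch of why the commit boundary matters; the table, ids,
and status value here are illustrative, not your schema. With the
default synchronous_commit = on, every stand-alone COMMIT waits for
its own WAL flush, while a batch of updates shares a single flush:

  -- one commit per item: each COMMIT pays its own WAL flush
  BEGIN;
  UPDATE queue_proc SET status = 'done' WHERE id = 42;
  COMMIT;

  -- one commit per batch: 100 updates amortize a single flush
  BEGIN;
  UPDATE queue_proc SET status = 'done'
   WHERE id IN (42, 43, 44);  -- ... rest of the batch
  COMMIT;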

> Two commits are required because the time it takes to
> complete the intervening action can vary wildly: anywhere between 20ms and
> 45min.

Is there any way of knowing/approximating ahead of time how long it will take?

The 45 min monsters must be exceedingly rare, or else the average
could not be ~50ms.
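(Back of the envelope: 45 min is ~2,700,000 ms, so even one such job
per ~54,000 would contribute ~50 ms to the mean all by itself.)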

>> Why would the index scan take 1 ms two of the times it is done but 5ms
>> the third time?  Isn't it the same index scan each time?  Or does the
>> change in queue.status change the plan?
>
> The final update is a different query -- just a plain old update by ID:
>
> UPDATE queue_proc set status = 'proc' where id = %s ;

That looks very much like the other UPDATE you showed. The difference,
it seems, is that you commit after every one rather than after every
100. Right?

> This update by ID takes ~2.5ms, which means it's where the framework is
> spending most of its overhead.

You said the computation task can take anywhere from 20ms to 45min, so
it seems that this update overhead is at most 1/8 of the irreducible
time. That doesn't seem like it is enough to worry about, to me.
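
If that 2.5 ms ever did become worth shaving, the synchronous_commit
setting I asked about above is the cheap lever. A sketch, assuming you
can tolerate losing the last few status writes if the server crashes
(unlike fsync = off, this cannot corrupt the database):

  -- session-level: COMMIT returns without waiting for the WAL flush
  SET synchronous_commit = off;
  UPDATE queue_proc SET status = 'proc' WHERE id = %s ;
  COMMIT;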

Jeff
