Re: Cost of initiating cursors

From: Matthew Churcher <matthew(dot)churcher(at)realvnc(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Cost of initiating cursors
Date: 2013-04-11 17:05:32
Message-ID: 1645159141.63958.1365699932278.JavaMail.root@realvnc.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks Tom,

That's very enlightening and I really appreciate you taking time to respond.

I've tried cursor_tuple_fraction values as low as 0.0000000001 which by my reckoning should be low enough and also 0 but the planner is still selecting seqscan-and-sort.

Kind regards,
Matthew Churcher

----- Original Message -----
From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Matthew Churcher" <matthew(dot)churcher(at)realvnc(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Sent: Thursday, 11 April, 2013 4:12:25 PM
Subject: Re: [GENERAL] Cost of initiating cursors

Matthew Churcher <matthew(dot)churcher(at)realvnc(dot)com> writes:
> For example:
> FOR curr_foo
> IN
> SELECT foo FROM bar
> WHERE wibble
> ORDER BY wobble
> LOOP
> EXIT; -- always break out of loop
> END LOOP;

> For some reason this is hugely expensive and slow regardless of the selected execution plan and available indexes. The WHERE and particularly the ORDER BY clause appear to be highly significant despite having appropriate indexes in place.

For a full-table query, the planner will frequently decide that a
seqscan-and-sort is cheaper than an indexscan that happens to produce
the right order (but involves a lot of random page access). That's
problematic for this case because the whole scan and sort has to happen
before the first result row can be delivered. Grabbing the remaining
rows would be quite cheap, but since you don't want them, the work
is wasted.

> It's the combination of the following behaviours I find particular perplexing:-
> 1.) Removing the WHERE and ORDER BY clauses results in a very fast query

No surprise. The query devolves to a seqscan, and you stop after
getting the first row.

> 2.) Adding a LIMIT clause also results in a very fast query.

That would encourage the planner to use a fast-start plan, ie an
indexscan in the right order, which wins here since you stop after
fetching the first row.

The basic problem with the above FOR loop is that you're not giving the
planner any visibility into the fact that you just want the first (few)
rows, so it's going for a plan that will be more efficient for actually
performing the whole query it's been given. Adding a LIMIT is a good
thing because it fixes that lack of communication.

> The configuration parameter 'cursor_tuple_fraction' is having no observable effect.

You did not say what values you'd tried, but a sufficiently small value
ought to force a similar result. Whether such a value would be a wise
thing to set globally is a different question. On the whole I think
adding a LIMIT isn't a bad idea here.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2013-04-11 17:40:55 Re: Cost of initiating cursors
Previous Message Steve Atkins 2013-04-11 16:18:55 Re: crash proof for semi-embedded system