Cursor-based results: bafflingly slow

From: Robin Houston <robin(dot)houston(at)gmail(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Cursor-based results: bafflingly slow
Date: 2009-07-03 13:40:52
Message-ID: 1b795e7b0907030640m347931c1v64e6039b9b8590b0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

2009/7/3 Oliver Jowett <oliver(at)opencloud(dot)com>:
> They're not really equivalent. A portal with a limited fetchsize is more
> like DECLARE CURSOR, not LIMIT.
>
> What does the plan look like if you plan it without the LIMIT?

Hmm, well... If I do the same as before, but without the limit, then the
plan is:

Sort (cost=353822.56..353835.72 rows=5264 width=612)
Sort Key: polypeptide.feature_id
InitPlan
-> Seq Scan on organism (cost=0.00..1.95 rows=1 width=4)
Filter: ((common_name)::text = $1)
-> Nested Loop (cost=88664.31..352053.24 rows=5264 width=612)
-> Hash Join (cost=88663.00..351946.66 rows=188 width=96)
Hash Cond: (featureprop.feature_id = polypeptide.feature_id)
-> Hash Join (cost=2790.36..264059.63 rows=51268 width=96)
Hash Cond: (featureprop.type_id =
featureprop_type.cvterm_id)
-> Seq Scan on featureprop (cost=0.00..222284.97
rows=10259097 width=56)
-> Hash (cost=2785.51..2785.51 rows=388 width=52)
-> Seq Scan on cvterm featureprop_type
(cost=0.00..2785.51 rows=388 width=52)
Filter: (cv_id = cv_id)
-> Hash (cost=85344.24..85344.24 rows=32192 width=4)
-> Bitmap Heap Scan on feature polypeptide
(cost=827.57..85344.24 rows=32192 width=4)
Recheck Cond: ((organism_id = $0) AND (type_id =
191))
-> Bitmap Index Scan on feature_genedb_idx1
(cost=0.00..819.52 rows=32192 width=0)
Index Cond: ((organism_id = $0) AND (type_id
= 191))
-> Materialize (cost=1.31..1.59 rows=28 width=516)
-> Seq Scan on cv featureprop_type_cv (cost=0.00..1.28
rows=28 width=516)

Surely the server is not so stupid as to use an unnecessary server-side
sort, if I've explicitly asked to fetch the rows 100 at a time?

Is there any easy way to find out what execution plan the server is actually
using? I suppose I could ask the DBAs to enable auto_explain on the dev
database.

If this *is* the problem, do you know any way round it. Obviously I want the
plan that *doesn't* involve sorting everything before returning anything.

If I explicitly create a cursor, with no limit, then it does return the
first 100 rows very quickly. On the other hand, I have to hard-code the
parameter this way (because I don't know any way, in psql, to use a bind
variable with an explicit cursor.)

begin;
declare polypeptide_props_c no scroll cursor for [...query...]
fetch forward 100 from polypeptide_props_c;
rollback;

This fetch completes in 31.313 ms.

Thanks for your help so far!

Robin

In response to

Browse pgsql-jdbc by date

  From Date Subject
Next Message Oliver Jowett 2009-07-03 14:12:59 Re: Cursor-based results: bafflingly slow
Previous Message Oliver Jowett 2009-07-03 13:17:05 Re: Cursor-based results: bafflingly slow