Quick Links

Re: Why query takes soo much time

From:	Denis de Bernardy <ddebernardy(at)yahoo(dot)com>
To:	"pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Why query takes soo much time
Date:	2011-05-16 11:16:44
Message-ID:	630076.87313.qm@web112408.mail.gq1.yahoo.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

[big nestloop with a huge number of rows]

You're in an edge case, and I doubt you'll get things to run much faster: you want the last 1k rows out of an 18M row result set... It will be slow no matter what you do.

What the plan is currently doing, is it's going through these 18M rows using a for each loop, until it returns the 1k requested rows. Without the offset, the plan is absolutely correct (and quite fast, I take it). With the enormous offset, it's a different story as you've noted.

An alternative plan could have been to hash join the tables together, to sort the result set, and to apply the limit/offset on the resulting set. You can probably force the planner to do so by rewriting your statement using a with statement, too:

EXPLAIN ANALYZE
WITH rows AS (
SELECT c.clause, s.subject ,s.object , s.verb, s.subject_type, s.object_type ,s.doc_id ,s.svo_id
FROM clause2 c INNER JOIN svo2 s ON (c.clause_id=s.clause_id AND c.source_id=s.doc_id AND c.sentence_id=s.sentence_id)
INNER JOIN page_content p ON (s.doc_id=p.crawled_page_id)
)
SELECT *
FROM rows
ORDER BY svo_id limit 1000 offset 17929000

I've my doubts that it'll make much of a different, though: you'll still be extracting the last 1k rows out of 18M.

In response to

Re: Why query takes soo much time at 2011-05-16 10:00:03 from Craig Ringer

Responses

Re: Why query takes soo much time at 2011-05-16 14:15:59 from Tom Lane

Browse pgsql-performance by date

	From	Date	Subject
Next Message	John Rouillard	2011-05-16 13:17:30	Re: Using pgiosim realistically
Previous Message	Craig Ringer	2011-05-16 10:00:03	Re: Why query takes soo much time