From: John Smith <john_smith_45678(at)yahoo(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: selects with large offset really slow
Date: 2003-02-07 20:54:21
Message-ID: 20030207205421.20693.qmail@web40704.mail.yahoo.com
Lists: pgsql-general
Thanks, I'll try those suggestions. But...
Why can't PG just use an index? Say, look at the index for 'url', go to entry 90000, then get the next 100 entries? I was surprised that it retrieves *all* the records and then sorts them, even though there is already a sorted index. I'm trying to switch from MySQL - the exact same query there is very fast with 100-500K+ rows, and a large offset doesn't seem to affect its speed.
John
Richard Huxton <dev(at)archonet(dot)com> wrote:
On Friday 07 Feb 2003 5:01 am, John Smith wrote:
> There are 90K-100K records in each of two tables. This simple join is
> really slow and the larger the offset, the longer it takes. Anything I can
> do to speed it up (a lot)? I've double-checked and there are indexes on
> everything used for joins and ordering.
> QUERY PLAN
> --------------------------------------------------------------------------------------------
>  Limit  (cost=19546.62..19546.87 rows=100 width=62) (actual time=20557.00..20558.00 rows=100 loops=1)
>    ->  Sort  (cost=19321.62..19571.32 rows=99881 width=62) (actual time=19775.00..20410.00 rows=90101 loops=1)
>          Sort Key: l.url
>          ->  Hash Join  (cost=2471.00..7662.54 rows=99881 width=62) (actual time=3013.00..12002.00 rows=100000 loops=1)
>                Hash Cond: ("outer".id = "inner".link_id)
It's the join and sort that's getting you. PG has to fetch and sort all the
rows so it can discard 90,000 of them. I can't think of a good way for it to
optimise this, though you might want to check your sort_mem is set high
enough.
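For example, something along these lines bumps it for just the current session (sort_mem on the 7.x series is measured in kilobytes; 8192 is only an illustrative value, not a recommendation):

    -- raise per-sort memory for this session, then re-run the query
    SET sort_mem = 8192;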
> explain analyze select l.id, l.url
> from links l
> inner join stats s
> on l.id = s.link_id
> and s.referrer_id = 1
> order by l.url
> limit 100
> offset 90000;
There are three options you might want to look at (rough sketches follow below):
1. Use a temporary table, then select from that for each page.
2. Use a cursor, and just fetch 100 records at a time from it.
3. Cheat and fetch where l.url>=X, remembering X as the highest url from the
last set of results. This of course means pages of results will overlap.
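Sketches of the three approaches against the query above (page_tmp and page_cur are just placeholder names, and the syntax is only meant as a rough guide for the 7.x series):

    -- 1. Temporary table: do the join once, then page from the much smaller result
    CREATE TEMP TABLE page_tmp AS
        SELECT l.id, l.url
        FROM links l
        JOIN stats s ON l.id = s.link_id AND s.referrer_id = 1;
    SELECT id, url FROM page_tmp ORDER BY url LIMIT 100 OFFSET 90000;

    -- 2. Cursor: keep one sorted result open and fetch 100 rows per page
    --    (the cursor only lives as long as the enclosing transaction)
    BEGIN;
    DECLARE page_cur CURSOR FOR
        SELECT l.id, l.url
        FROM links l
        JOIN stats s ON l.id = s.link_id AND s.referrer_id = 1
        ORDER BY l.url;
    MOVE 90000 IN page_cur;     -- skip ahead without returning rows
    FETCH 100 FROM page_cur;    -- the page itself
    COMMIT;

    -- 3. "Cheat": remember X, the highest url from the previous page,
    --    and start the next page from there (rows at the boundary can repeat)
    SELECT l.id, l.url
    FROM links l
    JOIN stats s ON l.id = s.link_id AND s.referrer_id = 1
    WHERE l.url >= 'X'
    ORDER BY l.url
    LIMIT 100;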
--
Richard Huxton