Re: selects with large offset really slow

From: John Smith <john_smith_45678(at)yahoo(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: selects with large offset really slow
Date: 2003-02-08 00:51:25
Message-ID: 20030208005125.60785.qmail@web40701.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


This is interesting:
explain analyze select * from links order by id limit 100 offset 90000;
Index Scan...Total runtime: 2951.00 msec
explain analyze select * from links order by id limit 100 offset 9000;
Index Scan...Total runtime: 232.00 msec

explain analyze select * from links order by id limit 100 offset 900;
Index Scan...Total runtime: 39.00 msec

explain analyze select * from links order by id limit 100 offset 90;
Index Scan...Total runtime: 6.00 msec

I don't get it - why's it take longer the larger the offset is? Disk seeks to find index entries?

John

Dennis Gearon <gearond(at)cvc(dot)net> wrote:Seems like it should do intelligent retrieval of records for that stage of processing.

2/7/2003 12:54:21 PM, John Smith wrote:

>
>
> Date: Fri, 7 Feb 2003 12:54:21 -0800 (PST)
>
> From: John Smith
> Subject:Re: [GENERAL] selects with large offset really slow
> To: pgsql-general(at)postgresql(dot)org
>
>
>
>
> Thanks, I'll try those suggestions. But...
>
>
> Why can't PG just use an index? Say, look at the index for 'url', go to
> entry 90000, then get the next 100 entries? I was suprised that it retrieves
> *all* records then sorts them (when there's already a sorted index). I'm
> trying to switch from mysql - the same exact query with it is very fast with
> 100-500K+ rows, and a large offset doesn't seem to affect the query's speed.
>
> John
>
>
> Richard Huxton wrote:
>
> On Friday 07 Feb 2003 5:01 am, John Smith wrote:
> > There are 90K-100K records in each of two tables. This simple join is
> > really slow and the larger the offset, the longer it takes. Anything I
> can
> > do to speed it up (a lot)? I've double-checked and there are indexes on
> > everything used for joins and ordering.
>
> > QUERY PLAN
> > -------------------------------------------------------------------------
> --
> >-----------------------
> Limit (cost=19546.62..19546.87 rows=100 width=62)
> > (actual time=20557.00..20558.00 rows=100 loops=1)
> > -> Sort (cost=19321.62..19571.32 rows=99881 width=62) (actual
> > time=19775.00..20410.00 rows=90101 loops=1)
> > Sort Key: l.url
> > -> Hash Join (cost=2471.00..7662.54 rows=99881 width=62) (actual
> > time=3013.00..12002.00 rows=100000 loops=1) Hash Cond: ("outer".id =
> > "in ner".link_id)
>
> It's the join and sort that's getting you. PG has to fetch and sort all the
>
> rows so it can discard 90,000 of them. I can't think of a good way for it
> to
> optimise this, though you might want to check your sort_mem is set high
> enough.
>
> > explain analyze select l.id, l.url
> > from links l
> > inner join stats s
> > on l.id = s.link_id
> > and s.referrer_id = 1
> > order by l.url
> > limit 100
> > offset 90000;
>
> There are three options you might want to look at:
>
> 1. Use a temporary table, then select from that for each page.
> 2. Use a cursor, and just fetch 100 records at a time from it.
> 3. Cheat and fetch where l.url>=X, remembering X as the highest url from
> the
> last set of results. This of course means pages of results will overlap.
>
> --
> Richard Huxton
>
>
>
>
>
> Do you Yahoo!?
> Yahoo! Mail Plus - Powerful. Affordable. Sign up now

---------------------------------
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now

Browse pgsql-general by date

  From Date Subject
Next Message John Smith 2003-02-08 01:07:43 Re: selects with large offset really slow
Previous Message Stephan Szabo 2003-02-08 00:20:53 Re: selects with large offset really slow