From: | Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> |
---|---|
To: | Ivan Voras <ivoras(at)freebsd(dot)org> |
Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: ORDER BY, LIMIT and indexes |
Date: | 2013-08-06 22:56:31 |
Message-ID: | 52017F1F.7020700@catalyst.net.nz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 06/08/13 22:46, Ivan Voras wrote:
> Here are two more unexpected results. Same test table (1 mil. records,
> "id" is SERIAL PRIMARY KEY, PostgreSQL 9.1, VACUUM ANALYZE performed
> before the experiments):
>
> ivoras=# explain analyze select * from lt where id > 900000 limit 10;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
> Limit (cost=0.00..1.71 rows=10 width=9) (actual
> time=142.669..142.680 rows=10 loops=1)
> -> Seq Scan on lt (cost=0.00..17402.00 rows=101630 width=9)
> (actual time=142.665..142.672 rows=10 loops=1)
> Filter: (id > 900000)
> Total runtime: 142.735 ms
> (4 rows)
>
> Note the Seq Scan.
>
> ivoras=# explain analyze select * from lt where id > 900000;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on lt (cost=1683.97..7856.35 rows=101630 width=9)
> (actual time=38.462..85.780 rows=100000 loops=1)
> Recheck Cond: (id > 900000)
> -> Bitmap Index Scan on lt_pkey (cost=0.00..1658.56 rows=101630
> width=0) (actual time=38.310..38.310 rows=100000 loops=1)
> Index Cond: (id > 900000)
> Total runtime: 115.674 ms
> (5 rows)
>
> This somewhat explains the above case - we are simply fetching 100,000
> records here, and it's slow enough even with the index scan, so
> planner skips the index in the former case. BUT, if it did use the
> index, it would have been expectedly fast:
>
> ivoras=# set enable_seqscan to off;
> SET
> ivoras=# explain analyze select * from lt where id > 900000 limit 10;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.00..1.74 rows=10 width=9) (actual time=0.081..0.112
> rows=10 loops=1)
> -> Index Scan using lt_pkey on lt (cost=0.00..17644.17
> rows=101630 width=9) (actual time=0.078..0.100 rows=10 loops=1)
> Index Cond: (id > 900000)
> Total runtime: 0.175 ms
> (4 rows)
>
> It looks like the problem is in the difference between what the
> planner expects and what the Filter or Index operations deliver:
> (cost=0.00..17402.00 rows=101630 width=9) (actual
> time=142.665..142.672 rows=10 loops=1).
>
>
Hmm - I wonder if the lack or ORDER BY is part of the problem here.
Consider a similar query on pgbench_accounts:
bench=# explain analyze select aid from pgbench_accounts where aid >
100000 limit 20;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.91 rows=20 width=4) (actual time=0.005..0.464
rows=20 loops=1)
-> Seq Scan on pgbench_accounts (cost=0.00..499187.31 rows=10994846
width=4) (actual time=0.005..0.463 rows=20 loops=1)
Filter: (aid > 100000)
Total runtime: 0.474 ms
(4 rows)
bench=# explain analyze select aid from pgbench_accounts where aid >
10000000 limit 20;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..2.25 rows=20 width=4) (actual time=0.014..0.018
rows=20 loops=1)
-> Index Scan using pgbench_accounts_pkey on pgbench_accounts
(cost=0.00..207204.06 rows=1844004 width=4) (actual time=0.014..0.017
rows=20 loops=1)
Index Cond: (aid > 10000000)
Total runtime: 0.030 ms
(4 rows)
So at some point you get index scans. Now add an ORDER BY:
bench=# explain analyze select aid from pgbench_accounts where aid >
100000 order by aid limit 20;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
--
Limit (cost=0.00..2.25 rows=20 width=4) (actual time=0.008..0.012
rows=20 loops=1)
-> Index Scan using pgbench_accounts_pkey on pgbench_accounts
(cost=0.00..1235355.34 rows=10994846 width=4) (actual time=0.008..0.011
rows=20 loops=1
)
Index Cond: (aid > 100000)
Total runtime: 0.023 ms
(4 rows)
bench=# explain analyze select aid from pgbench_accounts where aid >
10000000 order by aid limit 20;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..2.25 rows=20 width=4) (actual time=0.014..0.018
rows=20 loops=1)
-> Index Scan using pgbench_accounts_pkey on pgbench_accounts
(cost=0.00..207204.06 rows=1844004 width=4) (actual time=0.014..0.016
rows=20 loops=1)
Index Cond: (aid > 10000000)
Total runtime: 0.029 ms
(4 rows)
...and we have index scans for both cases.
Cheers
Mark
From | Date | Subject | |
---|---|---|---|
Next Message | Claudio Freire | 2013-08-06 23:03:59 | Re: ORDER BY, LIMIT and indexes |
Previous Message | Scott Marlowe | 2013-08-06 22:12:18 | Re: subselect requires offset 0 for good performance. |