From: | "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com> |
---|---|
To: | |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Optimizer: limit not taken into account |
Date: | 2006-05-17 15:54:52 |
Message-ID: | 446B474C.2050803@modgraph-usa.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Here's a "corner case" that might interest someone. It tripped up one of our programmers.
We have a table with > 10 million rows. The ID column is indexed, the table has been vacuum/analyzed. Compare these two queries:
select * from tbl where id >= 10000000 limit 1;
select * from tbl where id >= 10000000 order by id limit 1;
The first takes 4 seconds, and uses a full table scan. The second takes 32 msec and uses the index.
Details are below.
I understand why the planner makes the choices it does -- the "id > 10000000" isn't very selective and under normal circumstances a full table scan is probably the right choice. But the "limit 1" apparently doesn't alter the planner's strategy at all. We were surprised by this.
Adding the "order by" was a simple solution.
Craig
pg=> explain analyze select url, url_digest from url_queue where priority >= 10000000 limit 1;
QUERY PLAN
------------------------------------------------------------------------------------------
Limit (cost=0.00..0.65 rows=1 width=108) (actual time=4036.113..4036.117 rows=1 loops=1)
-> Seq Scan on url_queue (cost=0.00..391254.35 rows=606176 width=108) (actual time=4036.101..4036.101 rows=1 loops=1)
Filter: (priority >= 10000000)
Total runtime: 4036.200 ms
(4 rows)
pg=> explain analyze select url, url_digest from url_queue where priority >= 10000000 order by priority limit 1;
QUERY PLAN
--------------------------------------------------------------------------------------
Limit (cost=0.00..2.38 rows=1 width=112) (actual time=32.445..32.448 rows=1 loops=1)
-> Index Scan using url_queue_priority on url_queue (cost=0.00..1440200.41 rows=606176 width=112) (actual time=32.434..32.434 rows=1 loops=1)
Index Cond: (priority >= 10000000)
Total runtime: 32.566 ms
From | Date | Subject | |
---|---|---|---|
Next Message | john_oshea | 2006-05-17 16:18:30 | Re: SQL CPU time usage |
Previous Message | Jim C. Nasby | 2006-05-17 15:51:40 | Re: IMMUTABLE? |