From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Odd Sort/Limit/Max Problem
Date: 2002-12-13 19:55:51
Message-ID: 200212131155.51985.josh@agliodbs.com
Lists: pgsql-performance
Folks,
Consider this performance quandary brought to me by Elein, which I can
replicate in 7.2.3 and in 7.4 devel:
case_clients is a medium-large table with about 110,000 rows. The field
date_resolved is a timestamp field which is indexed and allows nulls (in
fact, is null for 40% of entries).
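For reference, the relevant bit of the schema looks roughly like this (the table
and index names are the real ones from the plans below; everything else is
trimmed):
CREATE TABLE case_clients (
    -- ... other case/client columns omitted ...
    date_resolved timestamp   -- indexed, NULL for roughly 40% of ~110,000 rows
);
CREATE INDEX idx_caseclients_resolved ON case_clients (date_resolved);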
First, as expected, a regular aggregate is slow:
jwnet=> explain analyze select max(date_resolved) from case_clients;
NOTICE: QUERY PLAN:
Aggregate (cost=3076.10..3076.10 rows=1 width=4) (actual time=484.24..484.24
rows=1 loops=1)
-> Seq Scan on case_clients (cost=0.00..2804.48 rows=108648 width=4)
(actual time=0.08..379.81 rows=108648 loops=1)
Total runtime: 484.44 msec
So we use the standard PostgreSQL workaround:
jwnet=> explain analyze select date_resolved from case_clients order by
date_resolved desc limit 1;
NOTICE: QUERY PLAN:
Limit (cost=0.00..1.50 rows=1 width=4) (actual time=0.22..0.23 rows=1
loops=1)
-> Index Scan Backward using idx_caseclients_resolved on case_clients
(cost=0.00..163420.59 rows=108648 width=4) (actual time=0.21..0.22 rows=2
loops=1)
Total runtime: 0.33 msec
... which is fast, but returns NULL, since NULLs sort to the end of the index
and the backward scan hits them first! So we add IS NOT NULL:
jwnet=> explain analyze select date_resolved from case_clients where
date_resolved is not null order by date_resolved desc limit 1;
NOTICE: QUERY PLAN:
Limit (cost=0.00..4.06 rows=1 width=4) (actual time=219.63..219.64 rows=1
loops=1)
-> Index Scan Backward using idx_caseclients_resolved on case_clients
(cost=0.00..163420.59 rows=40272 width=4) (actual time=219.62..219.62 rows=2
loops=1)
Total runtime: 219.76 msec
Aieee! Almost as slow as the aggregate!
Now, none of those times is huge on this test database, but on a larger
database (> 1 million rows) the performance problem is much worse. For some
reason, the backward index scan seems to have to traverse all of the NULLs
before selecting a value. I find this peculiar, as I was under the
impression that NULLs were not indexed.
What's going on here?
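If anyone wants to poke at this without the real data, here's a rough sketch
that mimics the distribution (generate_series isn't in the 7.x versions above,
and the table and numbers are made up, so treat it as illustrative only):
-- scratch table with the same shape: indexed timestamp, ~40% NULL
CREATE TABLE null_test (val timestamp);
CREATE INDEX idx_null_test_val ON null_test (val);
-- roughly 60% of rows get a timestamp, the rest stay NULL
INSERT INTO null_test (val)
SELECT CASE WHEN random() < 0.6
            THEN localtimestamp - random() * 365 * interval '1 day'
            ELSE NULL
       END
FROM generate_series(1, 100000);
ANALYZE null_test;
-- fast: the backward index scan can stop at the first entry it finds
EXPLAIN ANALYZE SELECT val FROM null_test ORDER BY val DESC LIMIT 1;
-- the query that was slow on 7.2.3 / 7.4 devel above: the backward scan
-- appears to walk past all of the NULL entries before returning a value
EXPLAIN ANALYZE SELECT val FROM null_test
WHERE val IS NOT NULL ORDER BY val DESC LIMIT 1;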
--
-Josh Berkus
Aglio Database Solutions
San Francisco
| | From | Date | Subject |
|---|---|---|---|
| Next Message | Stephan Szabo | 2002-12-13 20:10:20 | Re: Odd Sort/Limit/Max Problem |
| Previous Message | Josh Berkus | 2002-12-13 18:18:40 | Re: Capping CPU usage? |