From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: There's random access and then there's random access
Date: 2007-12-11 14:32:26
Message-ID: 8763z5xpxh.fsf@oxford.xeocode.com
Lists: pgsql-hackers
"Gregory Stark" <stark(at)enterprisedb(dot)com> writes:
> I think this will be easiest to do for bitmap index scans. Since we gather up
> all the pages we'll need before starting the heap scan we can easily skim
> through them, issue posix_fadvises for at least a certain number ahead of the
> actual read point and then proceed with the rest of the scan unchanged.
I've written up a simple test implementation of prefetching using
posix_fadvise(). Here are some nice results on a query accessing 1,000 records
from a 10G table with 300 million records:
postgres=# set preread_pages=0; explain analyze select (select count(*) from h where h = any (x)) from (select random_array(1000,1,300000000) as x)x;
SET
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan x  (cost=0.00..115.69 rows=1 width=32) (actual time=6069.505..6069.509 rows=1 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.058..0.061 rows=1 loops=1)
         SubPlan
           ->  Aggregate  (cost=115.66..115.67 rows=1 width=0) (actual time=6069.425..6069.426 rows=1 loops=1)
                 ->  Bitmap Heap Scan on h  (cost=75.49..115.63 rows=10 width=0) (actual time=3543.107..6068.335 rows=1000 loops=1)
                       Recheck Cond: (h = ANY ($0))
                       ->  Bitmap Index Scan on hi  (cost=0.00..75.49 rows=10 width=0) (actual time=3542.220..3542.220 rows=1000 loops=1)
                             Index Cond: (h = ANY ($0))
 Total runtime: 6069.632 ms
(9 rows)
postgres=# set preread_pages=300; explain analyze select (select count(*) from h where h = any (x)) from (select random_array(1000,1,300000000) as x)x;
SET
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan x  (cost=0.00..115.69 rows=1 width=32) (actual time=3945.602..3945.607 rows=1 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.060..0.064 rows=1 loops=1)
         SubPlan
           ->  Aggregate  (cost=115.66..115.67 rows=1 width=0) (actual time=3945.520..3945.521 rows=1 loops=1)
                 ->  Bitmap Heap Scan on h  (cost=75.49..115.63 rows=10 width=0) (actual time=3505.546..3944.817 rows=1000 loops=1)
                       Recheck Cond: (h = ANY ($0))
                       ->  Bitmap Index Scan on hi  (cost=0.00..75.49 rows=10 width=0) (actual time=3452.759..3452.759 rows=1000 loops=1)
                             Index Cond: (h = ANY ($0))
 Total runtime: 3945.730 ms
(9 rows)
Note that while the query as a whole is only about 50% faster, the bitmap heap
scan itself is roughly 5.75x as fast: subtracting the index scan time, the heap
scan portion drops from about 2,525 ms to about 439 ms.
It would be nice to optimize the bitmap index scan as well but that will be a
bit trickier and it probably won't be able to cover as many pages. As a result
it probably won't be a 5x speedup like the heap scan.
Also, this is with a fairly aggressive readahead which only makes sense for
queries that look a lot like this and will read all the tuples. For a more
general solution I think it would make sense to water down the performance a
bit in exchange for some protection against doing unnecessary I/O in cases
where the query isn't actually going to read all the tuples.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!