From: | Gideon Dresdner <gideond(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | can't coax query planner into using all columns of a gist index |
Date: | 2015-08-11 21:38:26 |
Message-ID: | CAPGvaSPhimbRtYTifEjHx4KoQaAca620GYjWfrbaYNDnfvoO5w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Greetings,
I had a discussion on IRC today with RhodiumToad regarding optimizing a
specific query. We didn't manage to figure out how to get postgres to hit a
GIST index.
The bigger picture is that I am trying to do some bioinformatics and
thought that postgres would be a great way of getting the 1000 genomes
project "in the palm of my hand" instead of submitting C++ programs to a
cluster, waiting for the results, etc.
Anyway, there is a multicolumn index that we created:
CREATE INDEX qcregions_chrregion_index ON qcregions USING gist(chr, region)
WHERE type = 'include'
- chr is an int between 1 and 24 (inclusive)
- region is an int4range
- type is an enum with two values, 'include' and 'exclude'.
Here are three relevant explain outputs:
Example 1:
EXPLAIN SELECT * FROM (values (12,5000), (13,5001) ) v(c,r)
WHERE EXISTS (SELECT region FROM qcregions
WHERE type = 'include' and region @> v.r and chr = v.c);
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=0.41..8.82 rows=1 width=8)
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=8)
-> Index Scan using qcregions_chrregion_index on qcregions
(cost=0.41..3464.26 rows=874 width=17)
Index Cond: ((chr = "*VALUES*".column1) AND (region @>
"*VALUES*".column2))
(4 rows)
Time: 1.284 ms
Example 2:
-- set enable_setbitmapscan = true
EXPLAIN SELECT * FROM (select * from vcf limit 2) AS vcf
WHERE EXISTS (SELECT region FROM qcregions
WHERE qcregions.chr = vcf.chr
AND qcregions.type = 'include'
AND vcf.pos <@ qcregions.region);
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=4862.57..18775.78 rows=1 width=64)
-> Limit (cost=0.00..0.04 rows=2 width=64)
-> Seq Scan on vcf (cost=0.00..1894654.40 rows=84801840 width=64)
-> Bitmap Heap Scan on qcregions (cost=4862.57..7873.60 rows=874
width=17)
Recheck Cond: ((vcf.pos <@ region) AND (type =
'include'::qcregiontype) AND (chr = vcf.chr))
-> BitmapAnd (cost=4862.57..4862.57 rows=874 width=0)
-> Bitmap Index Scan on qcregions_chrregion_index
(cost=0.00..977.76 rows=20980 width=0)
Index Cond: (vcf.pos <@ region)
-> Bitmap Index Scan on qcregions_chr_index
(cost=0.00..3884.12 rows=215158 width=0)
Index Cond: (chr = vcf.chr)
(10 rows)
Time: 0.708 ms
Example 3 (same as example 2 but with enable_bitmapscan = false).
-- set enable_bitmapscan = false
EXPLAIN SELECT * FROM (select * from vcf limit 2) AS vcf
WHERE EXISTS (SELECT region FROM qcregions
WHERE qcregions.chr = vcf.chr
AND qcregions.type = 'include'
AND vcf.pos <@ qcregions.region);
QUERY PLAN
--------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=0.43..38691.26 rows=1 width=64)
-> Limit (cost=0.00..0.04 rows=2 width=64)
-> Seq Scan on vcf (cost=0.00..1894654.40 rows=84801840 width=64)
-> Index Scan using qcregions_chr_index on qcregions
(cost=0.43..12891.38 rows=874 width=17)
Index Cond: (chr = vcf.chr)
Filter: ((type = 'include'::qcregiontype) AND (vcf.pos <@ region))
(6 rows)
Time: 1.214 ms
I am running psql (PostgreSQL) 9.4.4 on a laptop with 4 cores and 16 GB RAM.
Looking forward to hearing your thoughts,
Gideon.
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2015-08-11 22:07:15 | GIN pending clean up is not interruptable |
Previous Message | Robert Haas | 2015-08-11 21:18:18 | Re: WIP: SCRAM authentication |