From: | Seamus Abshere <seamus(at)abshere(dot)net> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Why does query planner choose slower BitmapAnd ? |
Date: | 2016-02-22 15:18:41 |
Message-ID: | 1456154321.976561.528310154.6A623C0E@webmail.messagingengine.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
hi,
I don't understand why the query planner is choosing a BitmapAnd when an
Index Scan followed by a filter is obviously better.
(Note that "new_york_houses" is a view of table "houses" with one
condition on city - and there is an index idx_houses_city. That is the
Index Scan that I think it should use exclusively.)
Here's a fast query that uses the Index Scan followed by a filter:
> => explain analyze SELECT COUNT(DISTINCT id) FROM new_york_houses WHERE roof_area >= 0 AND roof_area < 278.7091;
> QUERY PLAN
> ---
> Aggregate (cost=167298.10..167298.11 rows=1 width=16) (actual time=141.137..141.137 rows=1 loops=1)
> -> Index Scan using idx_houses_city on households (cost=0.57..167178.87 rows=47694 width=16) (actual time=0.045..105.953 rows=53971 loops=1)
> Index Cond: (city = 'New York'::text)
> Filter: ((roof_area >= 0) AND ((roof_area)::numeric < 278.7091))
> Rows Removed by Filter: 101719
> Planning time: 0.688 ms
> Execution time: 141.250 ms
> (7 rows)
When I add another condition, "phoneable", however, it chooses an
obviously wrong plan:
> => explain analyze SELECT COUNT(DISTINCT id) FROM new_york_houses WHERE roof_area >= 0 AND roof_area < 278.7091 AND phoneable = true;
> QUERY PLAN
> ---
> Aggregate (cost=128163.05..128163.06 rows=1 width=16) (actual time=4564.677..4564.677 rows=1 loops=1)
> -> Bitmap Heap Scan on households (cost=105894.80..128147.78 rows=6106 width=16) (actual time=4456.690..4561.416 rows=5183 loops=1)
> Recheck Cond: (city = 'New York'::text)
> Filter: (phoneable AND (roof_area >= 0) AND ((roof_area)::numeric < 278.7091))
> Rows Removed by Filter: 40103
> Heap Blocks: exact=14563
> -> BitmapAnd (cost=105894.80..105894.80 rows=21002 width=0) (actual time=4453.510..4453.510 rows=0 loops=1)
> -> Bitmap Index Scan on idx_houses_city (cost=0.00..1666.90 rows=164044 width=0) (actual time=16.505..16.505 rows=155690 loops=1)
> Index Cond: (city = 'New York'::text)
> -> Bitmap Index Scan on idx_houses_phoneable (cost=0.00..104224.60 rows=10271471 width=0) (actual time=4384.461..4384.461 rows=10647041 loops=1)
> Index Cond: (phoneable = true)
> Planning time: 0.709 ms
> Execution time: 4565.067 ms
> (13 rows)
On Postgres 9.4.4 with 244gb memory and SSDs
maintenance_work_mem 1000000
work_mem 500000
random_page_cost 1
seq_page_cost 2
The "houses" table has been analyzed recently and has statistics set to
the max.
Thanks,
Seamus
--
Seamus Abshere, SCEA
https://github.com/seamusabshere
https://www.linkedin.com/in/seamusabshere
From | Date | Subject | |
---|---|---|---|
Next Message | Victor Yegorov | 2016-02-22 15:45:04 | bpchar, text and indexes |
Previous Message | Stephen Frost | 2016-02-22 14:58:08 | Re: Why is my database so big? |