From: | Victor Yegorov <vyegorov(at)gmail(dot)com> |
---|---|
To: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Estimates on partial index |
Date: | 2016-08-18 13:52:11 |
Message-ID: | CAGnEbohwiUPY1-x6=u4bJuAvnhr1oNYP_oTpWOPVcdCsyisjyw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Greetings.
I have a question on why planner chooses `IndexScan` for the following
query:
SELECT la.loan_id, la.due_date, la.is_current
FROM loan_agreements la WHERE la.is_current AND '2016-08-11' >
la.due_date;
Relevant (cannot post it all, sorry) table definition is:
Column Type Modifiers
------------------------------ --------------------------- ---------
id bigint not null
...
is_current boolean not null
due_date date not null
loan_id bigint
Indexes:
"loan_agreements_pkey" PRIMARY KEY, btree (id)
...
"idx_loan_agreements_due_date" btree (due_date)
"idx_loan_agreemnets_loan_id_cond_is_current_true" btree (loan_id)
WHERE is_current = true
Some stats:
SELECT relname,reltuples::numeric,relpages FROM pg_class WHERE oid IN
('loan_agreements'::regclass,
'idx_loan_agreemnets_loan_id_cond_is_current_true'::regclass,
'idx_loan_agreements_due_date'::regclass);
relname reltuples relpages
------------------------------------------------ --------- --------
idx_loan_agreements_due_date 664707 1828
idx_loan_agreemnets_loan_id_cond_is_current_true 237910 655
loan_agreements 664707 18117
Settings:
SELECT name,setting,unit FROM pg_settings WHERE name ~
'(buffers|mem|cost)$';
name setting unit
-------------------- -------- ----
autovacuum_work_mem 524288 kB
cpu_index_tuple_cost 0.005 ¤
cpu_operator_cost 0.0025 ¤
cpu_tuple_cost 0.01 ¤
maintenance_work_mem 16777216 kB
random_page_cost 2.5 ¤
seq_page_cost 1 ¤
shared_buffers 1572864 8kB
temp_buffers 8192 8kB
wal_buffers 2048 8kB
work_mem 65536 kB
PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
Planner chooses the following plan:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on
loan_agreements la (cost=0.42..16986.53 rows=226145 width=13) (actual
time=0.054..462.394 rows=216530 loops=1)
Filter: ('2016-08-11'::date > due_date)
Rows Removed by Filter: 21304
Buffers: shared hit=208343 read=18399
Planning time: 0.168 ms
Execution time: 479.773 ms
If I disable IndexScans, plan changes likes this:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on loan_agreements la (cost=2884.01..23974.88
rows=226145 width=13) (actual time=38.893..200.376 rows=216530 loops=1)
Recheck Cond: is_current
Filter: ('2016-08-11'::date > due_date)
Rows Removed by Filter: 21304
Heap Blocks: exact=18117
Buffers: shared hit=18212 read=557
-> Bitmap Index Scan on
idx_loan_agreemnets_loan_id_cond_is_current_true (cost=0.00..2827.47
rows=237910 width=0) (actual time=35.166..35.166 rows=237853 loops=1)
Buffers: shared hit=119 read=533
Planning time: 0.171 ms
Execution time: 214.341 ms
Question is — why IndexScan over partial index is estimated less than
BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan
over 1/3 of table is not a good thing — IndexScan is touching 10x more
pages and in a typical situation those are cold.
Thanks in advance.
--
Victor Y. Yegorov
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2016-08-18 13:56:10 | Re: Estimates on partial index |
Previous Message | pinker | 2016-08-17 11:45:03 | Big data INSERT optimization - ExclusiveLock on extension of the table |