Question about (probably wrong) index scan cost for conditional indexes

From: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Question about (probably wrong) index scan cost for conditional indexes
Date: 2012-01-22 23:58:35
Message-ID: CAK-MWwToe+HeJu+494As5a+bbBaHr7dsye_5+N-YwThh5DpP2Q@mail.gmail.com
Lists: pgsql-general

I'm not sure whether this is a bug or just the planner working as designed.
This is PostgreSQL 9.1.2 on Linux.

But it seems that the index scan cost for very narrow/selective conditional
(partial) indexes is greatly overestimated, at least in some cases.

In my case I have a special conditional index:

"news_dtime_in208section_active_key2" btree (dtime) WHERE status = 1 AND
class::text = 'Sports::News'::text AND sections && '{208}'::integer[]
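
(For reference, that \d output corresponds to a CREATE INDEX statement along
these lines; this is my reconstruction, not the exact DDL that was run:)

CREATE INDEX news_dtime_in208section_active_key2 ON news (dtime)
    WHERE status = 1
      AND class = 'Sports::News'
      AND sections && '{208}'::integer[];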

And this query:

db=# EXPLAIN ANALYZE select * from news where (status = 1) and (class = 'Sports::News') and (sections && '{208}') order by dtime limit 10;
                                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..26.38 rows=10 width=1262) (actual time=0.040..0.082 rows=10 loops=1)
   ->  Index Scan using news_dtime_in208section_active_key2 on news  (cost=0.00..1429.55 rows=542 width=1262) (actual time=0.038..0.073 rows=10 loops=1)
 Total runtime: 0.142 ms
(3 rows)

I see no reason why the cost of that query should be so high... I think it
should be very close in cost to the equivalent query over the primary key:

"news_pkey" PRIMARY KEY, btree (id)

db=# EXPLAIN ANALYZE select * from news order by id limit 10;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.33 rows=10 width=1262) (actual time=0.043..0.085 rows=10 loops=1)
   ->  Index Scan using news_pkey on news  (cost=0.00..25944.34 rows=775090 width=1262) (actual time=0.041..0.077 rows=10 loops=1)
 Total runtime: 0.147 ms
(3 rows)
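
If I read the plans right, the Limit cost is just the scan's total cost
scaled by the fraction of estimated rows the LIMIT consumes (my arithmetic
from the numbers above, not taken from the planner source):

    partial index:  1429.55 * (10 / 542)    ~= 26.38
    primary key:   25944.34 * (10 / 775090) ~=  0.33

So the partial index scan is charged about 2.64 per estimated row versus
about 0.03 per row for the primary key scan -- roughly 80x more.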

(And if you compare the actual execution times, they are the same, but the
costs differ by two orders of magnitude.)
No change to the costing settings has any effect on that difference.
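
By "costing settings" I mean the usual planner cost GUCs; for example,
nothing along these lines changes the ratio (values here are illustrative,
not the exact ones I tried):

db=# SET random_page_cost = 1.0;        -- default 4.0
db=# SET cpu_index_tuple_cost = 0.001;  -- default 0.005
db=# SET effective_cache_size = '8GB';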

That problem makes the planner switch to a very slow plan for medium LIMIT values:

db=# EXPLAIN ANALYZE select * from news where (status = 1) and (class = 'Sports::News') and (sections && '{208}') order by dtime limit 40;
                                                                       QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=91.97..92.07 rows=40 width=1262) (actual time=630.865..630.889 rows=40 loops=1)
   ->  Sort  (cost=91.97..93.32 rows=542 width=1262) (actual time=630.862..630.872 rows=40 loops=1)
         Sort Key: dtime
         Sort Method: top-N heapsort  Memory: 89kB
         ->  Bitmap Heap Scan on news  (cost=6.18..74.83 rows=542 width=1262) (actual time=145.816..412.254 rows=262432 loops=1)
               Recheck Cond: ((sections && '{208}'::integer[]) AND (status = 1) AND ((class)::text = 'Sports::News'::text))
               ->  Bitmap Index Scan on news_sections_gin2_special  (cost=0.00..6.05 rows=542 width=0) (actual time=98.954..98.954 rows=262754 loops=1)
                     Index Cond: (sections && '{208}'::integer[])
 Total runtime: 632.049 ms
(9 rows)
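
As a sanity check (a diagnostic only, not a fix), disabling bitmap scans for
the session should push the planner back onto the partial index at LIMIT 40:

db=# SET enable_bitmapscan = off;
db=# EXPLAIN ANALYZE select * from news where (status = 1) and (class = 'Sports::News') and (sections && '{208}') order by dtime limit 40;
db=# RESET enable_bitmapscan;

If that plan runs in a few milliseconds like the LIMIT 10 case, the cost
estimate, not the plan shape, is the problem.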

Kind regards,
Maksym

--
Maxim Boguk
Senior Postgresql DBA.

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim(dot)boguk(at)gmail(dot)com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
"If they can send one man to the moon... why can't they send them all?"

МойКруг: http://mboguk.moikrug.ru/
"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."
