Re: BUG #10144: PostgreSQL cost_index in costsize.c bug? (cann't estimate indexCorrelation correct)

From: 德哥 <digoal(at)126(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #10144: PostgreSQL cost_index in costsize.c bug? (cann't estimate indexCorrelation correct)
Date: 2014-04-27 02:18:27
Message-ID: 27c24b21.968.145a0f9f744.Coremail.digoal@126.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

HI,
Thanks, I think this problem because it's random_page_cost set to 1 equal seq_page_cost, so this query use index scan first then bitmap scan.
When i set random_page_cost large than seq_page_cost like 10 (use stap test the random cost and seq cost) ,
Then this sql use seq scan first, then bitmap scan, then index scan.
Thanks.

[postgres(at)digoal pgdata]$ psql
psql (9.3.4)
Type "help" for help.

digoal=# show seq_page_cost;
seq_page_cost
---------------
1
(1 row)

digoal=# show random_page_cost;
random_page_cost
------------------
4
(1 row)

digoal=# show cpu_tuple_cost;
cpu_tuple_cost
----------------
0.01
(1 row)

digoal=# show cpu_index_tuple_cost;
cpu_index_tuple_cost
----------------------
0.005
(1 row)

digoal=# show cpu_operator_cost;
cpu_operator_cost
-------------------
0.0025
(1 row)

digoal=# show effective_cache_size;
effective_cache_size
----------------------
128MB
(1 row)

digoal=# \dt+ tbl_cost_align
List of relations
Schema | Name | Type | Owner | Size | Description
--------+----------------+-------+----------+--------+-------------
public | tbl_cost_align | table | postgres | 219 MB |
(1 row)

digoal=# \di+ tbl_cost_align_id
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+-------------------+-------+----------+----------------+-------+-------------
public | tbl_cost_align_id | index | postgres | tbl_cost_align | 64 MB |
(1 row)

digoal=# set random_page_cost=10;
SET
digoal=# explain (analyze,costs,buffers,timing,verbose) select * from tbl_cost_align where id>2000000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.tbl_cost_align (cost=0.00..65538.00 rows=2996963 width=45) (actual time=0.050..1477.028 rows=2997015 loops=1)
Output: id, info, crt_time
Filter: (tbl_cost_align.id > 2000000)
Rows Removed by Filter: 2985
Buffers: shared hit=28038
Total runtime: 2011.742 ms
(6 rows)

digoal=# set enable_seqscan=off;
SET
digoal=# explain (analyze,costs,buffers,timing,verbose) select * from tbl_cost_align where id>2000000;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
----------------
Bitmap Heap Scan on public.tbl_cost_align (cost=105426.89..170926.93 rows=2996963 width=45) (actual time=1221.104..2911.889 rows=2
997015 loops=1)
Output: id, info, crt_time
Recheck Cond: (tbl_cost_align.id > 2000000)
Rows Removed by Index Recheck: 2105
Buffers: shared hit=36229
-> Bitmap Index Scan on tbl_cost_align_id (cost=0.00..104677.65 rows=2996963 width=0) (actual time=1214.865..1214.865 rows=2997
015 loops=1)
Index Cond: (tbl_cost_align.id > 2000000)
Buffers: shared hit=8191
Total runtime: 3585.699 ms
(9 rows)

digoal=# set enable_bitmapscan=off;
SET
digoal=# explain (analyze,costs,buffers,timing,verbose) select * from tbl_cost_align where id>2000000;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
----------------------------
Index Scan using tbl_cost_align_id on public.tbl_cost_align (cost=0.43..16601388.04 rows=2996963 width=45) (actual time=0.064..566
2.361 rows=2997015 loops=1)
Output: id, info, crt_time
Index Cond: (tbl_cost_align.id > 2000000)
Buffers: shared hit=3005084
Total runtime: 6173.067 ms
(5 rows)

the wrong plan cost occur when i set random_page_cost to 1, and effective_cache_size big then index size and table size in this case.
digoal=# set random_page_cost=1;
SET
digoal=# explain (analyze,costs,buffers,timing,verbose) select * from tbl_cost_align where id>2000000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.tbl_cost_align (cost=0.00..65538.00 rows=2996963 width=45) (actual time=0.040..1692.712 rows=2997015 loops=1)
Output: id, info, crt_time
Filter: (tbl_cost_align.id > 2000000)
Rows Removed by Filter: 2985
Buffers: shared hit=28038
Total runtime: 2249.313 ms
(6 rows)

digoal=# set enable_seqscan=off;
SET
digoal=# explain (analyze,costs,buffers,timing,verbose) select * from tbl_cost_align where id>2000000;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
--------------
Bitmap Heap Scan on public.tbl_cost_align (cost=31446.89..96946.93 rows=2996963 width=45) (actual time=1224.445..2454.797 rows=299
7015 loops=1)
Output: id, info, crt_time
Recheck Cond: (tbl_cost_align.id > 2000000)
Rows Removed by Index Recheck: 2105
Buffers: shared hit=36229
-> Bitmap Index Scan on tbl_cost_align_id (cost=0.00..30697.65 rows=2996963 width=0) (actual time=1220.404..1220.404 rows=29970
15 loops=1)
Index Cond: (tbl_cost_align.id > 2000000)
Buffers: shared hit=8191
Total runtime: 2955.816 ms
(9 rows)

digoal=# set effective_cache_size='280MB';
SET
digoal=# explain (analyze,costs,buffers,timing,verbose) select * from tbl_cost_align where id>2000000;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
-------------
Bitmap Heap Scan on public.tbl_cost_align (cost=31446.89..96946.93 rows=2996963 width=45) (actual time=963.845..2060.463 rows=2997
015 loops=1)
Output: id, info, crt_time
Recheck Cond: (tbl_cost_align.id > 2000000)
Rows Removed by Index Recheck: 2105
Buffers: shared hit=36229
-> Bitmap Index Scan on tbl_cost_align_id (cost=0.00..30697.65 rows=2996963 width=0) (actual time=959.673..959.673 rows=2997015
loops=1)
Index Cond: (tbl_cost_align.id > 2000000)
Buffers: shared hit=8191
Total runtime: 2515.649 ms
(9 rows)

When effective_cache_size large then table and index's size. then use index scan first than bitmap scan.
digoal=# set effective_cache_size='283MB';
SET
digoal=# explain (analyze,costs,buffers,timing,verbose) select * from tbl_cost_align where id>2000000;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
-------------------------
Index Scan using tbl_cost_align_id on public.tbl_cost_align (cost=0.43..92030.24 rows=2996963 width=45) (actual time=0.045..5238.3
61 rows=2997015 loops=1)
Output: id, info, crt_time
Index Cond: (tbl_cost_align.id > 2000000)
Buffers: shared hit=3005084
Total runtime: 5689.583 ms
(5 rows)

digoal=# set random_page_cost=10;
SET
digoal=# explain (analyze,costs,buffers,timing,verbose) select * from tbl_cost_align where id>2000000;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
---------------
Bitmap Heap Scan on public.tbl_cost_align (cost=105426.89..170926.93 rows=2996963 width=45) (actual time=918.225..2195.414 rows=29
97015 loops=1)
Output: id, info, crt_time
Recheck Cond: (tbl_cost_align.id > 2000000)
Rows Removed by Index Recheck: 2105
Buffers: shared hit=36229
-> Bitmap Index Scan on tbl_cost_align_id (cost=0.00..104677.65 rows=2996963 width=0) (actual time=913.935..913.935 rows=299701
5 loops=1)
Index Cond: (tbl_cost_align.id > 2000000)
Buffers: shared hit=8191
Total runtime: 2698.429 ms
(9 rows)

digoal=# set enable_seqscan=on;
SET
digoal=# explain (analyze,costs,buffers,timing,verbose) select * from tbl_cost_align where id>2000000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.tbl_cost_align (cost=0.00..65538.00 rows=2996963 width=45) (actual time=0.020..1522.791 rows=2997015 loops=1)
Output: id, info, crt_time
Filter: (tbl_cost_align.id > 2000000)
Rows Removed by Filter: 2985
Buffers: shared hit=28038
Total runtime: 2104.057 ms
(6 rows)

--
公益是一辈子的事,I'm Digoal,Just Do It.

At 2014-04-27 03:45:33,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>digoal(at)126(dot)com writes:
>> i see postgresql optimizer don't add column's indexCorrelation in index scan
>> and then shold not choose optimal scannode.
>
>When I try this test case, I find that the planner prefers a plain
>seqscan, then a bitmap scan, and last an indexscan; and the cost estimates
>are not too out of line with reality. I suspect you've changed the
>planner's cost parameters to some non-default settings that don't really
>square very well with your environment.
>
> regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message digoal 2014-04-28 08:37:25 BUG #10155: BUG? Cann't remove new generated tuples after repeatable read transaction start.
Previous Message Tomas Vondra 2014-04-26 21:01:06 Re: LOG: incomplete message from client