Quick Links

Re: Postgresql selecting strange index for simple query

From:	Maxim Boguk <mboguk(at)masterhost(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Postgresql selecting strange index for simple query
Date:	2009-02-26 18:49:46
Message-ID:	49A6E44A.8040106@masterhost.ru
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

> Maxim Boguk <mboguk(at)masterhost(dot)ru> writes:
>> Tom Lane wrote:
>>> It does know better than that. I'm wondering if the single-column index
>>> has become very bloated or something. Have you compared the physical
>>> index sizes?
>
>> Table fresh loaded from dump on test server... So no index bloat for sure...
>> As for comparing physical sizes, right single column index indeed smaller then wrong one:
>
> Huh. I get sane-looking choices when I try a similar case here. Can
> you put together a self-contained test case?

Not full self-contained test case but some minimal setup data which can give you ideas whats going wrong:
Test confirm my theory wrong index selection linked with long rows in table.

My tests contain such queries:

Ok TEST1:
set random_page_cost=1;
drop TABLE if exists test_table ;
SELECT user_id,last_change_time,rpad('a',1,'b') as f1 into test_table from resume;
ANALYZE test_table;
SELECT count(*) from test_table;
CREATE INDEX right_idx on test_table(last_change_time);
CREATE INDEX wrong_idx on test_table(user_id, last_change_time);
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';
result:
Index Scan using right_idx on test_table (cost=0.00..42763.35 rows=388718 width=0) (actual time=0.020..342.653 rows=390370 loops=1)
Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
Total runtime: 368.699 ms

Ok TEST2 (but see: cost increased 4x times when real work time increased only by 30%):
same but rpad('a',200,'b') instead of rpad('a',1,'b') when populating test_table:
result:
Index Scan using right_idx on test_table (cost=0.00..179346.09 rows=392268 width=0) (actual time=0.089..422.439 rows=390370 loops=1)
Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
Total runtime: 448.717 ms

!!Not ok TEST3:!!
same but rpad('a',500,'b') instead of rpad('a',1,'b') when populating test_table:
Oops wrong index used:
Index Scan using wrong_idx on test_table (cost=0.00..254918.19 rows=392231 width=0) (actual time=0.067..730.097 rows=390370 loops=1)
Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
Total runtime: 757.930 ms
(3 rows)
(btw if drop wrong_idx query become works almost 2х faster:
drop INDEX wrong_idx;
Index Scan using right_idx on test_table (cost=0.00..259709.09 rows=392231 width=0) (actual time=0.019..416.108 rows=390370 loops=1)
Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
Total runtime: 442.790 ms
)

Again Ok TEST4 (here toast engine removed all long values from test_table):
same but rpad('a',2000,'b') instead of rpad('a',1,'b') when populating test_table:
Index Scan using right_idx on test_table (cost=0.00..64606.50 rows=393002 width=0) (actual time=0.058..371.723 rows=390370 loops=1)
Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
Total runtime: 397.929 ms

So i have two theory (just waving hands ofcourse):
1)integer owerflow somewhere in cost calculation
2)floating rounding errors (because cost very close in wrong situations: 254918.19 vs 259709.09)

PS: second issue is cost increasing with increasing lenght of rows in table complete indepenent with increase real work time.
(probably i need tune other _cost parameters for get more exact cost values)

PPS: sorry for my poor english

Regards, Maxim Boguk

In response to

Re: Postgresql selecting strange index for simple query at 2009-02-26 16:33:58 from Tom Lane

Responses

Re: Postgresql selecting strange index for simple query at 2009-02-26 20:04:04 from Tom Lane

Browse pgsql-general by date

	From	Date	Subject
Next Message	Andrew Gould	2009-02-26 19:04:26	Re: Off Topic: ICD-10 codes in a database table?
Previous Message	Madison Kelly	2009-02-26 18:30:27	Returning null for joined tables when one column non existant