From: | Maxim Boguk <mboguk(at)masterhost(dot)ru> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Postgresql selecting strange index for simple query |
Date: | 2009-02-26 18:49:46 |
Message-ID: | 49A6E44A.8040106@masterhost.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
> Maxim Boguk <mboguk(at)masterhost(dot)ru> writes:
>> Tom Lane wrote:
>>> It does know better than that. I'm wondering if the single-column index
>>> has become very bloated or something. Have you compared the physical
>>> index sizes?
>
>> Table fresh loaded from dump on test server... So no index bloat for sure...
>> As for comparing physical sizes, right single column index indeed smaller then wrong one:
>
> Huh. I get sane-looking choices when I try a similar case here. Can
> you put together a self-contained test case?
Not full self-contained test case but some minimal setup data which can give you ideas whats going wrong:
Test confirm my theory wrong index selection linked with long rows in table.
My tests contain such queries:
Ok TEST1:
set random_page_cost=1;
drop TABLE if exists test_table ;
SELECT user_id,last_change_time,rpad('a',1,'b') as f1 into test_table from resume;
ANALYZE test_table;
SELECT count(*) from test_table;
CREATE INDEX right_idx on test_table(last_change_time);
CREATE INDEX wrong_idx on test_table(user_id, last_change_time);
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';
result:
Index Scan using right_idx on test_table (cost=0.00..42763.35 rows=388718 width=0) (actual time=0.020..342.653 rows=390370 loops=1)
Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
Total runtime: 368.699 ms
Ok TEST2 (but see: cost increased 4x times when real work time increased only by 30%):
same but rpad('a',200,'b') instead of rpad('a',1,'b') when populating test_table:
result:
Index Scan using right_idx on test_table (cost=0.00..179346.09 rows=392268 width=0) (actual time=0.089..422.439 rows=390370 loops=1)
Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
Total runtime: 448.717 ms
!!Not ok TEST3:!!
same but rpad('a',500,'b') instead of rpad('a',1,'b') when populating test_table:
Oops wrong index used:
Index Scan using wrong_idx on test_table (cost=0.00..254918.19 rows=392231 width=0) (actual time=0.067..730.097 rows=390370 loops=1)
Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
Total runtime: 757.930 ms
(3 rows)
(btw if drop wrong_idx query become works almost 2х faster:
drop INDEX wrong_idx;
Index Scan using right_idx on test_table (cost=0.00..259709.09 rows=392231 width=0) (actual time=0.019..416.108 rows=390370 loops=1)
Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
Total runtime: 442.790 ms
)
Again Ok TEST4 (here toast engine removed all long values from test_table):
same but rpad('a',2000,'b') instead of rpad('a',1,'b') when populating test_table:
Index Scan using right_idx on test_table (cost=0.00..64606.50 rows=393002 width=0) (actual time=0.058..371.723 rows=390370 loops=1)
Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
Total runtime: 397.929 ms
So i have two theory (just waving hands ofcourse):
1)integer owerflow somewhere in cost calculation
2)floating rounding errors (because cost very close in wrong situations: 254918.19 vs 259709.09)
PS: second issue is cost increasing with increasing lenght of rows in table complete indepenent with increase real work time.
(probably i need tune other _cost parameters for get more exact cost values)
PPS: sorry for my poor english
Regards, Maxim Boguk
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Gould | 2009-02-26 19:04:26 | Re: Off Topic: ICD-10 codes in a database table? |
Previous Message | Madison Kelly | 2009-02-26 18:30:27 | Returning null for joined tables when one column non existant |