Planner estimates cost of 'like' a lot lower than '='??

From: Mats Lofkvist <mal(at)algonet(dot)se>
To: pgsql-general(at)postgresql(dot)org
Subject: Planner estimates cost of 'like' a lot lower than '='??
Date: 2001-07-22 12:36:28
Message-ID: 20010722123628.19278.qmail@kairos.algonet.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I have a multiple-join select that takes ~70 seconds to execute
but if I remove one of the indexes the time drops to ~2 seconds.

In the 70 second case 'explain' estimates the cost to 17.87,
but in the 2 second case it is estimated to 3341.14.

Fiddling around revealed that the problem is that the cost of
'like' is severely underestimated, making the database use
the wrong index.

I simplified my table to a single-column (varchar(64)) 'test2'
table, and with my data select '... where value = ...' is
estimated at a cost of 756.92 but '... where value like ...'
is estimated at 2.60! (both selects return a single row).

I'm running 7.1.2 with multibyte support on FreeBSD (installed
via the ports). Vacuum analyze was run on all tables before
testing.

Details follow below.

_
Mats Lofkvist
mal(at)algonet(dot)se

============================================================

The single column tests

============================================================

testdb=> \d test2
Table "test2"
Attribute | Type | Modifier
-----------+-----------------------+----------
value | character varying(64) |
Index: test2_valueindex

testdb=> \d test2_valueindex
Index "test2_valueindex"
Attribute | Type
-----------+-----------------------
value | character varying(64)
btree

testdb=> select count(*) from test2;
count
--------
118113
(1 row)

testdb=> explain select * from test2 where value = 't10k9999';
NOTICE: QUERY PLAN:

Index Scan using test2_valueindex on test2 (cost=0.00..756.92 rows=645 width=12)

EXPLAIN
testdb=> select * from test2 where value = 't10k9999';
value
----------
t10k9999
(1 row)

testdb=> explain select * from test2 where value like 't10k9999%';
NOTICE: QUERY PLAN:

Index Scan using test2_valueindex on test2 (cost=0.00..2.60 rows=1 width=12)

EXPLAIN
testdb=> select * from test2 where value like 't10k9999%';
value
----------
t10k9999
(1 row)

testdb=>

============================================================

The 'real' tests

============================================================

testdb=> \d data
Table "data"
Attribute | Type | Modifier
---------------+-----------------------+----------
key0 | character(32) | not null
key1 | character(32) | not null
key2 | character(32) | not null
value | character varying(64) | not null
longvalue | text | not null
Indices: datakey2index,
datakey1index,
dataindex,
datavalueindex

testdb=> \d dataindex
Index "dataindex"
Key2 | Type
----------+---------------
key0 | character(32)
key1 | character(32)
key2 | character(32)
unique btree

testdb=> \d datakey1index
Index "datakey1index"
Key2 | Type
------+---------------
key1 | character(32)
btree

testdb=> \d datakey2index
Index "datakey2index"
Key2 | Type
----------+---------------
key2 | character(32)
btree

testdb=> \d datavalueindex
Index "datavalueindex"
Key2 | Type
-------+-----------------------
value | character varying(64)
btree

testdb=>

testdb=> explain select find0.key0, ret0.longValue as v0, ret1.longValue as v1 from (data find0 inner join data find1 using (key0)) left outer join data ret0 on (ret0.key0 = find0.key0 and ret0.key1 = 'uuuu' and ret0.key2 = 'number') left outer join data ret1 on (ret1.key0 = find0.key0 and ret1.key1 = 'test' and ret1.key2 = 'test') where find0.key1 = 'uuuu' and find0.key2 = 'llll' and find0.value like 't10k__' and find1.key1 = 'test' and find1.key2 = 'test' and find1.value like 'test_0';
NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..17.87 rows=1 width=120)
-> Nested Loop (cost=0.00..13.40 rows=1 width=72)
-> Nested Loop (cost=0.00..8.92 rows=1 width=24)
-> Index Scan using datavalueindex on data find0 (cost=0.00..4.46 rows=1 width=12)
-> Index Scan using datavalueindex on data find1 (cost=0.00..4.46 rows=1 width=12)
-> Index Scan using dataindex on data ret0 (cost=0.00..4.46 rows=1 width=48)
-> Index Scan using dataindex on data ret1 (cost=0.00..4.46 rows=1 width=48)

EXPLAIN
testdb=> select now(); select find0.key0, ret0.longValue as v0, ret1.longValue as v1 from (data find0 inner join data find1 using (key0)) left outer join data ret0 on (ret0.key0 = find0.key0 and ret0.key1 = 'uuuu' and ret0.key2 = 'number') left outer join data ret1 on (ret1.key0 = find0.key0 and ret1.key1 = 'test' and ret1.key2 = 'test') where find0.key1 = 'uuuu' and find0.key2 = 'llll' and find0.value like 't10k__' and find1.key1 = 'test' and find1.key2 = 'test' and find1.value like 'test_0'; select now();
now
------------------------
2001-07-22 13:58:14+02
(1 row)

key0 | v0 | v1
----------------------------------+----+--------
8a7967698cae55e66e627969270c34d8 | 3 | test10
7e2d4eb1188d0e114bff6f0ccf658f59 | 3 | test20
f7c97d1ddafacc36faba09ef3be6ac9c | 3 | test30
e59c68a66f83b1fcdd8ec8e58a854fdb | 3 | test40
077cd901c5c9b88219e5c1d14acc7c41 | 3 | test50
36f6af71d8fa1331a3640675c1dd0cf7 | 3 | test60
bc0a3e2064508f70063516eb709c7654 | 3 | test70
34c376648ef62fce58e1d80f70f1327d | 3 | test80
127869c8452da6e1438795509380b946 | 3 | test90
(9 rows)

now
------------------------
2001-07-22 13:59:25+02
(1 row)

testdb=> drop index datavalueindex ;
DROP
testdb=> explain select find0.key0, ret0.longValue as v0, ret1.longValue as v1 from (data find0 inner join data find1 using (key0)) left outer join data ret0 on (ret0.key0 = find0.key0 and ret0.key1 = 'uuuu' and ret0.key2 = 'number') left outer join data ret1 on (ret1.key0 = find0.key0 and ret1.key1 = 'test' and ret1.key2 = 'test') where find0.key1 = 'uuuu' and find0.key2 = 'llll' and find0.value like 't10k__' and find1.key1 = 'test' and find1.key2 = 'test' and find1.value like 'test_0';
NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..3341.14 rows=1 width=120)
-> Nested Loop (cost=0.00..3336.67 rows=1 width=72)
-> Nested Loop (cost=0.00..3332.20 rows=1 width=24)
-> Index Scan using datakey2index on data find0 (cost=0.00..3327.72 rows=1 width=12)
-> Index Scan using dataindex on data find1 (cost=0.00..4.46 rows=1 width=12)
-> Index Scan using dataindex on data ret0 (cost=0.00..4.46 rows=1 width=48)
-> Index Scan using dataindex on data ret1 (cost=0.00..4.46 rows=1 width=48)

EXPLAIN
testdb=> select now(); select find0.key0, ret0.longValue as v0, ret1.longValue as v1 from (data find0 inner join data find1 using (key0)) left outer join data ret0 on (ret0.key0 = find0.key0 and ret0.key1 = 'uuuu' and ret0.key2 = 'number') left outer join data ret1 on (ret1.key0 = find0.key0 and ret1.key1 = 'test' and ret1.key2 = 'test') where find0.key1 = 'uuuu' and find0.key2 = 'llll' and find0.value like 't10k__' and find1.key1 = 'test' and find1.key2 = 'test' and find1.value like 'test_0'; select now();
now
------------------------
2001-07-22 14:00:08+02
(1 row)

key0 | v0 | v1
----------------------------------+----+--------
8a7967698cae55e66e627969270c34d8 | 3 | test10
127869c8452da6e1438795509380b946 | 3 | test90
34c376648ef62fce58e1d80f70f1327d | 3 | test80
bc0a3e2064508f70063516eb709c7654 | 3 | test70
36f6af71d8fa1331a3640675c1dd0cf7 | 3 | test60
077cd901c5c9b88219e5c1d14acc7c41 | 3 | test50
e59c68a66f83b1fcdd8ec8e58a854fdb | 3 | test40
f7c97d1ddafacc36faba09ef3be6ac9c | 3 | test30
7e2d4eb1188d0e114bff6f0ccf658f59 | 3 | test20
(9 rows)

now
------------------------
2001-07-22 14:00:10+02
(1 row)

testdb=>

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Buddy Lee Haystack 2001-07-22 13:50:30 Re: Planner estimates cost of 'like' a lot lower than '='??
Previous Message Ben-Nes Michael 2001-07-22 11:00:52 Re: VACUUM ANALYZE