| From: | Yantao Shi <y(dot)shi(at)larc(dot)nasa(dot)gov> |
|---|---|
| To: | <pgsql-performance(at)postgresql(dot)org> |
| Subject: | wildcard search performance with "like" |
| Date: | 2006-01-17 20:00:37 |
| Message-ID: | 43CD4CE5.3020506@larcb.ecs.nasa.gov |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-performance |
Hi,
I have a postges 8.1.1 table with over 29 million rows in it. The colunm
(file_name) that I need to search on has entries like the following:
MOD04_L2.A2005311.1400.004.2005312013848.hdf
MYD04_L2.A2005311.0700.004.2005312013437.hdf
I have an index on this column. But an index search is performance only
when I give the full file_name for search:
testdbspc=# explain select file_name from catalog where file_name =
'MOD04_L2.A2005311.1400.004.2005312013848.hdf';
QUERY PLAN
Index Scan using catalog_pk_idx on catalog (cost=0.00..6.01 rows=1
width=404)
Index Cond: (file_name =
'MOD04_L2.A2005311.1400.004.2005312013848.hdf'::bpchar)
(2 rows)
What I really need to do most of the time is a multi-wildcard search on
this column, which is now doing a whole table scan without using the
index at all:
testdbspc=# explain select file_name from catalog where file_name like
'MOD04_L2.A2005311.%.004.2005312013%.hdf';
QUERY PLAN
Seq Scan on catalog (cost=0.00..429.00 rows=1 width=404)
Filter: (file_name ~~ 'MOD04_L2.A2005311.%.004.2005312013%.hdf'::text)
(2 rows)
Obviously, the performance of the table scan on such a large table is
not acceptable.
I tried full-text indexing and searching. It did NOT work on this column
because all the letters and numbers are linked together with "." and
considered one big single word by to_tsvector.
Any solutions for this column to use an index search with multiple wild
cards?
Thanks a lot,
Yantao Shi
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Stone | 2006-01-17 20:04:41 | Re: Suspending SELECTs |
| Previous Message | Alessandro Baretta | 2006-01-17 19:56:00 | Re: Suspending SELECTs |