From: PFC <lists(at)peufeu(dot)com>
To: "Mark Lewis" <mark(dot)lewis(at)mir3(dot)com>, "James Mansion" <james(at)mansionfamily(dot)plus(dot)com>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Alexander Staubo" <alex(at)purefiction(dot)net>, Andy <frum(at)ar-sd(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: LIKE search and performance
Date: 2007-05-24 22:09:15
Message-ID: op.tsuqhpzycigqcu@apollo13
Lists: pgsql-performance
> PG could scan the index looking for matches first and only load the
> actual rows if it found a match, but that could only be a possible win
> if there were very few matches, because the difference in cost between a
> full index scan and a sequential scan would need to be greater than the
> cost of randomly fetching all of the matching data rows from the table
> to look up the visibility information.
If you need to do that kind of thing, i.e. seq scanning a table while checking only one column among a large table of many columns, then don't use an index. An index, being a btree, needs to be traversed in order (or else a lot of locking problems come up), which means some random accesses.
So, you could make a table with 2 columns, updated via triggers: your text field, and the primary key of your main table. Scanning that narrow table would be faster.
Still, a better solution for searching in text is:
- tsearch2 if you need whole words
- trigrams for any substring match (a pg_trgm sketch follows below)
- xapian for full text search with wildcards (i.e. John* matches Johnny)
Speed-wise, those three will beat any seq scan on a large table by a huge margin.
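For the trigram option, a rough sketch using the pg_trgm contrib module (table and column names hypothetical; newer PostgreSQL versions can also use such an index to speed up LIKE '%...%' directly):

    -- Requires the pg_trgm contrib module to be installed
    CREATE INDEX search_copy_trgm_idx
        ON search_copy USING gist (text_field gist_trgm_ops);

    -- Fuzzy / substring-style matching via the trigram similarity operator
    SELECT main_id
      FROM search_copy
     WHERE text_field % 'john';

The index lookup touches only the trigrams that actually match, instead of reading every row of the table.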