Re: Optimizing a like-cause

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Stefan Sturm" <stefan(dot)s(dot)sturm(at)googlemail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Optimizing a like-cause
Date: 2008-07-22 21:28:03
Message-ID: D425483C2C5C9F49B5B7A41F8944154701000F9D@postal.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Dann Corbit
Sent: Tuesday, July 22, 2008 1:30 PM
To: Stefan Sturm; pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Optimizing a like-cause

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Stefan Sturm
Sent: Tuesday, July 22, 2008 11:31 AM
To: pgsql-general(at)postgresql(dot)org
Subject: [GENERAL] Optimizing a like-cause

Hello,

I'm developing a autocomplete Feature using php and PostgreSQL 8.3.
To fill the autocomplete box I use the following SQL Statement:
select * from _table_ where upper( _field_ ) like '%STRING%';

This SQL Statement takes 900 ms on a Table with 300.000 entries.

What can I do to speed up the Statement? What Index can I set?
>>
If you are searching for words, you could use tsearch2.
If you are searching for arbitrary fragments, an idea like this might
prove helpful:
http://kaiv.wordpress.com/2007/12/11/postgresql-substring-search/

What you are asking for is very difficult, because an ordinary index
won't help (you have a wildcard on the front) and an index on the
reversed word won't help either (you have a wildcard on the back). So
the standard sort of techniques used to solve it are not perfectly on
target.
<<
Second idea:
>>
It seems to me that you might also store strings as arrays of
characters, create a GIN index, and then use the contains operators:
@> contains
<@ is contained by
I did not try it myself, but it seems it could be helpful. I think it
would also return anagrams of STRING, but you would filter those with
the original where clause restriction. It's hardly ideal, as these seem
to qualify under %STRING% using the GIN idea:
gi n r ts
gi n rst
gi n rt s
gi n srt
gi n str
gi n tr s
gi n trs
gi n tsr
gi nrst
gi nrt s
gi nrts
gi ns rt
gi ns tr
gi nst r
gi nstr
gi ntr s
gi ntrs
gi nts r
gi rnt s
gi rtn s
gi rtns
gi snt r
gi stn r
gi strn
gi tnr s
gi tns r
gi trn s
gi tsn r
gins rt
gins tr
gint r s
gints r
girn ts
girt n s
girt ns
girts n
gist n r
git n r s
git nrs
git ns r
git nsr
git rns
git rsn
git snr
git srn
gitn r s
gitr n s
gitr ns
gits n r
gnir ts
gnirts
gnis rt
gnis tr
gnits r
gnr i ts
gnr ist
gnr its
gnr sit
gnr sti
gnr tis
gnr tsi
gnt i r s
gnt isr
gnt ri s
gnt rsi
gnt sri
gri n ts
gri nst
gri nts
gri snt
gri stn
gri tns
gri tsn
grin ts
grinst
grist n
grits n
grn i ts
grn ist
grn its
grn sit
grn sti
grn tis
grn tsi
grnt i s
grs int
grs itn
grs nit
grs nti
grs tni
grt isn
grt n i s
grt ni s
grt nis
grt ns i
grt nsi
grt sni
gs int r
gs intr
gs itn r
gs n i rt
gs n i tr
gs n irt
gs n itr
gs n rit
gs n rti
gs n tir
gs n tri
gs ni rt
gs ni tr
gs nit r
gs nrt i
gs nti r
gs ntr i
gs rint
gs rnt i
gs rtin
gs rtn i
gs tni r
gs tnr i
gs trn i
gsi n rt
gsi n tr
gsi nrt
gsi ntr
gsi rnt
gsi rtn
gsi tnr
gsi trn
gsin rt
gsin tr
gsn i rt
gsn i tr
gsn irt
gsn itr
gsn rit
gsn rti
gsn tir
gsn tri
gst inr
gst irn
gst n i r
gst n ri
gst ni r
gst nir
gst nri
gst rin
gst rni
gsti n r
gt inr s
gt inrs
gt insr
gt irn s
gt isn r
gt n isr
gt n ri s
gt n rsi
gt n sri
gt ni r s
gt nir s
gt nirs
gt nis r
gt nri s
gt nris
gt nrs i
gt ns i r
gt ns ri
gt nsi r
gt nsr i
gt rin s
gt rins
gt rni s
gt rnis
gt rns i
gt rsin
gt rsn i
gt sni r
gt snir
gt snr i
gt srin
gt srn i
gtin r s
gtis n r
gtn i r s
gtn isr
gtn ri s
gtn rsi
gtn sri
gtr isn
gtr n i s
gtr ni s
gtr nis
gtr ns i
gtr nsi
gtr sni
gtri n s
gtri ns
gtrs n i
gtrs ni
gtsi n r
ign r ts
ign rst
ign rt s
ign srt
ign str
ign tr s
ign trs
ign tsr
igr n ts
igr nst
igr nts
igr snt
igr stn
igr tns
igr tsn
igs n rt
igs n tr
igs nrt
igs ntr
igs rnt
igs rtn
igs tnr
igs trn
igst n r
igt n r s
igt nrs
igt ns r
igt nsr
igt rns
igt rsn
igt snr
igt srn
ing r ts
ing rst
ing rt s
ing srt
ing str
ing tr s
ing trs
ing tsr
ingr ts
ings rt
ings tr
instrg
intg r s
intgr s
irng ts
isg n rt
isg n tr
isg nrt
isg ntr
isg rnt
isg rtn
isg tnr
isg trn
istg n r
itg n r s
itg nrs
itg ns r
itg nsr
itg rns
itg rsn
itg snr
itg srn
itnsg r
ngi r ts
ngi rst
ngi rt s
ngi srt
ngi str
ngi tr s
ngi trs
ngi tsr
ngis rt
ngis tr
ngit r s
ngr i ts
ngr ist
ngr its
ngr sit
ngr sti
ngr tis
ngr tsi
ngs i rt
ngs i tr
ngs irt
ngs itr
ngs rit
ngs rti
ngs tir
ngs tri
ngt i r s
ngt isr
ngt ri s
ngt rsi
ngt sri
nig r ts
nig rst
nig rt s
nig srt
nig str
nig tr s
nig trs
nig tsr
nigs rt
nigs tr
nirg ts
nrg i ts
nrg ist
nrg its
nrg sit
nrg sti
nrg tis
nrg tsi
nsg i rt
nsg i tr
nsg irt
nsg itr
nsg rit
nsg rti
nsg tir
nsg tri
nsig rt
nsig tr
nstig r
ntg i r s
ntg isr
ntg ri s
ntg rsi
ntg sri
rg inst
rg int s
rg isnt
rg itn s
rg n i ts
rg n ist
rg n its
rg n sit
rg n sti
rg n tis
rg n tsi
rg ni ts
rg nist
rg nit s
rg nits
rg nsit
rg nst i
rg nsti
rg nti s
rg ntis
rg nts i
rg sint
rg sitn
rg snit
rg snt i
rg stin
rg stn i
rg tins
rg tisn
rg tni s
rg tnis
rg tns i
rg tsn i
rgi n ts
rgi nst
rgi nts
rgi snt
rgi stn
rgi tns
rgi tsn
rgn i ts
rgn ist
rgn its
rgn sit
rgn sti
rgn tis
rgn tsi
rgs int
rgs itn
rgs nit
rgs nti
rgs tni
rgt isn
rgt n i s
rgt ni s
rgt nis
rgt ns i
rgt nsi
rgt sni
rgti n s
rgti ns
rig n ts
rig nst
rig nts
rig snt
rig stn
rig tns
rig tsn
ringt s
rng i ts
rng ist
rng its
rng sit
rng sti
rng tis
rng tsi
rsg int
rsg itn
rsg nit
rsg nti
rsg tni
rstg n i
rstg ni
rtg isn
rtg n i s
rtg ni s
rtg nis
rtg ns i
rtg nsi
rtg sni
rtgs n i
rtgs ni
rtsg n i
rtsg ni
sgit n r
sgr int
sgr itn
sgr nit
sgr nti
sgr tni
sgt inr
sgt irn
sgt n i r
sgt n ri
sgt ni r
sgt nir
sgt nri
sgt rin
sgt rni
sgtr n i
sgtr ni
sign rt
sign tr
sing rt
sing tr
sng i rt
sng i tr
sng irt
sng itr
sng rit
sng rti
sng tir
sng tri
snig rt
snig tr
srg int
srg itn
srg nit
srg nti
srg tni
srting
sting r
stng i r
stng ri
strg n i
strg ni
strig n
string
strng i
tgi n r s
tgi nrs
tgi ns r
tgi nsr
tgi rns
tgi rsn
tgi snr
tgi srn
tgir n s
tgir ns
tgis n r
tgr isn
tgr n i s
tgr ni s
tgr nis
tgr ns i
tgr nsi
tgr sni
tig n r s
tig nrs
tig ns r
tig nsr
tig rns
tig rsn
tig snr
tig srn
tigr n s
tigr ns
tigs n r
ting r s
tings r
tirg n s
tirg ns
tisg n r
trg isn
trg n i s
trg ni s
trg nis
trg ns i
trg nsi
trg sni
trgi n s
trgi ns
trgs n i
trgs ni
trig n s
trig ns
trigs n
tring s
trng i s
tsg inr
tsg irn
tsg n i r
tsg n ri
tsg ni r
tsg nir
tsg nri
tsg rin
tsg rni
tsig n r
tsing r
tsirg n
tsng i r
tsng ri

but it would still be better than searching all 300K, I think. Also,
many permutations may simply not be present.
<<

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2008-07-22 21:46:14 Re: Optimizing a like-cause
Previous Message Tom Lane 2008-07-22 20:51:56 Re: Problems Restarting PostgreSQL Daemon