From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Federico Simonetti (Liveye)" <federico(at)liveye(dot)net> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Query performance on ILIKE with AND operator... |
Date: | 2005-10-10 14:58:15 |
Message-ID: | 20804.1128956295@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
"Federico Simonetti (Liveye)" <federico(at)liveye(dot)net> writes:
> I'm encountering a quite strange performance problem.
The problem stems from the horrid misestimation of the number of rows
fetched from detail0009:
> " -> Seq Scan on detail0009 (cost=0.00..20500.11
> rows=26 width=1005) (actual time=453.000..5983.000 rows=53588 loops=1)"
> " Filter: ((txcontenttype ~~* '%html%'::text)
> AND ((vchost)::text ~~* '%www.%'::text))"
When the planner is off by a factor of two thousand about the number of
rows involved, it's not very likely to produce a good plan :-(
In the OR case the rowcount estimate is 6334, which is somewhat closer
to reality (only about a factor of 10 off, looks like), and that changes
the plan to something that works acceptably well.
Assuming that this is web-log data, the prevalence of www and html
together is hardly surprising, but PG's statistical mechanisms will
never realize it. Not sure about a good workaround. Does it make
sense to combine the two conditions into one?
(vchost || txcontenttype) ilike '%www.%html%'
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Federico Simonetti (Liveye) | 2005-10-10 15:18:04 | Re: Query performance on ILIKE with AND operator... |
Previous Message | Christopher Kings-Lynne | 2005-10-10 14:56:41 | Re: Server misconfiguration??? |