Maybe you could extract things like IP addresses and keywords such as
'error' and store them in separate columns in the table. Full text
search is not a solution for data that is in the wrong format.
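
A rough sketch of what I mean, in Python, run on each message before
it is inserted. The function name, the keyword list and the regular
expressions are my own assumptions for illustration, not something
taken from your setup, so adjust them to whatever your syslog lines
actually contain:

```python
import re

# Hypothetical patterns: tune these to the messages you really receive.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
KEYWORD_RE = re.compile(r"\b(error|warning|failed|denied)\b", re.IGNORECASE)


def extract_fields(message):
    """Return (ips, keywords) found in a single syslog message."""
    # Keep only dotted quads whose octets are all in the range 0..255.
    ips = [
        ip for ip in IPV4_RE.findall(message)
        if all(int(part) <= 255 for part in ip.split("."))
    ]
    # Lower-case and de-duplicate the keywords, in a stable sorted order.
    keywords = sorted({kw.lower() for kw in KEYWORD_RE.findall(message)})
    return ips, keywords


if __name__ == "__main__":
    print(extract_fields("sshd[123]: error: connection from 192.168.1.10 failed"))
```

The two results could then go into their own columns next to the raw
message, so that a search for an address or a severity word becomes an
ordinary indexed lookup instead of a substring search on the text.
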
On Fri, Sep 10, 2010 at 10:27 AM, Henk van Lingen <H(dot)G(dot)K(dot)vanLingen(at)uu(dot)nl> wrote:
> On Thu, Sep 09, 2010 at 11:16:36AM -0400, Tom Lane wrote:
> > Henk van Lingen <H(dot)G(dot)K(dot)vanLingen(at)uu(dot)nl> writes:
> > > On Thu, Sep 09, 2010 at 10:50:52AM -0400, Tom Lane wrote:
> > >>>> Well, there's your problem: the planner is off by a factor of about 500
> > >>>> on its estimate of the number of rows matching this query, and that's
> > >>>> what's causing it to pick the wrong plan. What you need to look into
> > >>>> is getting that estimate to be more in sync with reality. Probably
> > >>>> increasing the stats target for the message column would help.
> >
> > > But how can I get sane estimates for syslog data? Some search strings will
> > > result in only a few hits, others in thousands of records or more.
> >
> > That's what ANALYZE is for ...
>
> Yes, of course. But I don't see how the most_common_vals & freqs and the
> histogram_bounds for a text field with syslog data make any sense when
> doing a search for a substring. Increasing the number of entries in
> those stats lists doesn't make any sense either, I presume.
>
> Those stats should be based on analysis of the to_tsvector index, to have
> any meaning, I think.
>
> Today I will look into the multicolumn index suggestion.
>
> Regards,
>
> --
> Henk van Lingen, ICT-SC Netwerk & Telefonie, (o- -+
> Universiteit Utrecht, Jenalaan 18a, room 0.12 /\ |
> phone: +31-30-2538453 v_/_ |
> http://henk.vanlingen.net/ http://www.tuxtown.net/netiquette/