Re: Mnogosearch (Was: Re: website doc search is ... )

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Mnogosearch (Was: Re: website doc search is ... )
Date: 2004-01-01 20:12:36
Message-ID: 29978.1072987956@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Marc G. Fournier" <scrappy(at)postgresql(dot)org> writes:
> On Thu, 1 Jan 2004, Tom Lane wrote:
>> Is there a reason for the
>> concatenation part of the expression?

> Believe it or not, the concatenation was based on a discussion *way* back
> (2 years, maybe?) when we first started using Mnogosearch, in which you
> suggested going that route ... in fact, at the time (bear in mind, this is
> back in 7.2 days), it actually sped things up ...

Hmm, I vaguely remember that ... I think we were deliberately trying to
fool the planner at that time, because it was making some stupid
assumption about the selectivity of the LIKE clause. It looks like that
problem is now mostly fixed, since your second example shows estimate of
236133 vs reality of 304811 rows for the URL condition:

> -> Index Scan using url_url on url (cost=0.00..10768.79 rows=236133 width=4) (actual time=225.243..8353.024 rows=304811 loops=1)
> Index Cond: ((url >= 'http://archives.postgresql.org/'::text) AND (url < 'http://archives.postgresql.org0'::text))
> Filter: (url ~~ 'http://archives.postgresql.org/%%'::text)
> Total runtime: 16796.932 ms
> (12 rows)

> Closer to what you were looking/hoping for?

This probably says that we can stop using the concatenation hack, at
least. I'd still suggest clustering the two tables as per my later
message. (Note that clustering would help this mergejoin plan too,
so it could come out to be a win relative to the nestloop indexscan,
but we ought to try both and see.)

> what does that setting affect, *just* the time it takes to
> analyze the table?

Well, it will also bloat pg_statistic and slow down planning a little.
Can you try 100 and see if that gives reasonable estimates? 1000 is a
rather extreme setting I think; I'd go for 100 to start with.

> is this something that can be set database wide,

Yeah, see default_statistics_target in postgresql.conf.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2004-01-01 20:14:37 Re: Mnogosearch (Was: Re: website doc search is ... )
Previous Message Marc G. Fournier 2004-01-01 20:07:32 Re: Mnogosearch (Was: Re: website doc search is ... )