From: | Mike Rylander <mrylander(at)gmail(dot)com> |
---|---|
To: | Rick Jansen <rick(at)rockingstone(dot)nl> |
Cc: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Tsearch2 performance on big database |
Date: | 2005-03-22 15:30:03 |
Message-ID: | b918cf3d0503220730665fe92c@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Tue, 22 Mar 2005 14:25:19 +0100, Rick Jansen <rick(at)rockingstone(dot)nl> wrote:
>
> ilab=# explain analyze select count(titel) from books where idxfti @@
> to_tsquery('default', 'buckingham | palace');
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=35547.99..35547.99 rows=1 width=56) (actual
> time=125968.119..125968.120 rows=1 loops=1)
> -> Index Scan using idxfti_idx on books (cost=0.00..35525.81
> rows=8869 width=56) (actual time=0.394..125958.245 rows=3080 loops=1)
> Index Cond: (idxfti @@ '\'buckingham\' | \'palac\''::tsquery)
> Total runtime: 125968.212 ms
> (4 rows)
>
> Time: 125969.264 ms
> ilab=#
Ahh... I should have qualified my claim. I am creating a google-esqe
search interface and almost every query uses '&' as the term joiner.
'AND' queries and one-term queries are orders of magnitude faster than
'OR' queries, and fortunately are the expected default for most users.
(Think, "I typed in these words, therefore I want to match these
words"...) An interesting test may be to time multiple queries
independently, one for each search term, and see if the combined cost
is less than a single 'OR' search. If so, you could use UNION to join
the results.
However, the example you originally gave ('terminology') should be
very fast. On a comparable query ("select count(value) from
metabib.full_rec where index_vector @@ to_tsquery('default','jane');")
I get 12ms.
Oleg, do you see anything else on the surface here?
Try:
EXPLAIN ANALYZE
SELECT titel FROM books WHERE idxfti @@
to_tsquery('default', 'buckingham')
UNION
SELECT titel FROM books WHERE idxfti @@
to_tsquery('default', 'palace');
and see if using '&' instead of '|' where you can helps out. I
imagine you'd be surprised by the speed of:
SELECT titel FROM books WHERE idxfti @@
to_tsquery('default', 'buckingham&palace');
>
> > As an example of what I think you *should* be seeing, I have a similar
> > box (4 procs, but that doesn't matter for one query) and I can search
> > a column with tens of millions of rows in around a second.
> >
>
> That sounds very promising, I'd love to get those results.. could you
> tell me what your settings are, howmuch memory you have and such?
16G of RAM on a dedicated machine.
shared_buffers = 15000 # min 16, at least max_connections*2, 8KB each
work_mem = 10240 # min 64, size in KB
maintenance_work_mem = 1000000 # min 1024, size in KB
# big m_w_m for loading data...
random_page_cost = 2.5 # units are one sequential page fetch cost
# fast drives, and tons of RAM
--
Mike Rylander
mrylander(at)gmail(dot)com
GPLS -- PINES Development
Database Developer
http://open-ils.org
From | Date | Subject | |
---|---|---|---|
Next Message | Oleg Bartunov | 2005-03-22 15:45:17 | Re: Tsearch2 performance on big database |
Previous Message | Bruno Wolff III | 2005-03-22 15:17:07 | Re: What about utility to calculate planner cost constants? |