Quick Links

Re: tsearch2 poor performance

From:	Kris Kiger <kris(at)musicrebellion(dot)com>
To:	pgsql-admin(at)postgresql(dot)org
Subject:	Re: tsearch2 poor performance
Date:	2004-10-01 22:01:48
Message-ID:	415DD3CC.3030208@musicrebellion.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-admin pgsql-hackers

Hey all, its me again. If I do not do a count(product_id) on my
tsearch2 queries, its actually really fast, for example;

explain analyze SELECT product_id FROM product, to_tsquery('bear') AS q
WHERE vector @@ q LIMIT 1000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..2081.60 rows=1000 width=4) (actual
time=2.308..51.522 rows=1000 loops=1)
-> Nested Loop (cost=0.00..6244798.55 rows=3000001 width=4) (actual
time=2.299..45.637 rows=1000 loops=1)
-> Function Scan on q (cost=0.00..12.50 rows=1000 width=32)
(actual time=0.023..0.023 rows=1 loops=1)
-> Index Scan using vector_idx on product (cost=0.00..6207.29
rows=3000 width=36) (actual time=2.253..37.946 rows=1000 loops=1)
Index Cond: (product.vector @@ "outer".q)
Filter: (product.vector @@ "outer".q)
Total runtime: 122.487 ms

explain analyze SELECT product_id FROM product, to_tsquery('complex') AS
q WHERE vector @@ q LIMIT 1000;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..2081.60 rows=1000 width=4) (actual
time=4.943..2325.949 rows=1000 loops=1)
-> Nested Loop (cost=0.00..6244798.55 rows=3000001 width=4) (actual
time=4.933..2319.885 rows=1000 loops=1)
-> Function Scan on q (cost=0.00..12.50 rows=1000 width=32)
(actual time=0.040..0.040 rows=1 loops=1)
-> Index Scan using vector_idx on product (cost=0.00..6207.29
rows=3000 width=36) (actual time=4.868..2311.087 rows=1000 loops=1)
Index Cond: (product.vector @@ "outer".q)
Filter: (product.vector @@ "outer".q)
Total runtime: 2329.389 ms

From what I know, there is only one reason I can offer why a count
takes approximately 30~40 seconds longer on these same queries... that
is that count has to evaluate whether a value is null or not. There
probably is a better reason, if anyone has any ideas, I would much
appreciate you sharing! Also, why the big difference in query times in
the above?

bear appears 780963 times in 696668 documents
complex appears 468669 times in 440339 documents.

Again, thanks all!

Kris

In response to

Re: tsearch2 poor performance at 2004-09-27 18:56:36 from Kris Kiger

Responses

slow count() was: tsearch2 poor performance at 2004-10-03 20:30:38 from Oleg Bartunov

Browse pgsql-admin by date

	From	Date	Subject
Next Message	Christopher Browne	2004-10-01 22:10:12	Re: Does PostgreSQL Stores its database in multiple disks?
Previous Message	Steve Crawford	2004-10-01 21:47:01	Re: PLEASE GOD HELP US!

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Chris Browne	2004-10-01 22:02:35	Re: AIX and V8 beta 3
Previous Message	Tom Lane	2004-10-01 21:17:28	Re: date_trunc'd timestamp index possible?