From: | Benjamin Arai <benjamin(at)araisoft(dot)com> |
---|---|
To: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL <pgsql-general(at)postgresql(dot)org>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: [PERFORM] Slow TSearch2 performance for table with 1 million documents. |
Date: | 2007-10-05 22:57:31 |
Message-ID: | 6936504E-3560-4BE9-86BE-BEDAB3CD355A@araisoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-performance |
On Oct 5, 2007, at 8:32 AM, Oleg Bartunov wrote:
> On Fri, 5 Oct 2007, Tom Lane wrote:
>
>> Benjamin Arai <benjamin(at)araisoft(dot)com> writes:
>>> # explain analyze select * FROM fulltext_article, to_tsquery
>>> ('simple','dog') AS q WHERE idxfti @@ q ORDER BY rank(idxfti, q)
>>> DESC;
>>
>>> QUERY PLAN
>>> --------------------------------------------------------------------
>>> ----
>>> --------------------------------------------------------------------
>>> ----
>>> ------------
>>> Sort (cost=6576.74..6579.07 rows=933 width=774) (actual
>>> time=12969.237..12970.490 rows=5119 loops=1)
>>> Sort Key: rank(fulltext_article.idxfti, q.q)
>>> -> Nested Loop (cost=3069.79..6530.71 rows=933 width=774)
>>> (actual time=209.513..12955.498 rows=5119 loops=1)
>>> -> Function Scan on q (cost=0.00..0.01 rows=1 width=32)
>>> (actual time=0.005..0.006 rows=1 loops=1)
>>> -> Bitmap Heap Scan on fulltext_article
>>> (cost=3069.79..6516.70 rows=933 width=742) (actual
>>> time=209.322..234.390 rows=5119 loops=1)
>>> Recheck Cond: (fulltext_article.idxfti @@ q.q)
>>> -> Bitmap Index Scan on fulltext_article_idxfti_idx
>>> (cost=0.00..3069.56 rows=933 width=0) (actual time=208.373..208.373
>>> rows=5119 loops=1)
>>> Index Cond: (fulltext_article.idxfti @@ q.q)
>>> Total runtime: 12973.035 ms
>>> (9 rows)
>>
>> The time seems all spent at the join step, which is odd because it
>> really hasn't got much to do. AFAICS all it has to do is compute the
>> rank() values that the sort step will use. Is it possible that
>> rank() is really slow?
>
> can you try rank_cd() instead ?
>
Using Rank:
-# ('simple','dog') AS q WHERE idxfti @@ q ORDER BY rank(idxfti, q)
DESC;
QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
------------
Sort (cost=6576.74..6579.07 rows=933 width=774) (actual
time=98083.081..98084.351 rows=5119 loops=1)
Sort Key: rank(fulltext_article.idxfti, q.q)
-> Nested Loop (cost=3069.79..6530.71 rows=933 width=774)
(actual time=479.122..98067.594 rows=5119 loops=1)
-> Function Scan on q (cost=0.00..0.01 rows=1 width=32)
(actual time=0.003..0.004 rows=1 loops=1)
-> Bitmap Heap Scan on fulltext_article
(cost=3069.79..6516.70 rows=933 width=742) (actual
time=341.739..37112.110 rows=5119 loops=1)
Recheck Cond: (fulltext_article.idxfti @@ q.q)
-> Bitmap Index Scan on fulltext_article_idxfti_idx
(cost=0.00..3069.56 rows=933 width=0) (actual time=321.443..321.443
rows=5119 loops=1)
Index Cond: (fulltext_article.idxfti @@ q.q)
Total runtime: 98087.575 ms
(9 rows)
Using Rank_cd:
# explain analyze select * FROM fulltext_article, to_tsquery
('simple','cat') AS q WHERE idxfti @@ q ORDER BY rank_cd(idxfti, q)
DESC;
QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------
Sort (cost=6576.74..6579.07 rows=933 width=774) (actual
time=199316.648..199324.631 rows=26054 loops=1)
Sort Key: rank_cd(fulltext_article.idxfti, q.q)
-> Nested Loop (cost=3069.79..6530.71 rows=933 width=774)
(actual time=871.428..199244.330 rows=26054 loops=1)
-> Function Scan on q (cost=0.00..0.01 rows=1 width=32)
(actual time=0.006..0.007 rows=1 loops=1)
-> Bitmap Heap Scan on fulltext_article
(cost=3069.79..6516.70 rows=933 width=742) (actual
time=850.674..50146.477 rows=26054 loops=1)
Recheck Cond: (fulltext_article.idxfti @@ q.q)
-> Bitmap Index Scan on fulltext_article_idxfti_idx
(cost=0.00..3069.56 rows=933 width=0) (actual time=838.120..838.120
rows=26054 loops=1)
Index Cond: (fulltext_article.idxfti @@ q.q)
Total runtime: 199338.297 ms
(9 rows)
>
>>
>> regards, tom lane
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 5: don't forget to increase your free space map settings
>>
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>
From | Date | Subject | |
---|---|---|---|
Next Message | Laurent ROCHE | 2007-10-06 00:25:16 | Very asynchrnous replication system |
Previous Message | Aroon Pahwa | 2007-10-05 21:33:20 | valid query runs forever? |
From | Date | Subject | |
---|---|---|---|
Next Message | Shane Ambler | 2007-10-06 01:19:08 | Re: Problems with + 1 million record table |
Previous Message | Joshua D. Drake | 2007-10-05 17:45:46 | Re: Problems with + 1 million record table |