From: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
---|---|
To: | Scott Rankin <srankin(at)motus(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Inconsistent query times and spiky CPU with GIN tsvector search |
Date: | 2018-09-04 19:15:19 |
Message-ID: | 2633cf81b451451ebcb4bb213a256ad5975388ca.camel@cybertec.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Scott Rankin wrote:
> We are running postgresql 9.4 and we have a table where we do some
> full-text searching using a GIN index on a tsvector column:
>
> CREATE INDEX location_search_tsvector_idx
> ON public.location_search USING gin
> (search_field_tsvector)
> TABLESPACE pg_default;
>
> This setup has been running very well, but as our load is getting heavier,
> the performance seems to be getting much more inconsistent.
> Our searches are run on a dedicated read replica, so this server is only
> doing queries against this one table. IO is very low, indicating to me
> that the data is all in memory. However, we're getting some queries taking
> upwards of 15-20 seconds, while the average is closer to 1 second.
>
> A sample query that's running slowly is
>
> explain (analyze, buffers)
> SELECT ls.location AS locationId FROM location_search ls
> WHERE ls.client = 1363
> AND ls.favorite = TRUE
> AND search_field_tsvector @@ to_tsquery('CA-94:* &E &San:*')
> LIMIT 4;
>
> And the explain analyze is:
>
> Limit (cost=39865.85..39877.29 rows=1 width=8) (actual time=4471.120..4471.120 rows=0 loops=1)
> Buffers: shared hit=25613
> -> Bitmap Heap Scan on location_search ls (cost=39865.85..39877.29 rows=1 width=8) (actual time=4471.117..4471.117 rows=0 loops=1)
> Recheck Cond: (search_field_tsvector @@ to_tsquery('CA-94:* &E &San:*'::text))
> Filter: (favorite AND (client = 1363))
> Rows Removed by Filter: 74
> Heap Blocks: exact=84
> Buffers: shared hit=25613
> -> Bitmap Index Scan on location_search_tsvector_idx (cost=0.00..39865.85 rows=6 width=0) (actual time=4470.895..4470.895 rows=84 loops=1)
> Index Cond: (search_field_tsvector @@ to_tsquery('CA-94:* &E &San:*'::text))
> Buffers: shared hit=25529
> Planning time: 0.335 ms
> Execution time: 4487.224 ms
Not sure, but maybe you are suffering from bad performance because of a
long "GIN pending list".
If yes, then the following can help:
ALTER INDEX location_search_tsvector_idx SET (gin_pending_list_limit = 512);
Or you can disable the feature altogether:
ALTER INDEX location_search_tsvector_idx SET (fastupdate = off);
Then clean the pending list with
SELECT gin_clean_pending_list('location_search_tsvector_idx'::regclass);
Disabling the pending list will slow down data modification, but should
keep the SELECT performance stable.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2018-09-05 00:35:19 | Re: Query is slow when run for first time; subsequent execution is fast |
Previous Message | Scott Rankin | 2018-09-04 18:09:10 | Inconsistent query times and spiky CPU with GIN tsvector search |