Inconsistent query times and spiky CPU with GIN tsvector search

From: Scott Rankin <srankin(at)motus(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Inconsistent query times and spiky CPU with GIN tsvector search
Date: 2018-09-04 18:09:10
Message-ID: E919673F-1BC3-4525-84CA-51FD854F3D0C@motus.com
Lists: pgsql-performance

Hello all,

We are running PostgreSQL 9.4, and we have a table on which we do full-text searching using a GIN index on a tsvector column:

CREATE TABLE public.location_search
(
    id bigint NOT NULL DEFAULT nextval('location_search_id_seq'::regclass),
    <snip some columns>…
    search_field_tsvector tsvector
);

and

CREATE INDEX location_search_tsvector_idx
    ON public.location_search USING gin
    (search_field_tsvector)
    TABLESPACE pg_default;

The search_field_tsvector column contains the data from the location's name and address:

to_tsvector('pg_catalog.english', COALESCE(NEW.name, '')) || to_tsvector(COALESCE(address, ''))
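
One detail of this expression is worth checking: the first call names the english configuration explicitly, while the second call (like the one-argument to_tsquery in the query further down) falls back to default_text_search_config. A minimal sketch for confirming that the two agree in a given session; the sample string is made up purely for illustration:

```sql
-- Show which configuration the one-argument to_tsvector()/to_tsquery() calls use.
SHOW default_text_search_config;

-- Compare explicit and session-default parsing of a made-up address string.
SELECT to_tsvector('pg_catalog.english', '123 Main Street') AS explicit_english,
       to_tsvector('123 Main Street')                        AS session_default;
```

If the two columns differ, the name and address halves of the tsvector are being normalized under different rules.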

This setup has been running very well, but as our load has grown, performance has become much more inconsistent. Our searches run on a dedicated read replica, so this server only handles queries against this one table. I/O is very low, which suggests to me that the data is all in memory. However, some queries take upwards of 15-20 seconds, while the average is closer to 1 second.

A sample query that's running slowly is:

explain (analyze, buffers)
SELECT ls.location AS locationId FROM location_search ls
WHERE ls.client = 1363
AND ls.favorite = TRUE
AND search_field_tsvector @@ to_tsquery('CA-94:* &E &San:*')
LIMIT 4;
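
For reference, each term ending in :* in that query string is a prefix match rather than an exact lexeme match. A small sketch for seeing how the string is actually parsed, under the session default and under the english configuration; the column aliases are only illustrative:

```sql
-- Inspect the parsed form of the search string used in the sample query.
SELECT to_tsquery('CA-94:* &E &San:*')                       AS parsed_default,
       to_tsquery('pg_catalog.english', 'CA-94:* &E &San:*') AS parsed_english;
```

This runs without touching the table, so it can be used to see which lexemes and prefixes the index scan is asked to match.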

And the explain analyze is:

Limit  (cost=39865.85..39877.29 rows=1 width=8) (actual time=4471.120..4471.120 rows=0 loops=1)
  Buffers: shared hit=25613
  ->  Bitmap Heap Scan on location_search ls  (cost=39865.85..39877.29 rows=1 width=8) (actual time=4471.117..4471.117 rows=0 loops=1)
        Recheck Cond: (search_field_tsvector @@ to_tsquery('CA-94:* &E &San:*'::text))
        Filter: (favorite AND (client = 1363))
        Rows Removed by Filter: 74
        Heap Blocks: exact=84
        Buffers: shared hit=25613
        ->  Bitmap Index Scan on location_search_tsvector_idx  (cost=0.00..39865.85 rows=6 width=0) (actual time=4470.895..4470.895 rows=84 loops=1)
              Index Cond: (search_field_tsvector @@ to_tsquery('CA-94:* &E &San:*'::text))
              Buffers: shared hit=25529
Planning time: 0.335 ms
Execution time: 4487.224 ms

I'm a little bit at a loss as to where to start with this - any suggestions would be hugely appreciated!

Thanks,
Scott
