Re: Long running query in new production, not so long in old

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Mark Steben <mark(dot)steben(at)drivedominion(dot)com>, pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Long running query in new production, not so long in old
Date: 2019-03-28 12:56:16
Message-ID: 55e1328ed76390092c1ea5f1af4b59ce593e732f.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Wed, 2019-03-27 at 19:27 -0400, Mark Steben wrote:
> We are moving to a new VM environment (expedient) and have one query that typically runs in 22 - 25
> seconds in our old environment, but is running in about 1 hour, 20 minutes in our new.
> I'd like some insight as to why the explain is showing shared buffer hits numbering over 113 milliion
> in the new environment and only 445 thousand in the old. I have sent the explains along with the
> table descriptions, row counts, the one function that I know causes the bottleneck, the query,
> some relevant configuration settings in postgresql conf (identical in both environments)
> and a listing from top in both environments, showing memory, shared memory, and cpu.
>
> Everything seems to be identical or close, except for the shared buffer count in the explain.
> Any insight would be appreciated.

Slow plan:

-> Index Scan using emailrcpts_4columns on emailrcpts (cost=0.56..119078.48 rows=5 width=29) (actual time=4873080.765..4873080.765 rows=0 loops=1)
Index Cond: ((cid = 1784539) AND (removed = false) AND (active = true) AND (bounce < 3))
Filter: ((NOT removed) AND active AND (email IS NOT NULL) AND (((fname IS NOT NULL) AND (length((fname)::text) <> 0)) OR ((lname IS NOT NULL) AND (length((lname)::text) <> 0))) AND
(number_of_sends(id, 30, 1) < 3) AND (lower(get_make(cid, (vin)::text, (make_purchased)::text)) = 'ohgoshnonotthebees!!!'::text) AND (NOT (SubPlan 1)))
Rows Removed by Filter: 19952
Buffers: shared hit=113768530 read=6244
SubPlan 1
-> Limit (cost=0.28..15.58 rows=1 width=0) (never executed)

Fast plan:

-> Index Scan using emailrcpts_4columns on emailrcpts (cost=0.56..113162.26 rows=5 width=29) (actual time=21086.555..21086.555 rows=0 loops=1)
Index Cond: ((cid = 1784539) AND (removed = false) AND (active = true) AND (bounce < 3))
Filter: ((NOT removed) AND active AND (email IS NOT NULL) AND (((fname IS NOT NULL) AND (length((fname)::text) <> 0)) OR ((lname IS NOT NULL) AND (length((lname)::text) <> 0))) AND
(number_of_sends(id, 30, 1) < 3) AND (lower(get_make(cid, (vin)::text, (make_purchased)::text)) = 'ohgoshnonotthebees!!!'::text) AND (NOT (SubPlan 1)))
Rows Removed by Filter: 19952
Buffers: shared hit=445188 read=61756
SubPlan 1
-> Limit (cost=0.28..15.58 rows=1 width=0) (never executed)

Hmm. These are the ideas I can come up with:

1. There are many index tuples belonging to dead heap tuples.
Then re-running the query should produce way fewer buffer hits.
VACUUM would fix that issue.

2. The index is terribly fragmented.
REINDEX INDEX emailrcpts_4columns
would improve that one.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message MichaelDBA 2019-03-28 13:23:55 Re: Long running query in new production, not so long in old
Previous Message Mark Steben 2019-03-28 12:39:29 Re: Long running query in new production, not so long in old