Re: Specific query taking time to process

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Fahiz Mohamed <fahiz(at)netwidz(dot)com>
Cc: Michael Lewis <mlewis(at)entrata(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Specific query taking time to process
Date: 2019-12-10 20:42:06
Message-ID: 20191210204206.GE2082@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Dec 09, 2019 at 10:39:38PM +0000, Fahiz Mohamed wrote:
> Thank you very much for your prompt responses.
>
> I have analysed more regarding this and found the long running query.
>
> I ran "explain analyse" on this query and I got following result. (We have 2 identical DB instances and they consist of same data. Instane 1 took 20+ second to process and instance 2 took less than a second)
>
> Instance 1: (This is used by regular User - More than 600,000 request a day) - The result is same even when there is no user in the server.
> EXPLAIN ANALYZE
> Nested Loop Semi Join (cost=998547.53..3319573.36 rows=1 width=8) (actual time=10568.217..22945.971 rows=22 loops=1)
> -> Hash Semi Join (cost=998546.96..3319545.95 rows=41 width=16) (actual time=10568.198..22945.663 rows=22 loops=1)
> Hash Cond: (node.id = prop.node_id)
> -> Bitmap Heap Scan on alf_node node (cost=995009.97..3303978.85 rows=4565737 width=8) (actual time=3304.419..20465.551 rows=41109751 loops=1)
> Recheck Cond: ((store_id = 6) AND (type_qname_id = 240))
> Rows Removed by Index Recheck: 54239131
> Filter: (NOT (hashed SubPlan 1))
> Rows Removed by Filter: 2816
> Heap Blocks: exact=24301 lossy=1875383
...

This is doing a bitmap scan on alf_node where the second is doing an Index Scan.
Is alf_node well-correlated on the 2nd server on store_id or type_qname_id ?
If you CLUSTER and ANALYZE on the 1st server, maybe it would perform similarly.
(But, that could hurt other queries).

> We are running Postgres 9.6.9 on Amazon RDS (db.m5.4xlarge instance)
> With the fresh DB with the restore of the DATA without any indexes Search query performs relatively quick and most of the time its less than a second.
> But after 3 weeks of use of the DB it sudenly started to slowdown only for this perticular query and it takes 20+ seconds to respond. If I do a restore the DB again then it continues to work fine and the symptom pops out after 3 weeks time.

> Instance 2: (Only by testers - 250 request a day)
>
> Nested Loop Semi Join (cost=6471.94..173560841.08 rows=2 width=8) (actual time=0.162..0.464 rows=17 loops=1)
> -> Nested Loop (cost=6471.37..173560684.36 rows=45 width=16) (actual time=0.154..0.387 rows=17 loops=1)
> -> HashAggregate (cost=3508.15..3516.80 rows=865 width=8) (actual time=0.041..0.047 rows=18 loops=1)
> Group Key: prop.node_id
> -> Index Only Scan using idx_alf_nprop_s on alf_node_properties prop (cost=0.70..3505.99 rows=866 width=8) (actual time=0.020..0.035 r
> ows=18 loops=1)
> Index Cond: ((qname_id = '242'::bigint) AND (string_value = 'E292432'::text))
> Heap Fetches: 18
> -> Index Scan using alf_node_pkey on alf_node node (cost=2963.22..200644.11 rows=1 width=8) (actual time=0.019..0.019 rows=1 loops=18)
> Index Cond: (id = prop.node_id)
> Filter: ((type_qname_id <> 145) AND (store_id = 6) AND (type_qname_id = 240) AND (NOT (SubPlan 1)))
> Rows Removed by Filter: 0
> SubPlan 1

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Guillaume Lelarge 2019-12-11 15:23:05 Re: unexpected result for wastedbytes query after vacuum full
Previous Message Michael Lewis 2019-12-10 20:15:08 Re: Specific query taking time to process