Re: Postgresql RDS DB Latency Chossing Hash join Plan

From: github kran <githubkran(at)gmail(dot)com>
To: Michael Lewis <mlewis(at)entrata(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgresql RDS DB Latency Chossing Hash join Plan
Date: 2019-02-13 15:49:38
Message-ID: CACaZr5T7JFhwAp3+OxVG6xmD-0XYaAWWc-epCZAZzrxxsxFAsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-sql

On Tue, Feb 12, 2019 at 12:55 PM github kran <githubkran(at)gmail(dot)com> wrote:

>
>
> On Tue, Feb 12, 2019 at 12:33 PM Michael Lewis <mlewis(at)entrata(dot)com> wrote:
>
>> Did you update the stats by running ANALYZE on the tables involved, or
>> perhaps the entire database on the 'Non prod Aurora RDS instance'? Can you
>> share the two execution plans?
>>
>> *Michael Lewis | Software Engineer*
>> *Entrata*
>> *c: **619.370.8697 <619-370-8697>*
>>
>
> Here is the plan for both of the DB instances.
>

Michael - Did you get any chance to look at this issue. Also we see there
is a sequential scan being done instead of index scan.

>
>>
>> On Tue, Feb 12, 2019 at 11:27 AM github kran <githubkran(at)gmail(dot)com>
>> wrote:
>>
>>>
>>>
>>> On Tue, Feb 12, 2019 at 7:07 AM github kran <githubkran(at)gmail(dot)com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Feb 11, 2019 at 6:00 PM github kran <githubkran(at)gmail(dot)com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Feb 11, 2019 at 3:29 PM Michael Lewis <mlewis(at)entrata(dot)com>
>>>>> wrote:
>>>>>
>>>>>> Are default statistics target the same on both prod and AWS? Have you
>>>>>> analyzed all tables being used in this query to ensure stats are up proper?
>>>>>> If the optimizer is choosing a different plan, then the stats must be
>>>>>> different IMO.
>>>>>>
>>>>>>
>>>>>> *Michael Lewis | Software Engineer*
>>>>>> *Entrata*
>>>>>>
>>>>>
>>>>>
>>>>> Thanks for your reply I have verified few of the tables and their
>>>>> default statistics target and they seem to be same but is there anything in
>>>>> particular you want me to look at it to differentiate Prod and Non prod
>>>>> databases ?. ( Also the DB instance size is same but there is little
>>>>> more data in the Non prod Aurora RDS instance compared to Prod
>>>>> instance).
>>>>>
>>>>> Query used.
>>>>>> select * from pg_stats where tablename = 'tableName'
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 11, 2019 at 2:15 PM github kran <githubkran(at)gmail(dot)com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Postgres Team,
>>>>>>> We are using AWS Aurora RDS Postgres DB 9.6.8 Engine. We recently
>>>>>>> deleted few million rows from the database and ran into a issue in one of
>>>>>>> our dev account where the
>>>>>>> DB was not normal after this deletion. We did re index, vacuuming
>>>>>>> entire database but we couldnt bring it to the same state as earlier. So
>>>>>>> next steps we deleted the database and
>>>>>>> recreated the database by copying the snapshot from a production
>>>>>>> instance. Further did vacumming, re-index on the database.
>>>>>>>
>>>>>>> After this now the dev database seems to be in a better state than
>>>>>>> earlier but we are seeing few of our DB calls are taking more than 1 minute
>>>>>>> when we are fetching data and we observed
>>>>>>> this is because the query plan was executing a hash join as part of
>>>>>>> the query whereas a similar query on prod instance is not doing any hash
>>>>>>> join and is returning faster.
>>>>>>>
>>>>>>> Also we did not want to experiment by modifing the DB settings by
>>>>>>> doing enable_hash_join to off or random_page_count to 1 as we dont have
>>>>>>> these settings in Prod instance.
>>>>>>>
>>>>>>> Note:
>>>>>>> The partition table sizes we have here is between 40 GB to 75 GB and
>>>>>>> this is our normal size range, we have a new partition table for every 7
>>>>>>> days.
>>>>>>>
>>>>>>> Appreciate your ideas on what we could be missing and what we can
>>>>>>> correct here to reduce the query latency.
>>>>>>>
>>>>>>> Thanks
>>>>>>> githubKran
>>>>>>>
>>>>>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2019-02-13 16:01:33 Re: Subquery to select max(date) value
Previous Message Adrian Klaver 2019-02-13 15:46:42 Re: Subquery to select max(date) value

Browse pgsql-sql by date

  From Date Subject
Next Message Michael Lewis 2019-02-13 17:37:53 Re: Postgresql RDS DB Latency Chossing Hash join Plan
Previous Message Tom Lane 2019-02-12 23:46:47 Re: Getting wrong time using now()