Re: Specific query taking time to process

From: Fahiz Mohamed <fahiz(at)netwidz(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Duncan Whitham <dwhitham(at)zaizi(dot)com>
Cc: Michael Lewis <mlewis(at)entrata(dot)com>, Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Specific query taking time to process
Date: 2020-01-30 09:44:41
Message-ID: 53622ec9-bf97-4066-ac52-9b8e5698c3ea@Spark
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Jeff,

Thank you for your email, Sorry I couldn’t respond back to you. I am not working on this project at the moment. I have copied one of my colleague who working on this. He has some progress on this, he will update the email thread with those findings

Appreciate your support.

Thank you,
Fahiz
On 12 Dec 2019, 02:25 +0000, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, wrote:
> On Wed, Dec 11, 2019 at 5:21 PM Fahiz Mohamed <fahiz(at)netwidz(dot)com> wrote:
> > > There is a slight different in both instance’s data. Inastanbce 1 contains latest data and instance 2 consists of data which is 3 weeks older than instance 1.
> >
> > In knowing where to look for differences in performance, there is a big difference between them being identical, and being generally similar, but not identical.
> >
> > >
> > > I hope the above data difference can make a drastic difference. Please correct me if I am wrong.
> >
> > They are similar in scale, but we know there is a big difference in distribution of some values.  For example, we still know the slow plan has 4697 rows in aspect_1 where qname_id = 251, while the other plan has 85 rows in aspect_1 meeting that same criterion. That is a big difference, and it is real difference in the data, not just a difference in planning or estimation.  Is this difference driving the difference in plan choice?  Probably not (plan choice is driven by estimated rows, not actual, and estimates are quite similar), but it does demonstrate the data is quite different between the two systems when you look under the hood.  It is likely that there are other, similar differences in the distribution of particular values which is driving the difference in plans.  It is just that we can't see those differences, because the EXPLAIN ANALYZE only reports on the plan it ran, not other plans it could have ran but didn't.
> >
> > Your query is now using the index named  idx_alf_node_tqn in a way which is equally unproductive as the previous use of  idx_alf_node_mdq was.  It looks like they have the same columns, just in a different order.  My previous advice to try  "type_qname_id+0 = 240" should still apply.
> >
> > If you can't get that to work, then another avenue is to run "explain (analyze, buffers) select count(*) from alf_node where (type_qname_id = 240) AND (store_id = 6)" on both instances.
> >
> >
> >
> > >
> > > I did execute vacuum manually and I noticed the below in the output
> > >
> > > "INFO: vacuuming "public.alf_node_aspects"
> > > INFO: "alf_node_aspects": found 0 removable, 150264654 nonremovable row versions in 812242 pages
> > > DETAIL: 0 dead row versions cannot be removed yet.
> > > CPU 13.53s/33.35u sec elapsed 77.88 sec.
> >
> > I'm not really sure what that means.  I certainly would not have expected 0 removable. There should have been some prior output, something like:
> >
> > INFO:  scanned index "fk_alf_nasp_qn" to remove 500000 row versions
> >
> > It could be that autovacuum had already gotten around to vacuuming after your initial email but before you did the above, meaning there was not much for your manual to do.
> >
> > But you can see that the vacuum did have an effect, by comparing these lines (despite them finding about same number of rows)
> >
> > Heap Blocks: exact=40765
> >
> > Heap Blocks: exact=1774
> >
> > It wasn't all that large of an effect in this case, but it is still something worth fixing.
> >
> > Cheers,
> >
> > Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Duncan Whitham 2020-01-30 12:22:58 Re: Specific query taking time to process
Previous Message Laurenz Albe 2020-01-28 08:37:42 Re: Query optimization advice for beginners