Re: Problem with a Query

From: Siraj G <tosiraj(dot)g(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Problem with a Query
Date: 2024-08-26 10:29:57
Message-ID: CAC5iy62+3TSUo_wys0izZsN=LL2SqQMwj_NiYY4SOq9_hrd=KA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks Tom. Collecting full stats on the tables involved corrected the
execution.

On Tue, Aug 13, 2024 at 9:57 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Siraj G <tosiraj(dot)g(at)gmail(dot)com> writes:
> > We migrated a PgSQL database from Cloud SQL to compute engine and since
> > then there is a SQL we observed taking a long time. After some study, I
> > found that the SQL is using NESTED LOOP where the cost is too high.
>
> The core of your problem seems to be here:
>
> > -> Index Scan using marketing_a_cancel__55ffff_idx
> on
> > marketing_app_leadhistory w0 (cost=0.57..4274.30 rows=1 width=8) (actual
> > time=46.678..51.232 rows=44 loops=1)
> > Index Cond: ((cancel_event_id IS NOT NULL) AND
> > (cancel_event_type = 1))
> > Filter: ((status_id = 93) AND
> > ((followup_date)::date >= '2024-08-01'::date) AND ((followup_date)::date
> <=
> > '2024-08-07'::date))
> > Rows Removed by Filter: 22268
> > Buffers: shared hit=9170 read=19
>
> If the planner had estimated 40-some rows out of this step, rather
> than one, it would certainly not have chosen to use nestloop joins
> atop this. So the big problem to focus on is making that estimate
> better.
>
> A secondary problem is that the choice of index seems poor: the
> index itself is selecting 44+22268 = 22312 rows and then the filter
> condition is throwing away 99.8% of those rows. Probably, using
> an index on (status_id, followup_date) would have worked better.
>
> I suspect that both of these things are tied to the non-normalization
> of your "cancel" condition. The planner probably believes that
> "cancel_event_id IS NOT NULL" is statistically independent of
> "cancel_event_type = 1"; but I'll bet it isn't, and thus the index
> condition selects many more rows than the planner guessed. You might
> be able to improve that estimate by creating extended stats on both of
> those columns, but really a better idea would be to take a step back
> and figure out if those two columns can't be merged into one.
>
> regards, tom lane
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kashif Zeeshan 2024-08-26 10:30:29 Re: Issue Installing PostgreSQL 17 on Ubuntu 22.04 (Jammy)
Previous Message Jyoti Saxena 2024-08-26 10:29:45 Re: Issue Installing PostgreSQL 17 on Ubuntu 22.04 (Jammy)