Re: Optimizing count(), but Explain estimates wildly off

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Chema <chema(at)interneta(dot)org>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Optimizing count(), but Explain estimates wildly off
Date: 2024-02-27 07:40:43
Message-ID: 4bba24a3f55291fd8a17460ed0b3239e3a022c24.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, 2024-02-26 at 18:25 -0600, Chema wrote:
> I'm trying to optimize simple queries on two tables (tenders & items) with a couple
> million records.  Besides the resulting records, the app also displays the count of
> total results.  Doing count() takes as much time as the other query (which can be
> 30+ secs), so it's an obvious target for optimization.
>
> Reading around, seems many people are still using this 2005 snippet to obtain the
> row count estimate from Explain:

I recommend using FORMAT JSON and extracting the top row count from that. It is
simpler and less error-prone.

> Is this still the current best practice?  Any tips to increase precision?
> Currently it can estimate the actual number of rows for over or under a million,
> as seen on the sample queries (1,955,297 instead of 1,001,200; 162,080 instead
> of 1,292,010).

Looking at the samples you provided, I get the impression that the statistics for
the table are quite outdated. That will affect the estimates. Try running ANALYZE
and see if that improves the estimates.

Yours,
Laurenz Albe

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Alvaro Herrera 2024-02-27 13:11:42 Re: Optimizing count(), but Explain estimates wildly off
Previous Message Vitalii Tymchyshyn 2024-02-27 04:36:13 Re: Optimizing count(), but Explain estimates wildly off