Re: Optimizing count(), but Explain estimates wildly off

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Chema <chema(at)interneta(dot)org>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Optimizing count(), but Explain estimates wildly off
Date: 2024-02-27 13:11:42
Message-ID: 202402271311.qd7z46bod64u@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Chema,

On 2024-Feb-26, Chema wrote:

> Dear pgsqlers,
>
> I'm trying to optimize simple queries on two tables (tenders & items) with
> a couple million records. Besides the resulting records, the app also
> displays the count of total results. Doing count() takes as much time as
> the other query (which can be 30+ secs), so it's an obvious target for
> optimization. I'm already caching count() results for the most common
> conditions (country & year) in a material table, which practically halves
> response time. The tables are updated sparingly, and only with bulk
> COPYs. Now I'm looking for ways to optimize queries with other conditions.

It sounds like this approach might serve your purposes:
https://www.postgresql.eu/events/pgconfeu2023/schedule/session/4762-counting-things-at-the-speed-of-light-with-roaring-bitmaps/

> I already raised default_statistics_target up to 2k (the planner wasn't
> using indexes at all with low values). Gotta get it even higher? These are
> my custom settings:

I would recommend to put the default_statistics_target back to its
original value and modify the value with ALTER TABLE .. SET STATISTICS
only for columns that need it, only on tables that need it; then ANALYZE
everything. The planner gets too slow if you have too many stats for
everything.

> shared_buffers = 256MB # min 128kB

This sounds far too low, unless your server is a Raspberry Pi or
something. See "explain (buffers, analyze)" of your queries to see how
much buffer traffic is happening for them.

> Functions: 33
> Options: Inlining true, Optimization true, Expressions true, Deforming true
> Timing: Generation 14.675 ms, Inlining 383.349 ms, Optimization 1023.521
> ms, Emission 651.442 ms, Total 2072.987 ms
> Execution Time: 63378.033 ms

Also maybe experiment with turning JIT off. Sometimes it brings no
benefit and slows down execution pointlessly. Here you spent two
seconds JIT-compiling the query; were they worth it?

Cheers

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Syntax error: function hell() needs an argument.
Please choose what hell you want to involve.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message James Pang 2024-02-27 13:54:48 extend statistics help reduce index scan a lot of shared buffer hits.
Previous Message Laurenz Albe 2024-02-27 07:40:43 Re: Optimizing count(), but Explain estimates wildly off