Re: performance of analytical query

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Jiří Fejfar <jurafejfar(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: performance of analytical query
Date: 2021-11-23 19:42:35
Message-ID: 20211123194234.GV17618@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Nov 12, 2021 at 09:12:38PM +0100, Jiří Fejfar wrote:
> * I know that PG is focused on OLTP rather then analytics, but we are happy
> with it at all and do not wish to use another engine for analytical
> queries... isn't somewhere some "PG analytical best practice" available?

It's a good question. Here's some ideas:

I don't think we know what version you're using - that's important, and there's
other ideas here:
https://wiki.postgresql.org/wiki/Slow_Query_Questions

You said that your query was slow "probably after VACUUM ANALYZE".
Is it really faster without stats ? You can do this to see if there was really
a better plan "before":
| begin; DELETE FROM pg_statistic WHERE starelid='thetable'::regclass; explain analyze ...; rollback;

Try enable_nestloop=off for analytic queries;

Test whether jit=off helps you or hurts you (you said that it's already disabled);

You can do other things that can improve estimates, by sacrificing planning time
(which for an analytic query is a small component of the total query time, and
pays off at runtime if you can get a btter plan):
- FKs can help with estimates since pg9.6;
- CREATE STATISTICS;
- ALTER SET STATISTICS or increase default_statistics_target;
- increase from_collapse_limit and join_collapse_limit. But I don't think it
will help your current query plan.
- partitioning data increases planning time, and (if done well) can allow
improved execution plans;

You can REINDEX or maybe CLUSTER during "off hours" to optimize indexes/tables.

BRIN indexes (WITH autoanalyze) are very successful for us, here.

You can monitor your slow queries using auto_explain and/or pg_stat_statements.

You can reduce autovacuum_analyze_threshold to analyze more often.

I'd be interested to hear if others have more suggestions.

--
Justin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message aditya desai 2021-11-24 05:27:37 Out of memory error
Previous Message Robert Creager 2021-11-19 18:47:45 Re: Need help identifying a periodic performance issue.