Re: Poor Performance running Django unit tests after upgrading from 10.6

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Roger Hunwicks <roger(at)tonic-solutions(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Poor Performance running Django unit tests after upgrading from 10.6
Date: 2020-10-15 06:56:47
Message-ID: f23b090fd45cfc965010e025ae5cf2b8778cf4cc.camel@cybertec.at
Lists: pgsql-performance

On Thu, 2020-10-15 at 01:21 -0400, Roger Hunwicks wrote:
> We have a large Django application running against a Postgresql database.
>
> When we run the test suite using Postgresql 10.7 in a Docker container
> we consistently get:
>
> Ran 1166 tests in 1291.855s
>
> When we first started running the same test suite against Postgresql
> 12.4 we got:
>
> Ran 1166 tests in 8502.030s
>
> I think that this reduction in performance is caused by the lack of
> accurate statistics [...]
>
> We have since managed to get the performance of the test run using
> 12.4 back to approximately the normal range by customizing the
> Postgresql parameters. `seq_page_cost=0.1` and `random_page_cost=0.11`
> seem to be key, but we are also setting `shared_buffers`, etc. and all
> the other typical parameters. With Postgresql 10.7 we weren't setting
> anything and performance was fine using just the defaults, given the
> tiny data volumes.
>
> However, even though we have similar performance for 12.4 for most
> test runs, it remains very variable. About 30% of the time we get
> something like: [...]
>
> I think we have narrowed down the problem to a single, very complex,
> materialized view using CTEs; the unit tests create the test data and
> then refresh the materialized view before executing the actual test
> code.
>
> Database logging using auto_explain shows things like:
>
> db_1 | 2020-10-14 10:27:59.692 UTC [255] LOG: duration:
> 4134.625 ms plan:
> db_1 | Query Text: REFRESH MATERIALIZED VIEW
> price_marketpricefacts_materialized
> db_1 | Merge Join
> (cost=14141048331504.30..9635143213364288.00 rows=116618175994107184
> width=3302) (actual time=4134.245..4134.403 rows=36 loops=1)
>
> For comparison, the equivalent query on 10.7 has:
>
> db_1 | 2020-10-15 03:28:58.382 UTC [163] LOG: duration:
> 10.500 ms plan:
> db_1 | Query Text: REFRESH MATERIALIZED VIEW
> price_marketpricefacts_materialized
> db_1 | Hash Left Join (cost=467650.55..508612.80
> rows=199494 width=3302) (actual time=10.281..10.341 rows=40 loops=1)
>
> I can get performance almost identical to 10.7 by altering the unit
> tests so that in each test that refreshes the materialized view prior
> to executing the query, we execute `ANALYZE;` prior to refreshing the
> view.
>
> Is it worth us trying to debug the plan for situations with low row
> counts and poor statistics? Or is this use case not really covered:
> the general advice is obviously to make sure that statistics are up to
> date before troubleshooting performance problems. On the other hand,
> it is not easy for us to make sure that we run analyze inside the
> transaction in each unit test; it also seems a bit wasteful.
>
> Opinions and advice gratefully received.
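
The planner cost settings described in the quoted message could be applied for a test session like this (a sketch using the values quoted above; pick the mechanism that fits your setup):

```sql
-- Per-session planner cost settings matching those quoted above:
SET seq_page_cost = 0.1;
SET random_page_cost = 0.11;

-- shared_buffers cannot be changed per session; it must be set in
-- postgresql.conf (or via ALTER SYSTEM) and requires a server restart.
```

In a Docker-based test setup, the same values can instead be passed on the `postgres` command line with `-c seq_page_cost=0.1 -c random_page_cost=0.11` so they apply to every connection.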

Yes, the query plan for the query that defines the materialized view
is the interesting data point. Run an EXPLAIN (ANALYZE, BUFFERS) on
that query.
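
For a materialized view, the defining query can be retrieved from the catalog and then instrumented separately; a sketch, using the view name from the log excerpts above:

```sql
-- Retrieve the SELECT that defines the materialized view:
SELECT pg_get_viewdef('price_marketpricefacts_materialized'::regclass);

-- Then run that SELECT under full instrumentation:
EXPLAIN (ANALYZE, BUFFERS)
SELECT ... ;  -- paste the defining query here
```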

If your statistics are off because the data were imported only a moment
ago, run an explicit ANALYZE on the affected tables after the import.
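
For example (the table names here are placeholders for whichever tables the view's query reads):

```sql
-- After loading the test fixtures, refresh planner statistics:
ANALYZE price_marketprice;   -- per-table form, cheapest
ANALYZE;                     -- or analyze every table in the database
```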

If your statistics are off because they are not calculated often enough,
consider lowering "autovacuum_analyze_scale_factor".
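
The threshold can be lowered per table or globally; a sketch with a placeholder table name:

```sql
-- Trigger autoanalyze after 1% of a table's rows change,
-- instead of the default 10%:
ALTER TABLE price_marketprice
    SET (autovacuum_analyze_scale_factor = 0.01);

-- Or globally, followed by a configuration reload:
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.01;
SELECT pg_reload_conf();
```

Note that autoanalyze runs asynchronously, so for test data created and queried inside a single transaction an explicit ANALYZE is still the more reliable option.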

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
