Unaccounted regression from postgresql 11 in later versions

From: Bob Jolliffe <bobjolliffe(at)gmail(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Unaccounted regression from postgresql 11 in later versions
Date: 2023-05-31 08:43:42
Message-ID: CACd=f9chV1E6Z_oi+4UMLs3g6967jx83QFL73V1R9A4K1PS34Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello

We have an application (https://dhis2.org) which has been using postgresql
as a backend for the past 15 years or so. Gradually moving through pg
versions 8,9,10 etc as the years went by. At the moment a large number of
our implementations are using versions 13, 14 and 15. Unfortunately we
have recently discovered that, despite most operations performing
considerably better on later versions, there is a particular type of query
that is very much slower (up to 100x) than it was on postgresql 11. We
have seen this regression in 13, 14 and 15. Unfortunately I dont have
stats on version 12 yet.

The query is not beautifully crafted. It is automatically generated from a
custom expression language. We know that it can probably be improved, but
at the moment we would really like to know if there is anything we can
configure with the SQL as-is to get performance like we had back on pg11.

The example below is a typical such query. I've attached below that, links
to the results of EXPLAIN (ANALYZE, BUFFERS). for pg11 and pg15 on the same
physical environment loaded with the same database. I would appreciate
some help trying to understand what we are seeing with the EXPLAIN output
and whether there is anything to be done.

EXPLAIN ANALYZE
select
count(pi) as value,
'2022W21' as Weekly
from
analytics_enrollment_gr3uwzvzpqt as ax
where
cast(
(
select
"IEMtgZapP2s"
from
analytics_event_gr3uWZVzPQT
where
analytics_event_gr3uWZVzPQT.pi = ax.pi
and "IEMtgZapP2s" is not null
and ps = 'XO45JBGJcXJ'
order by
executiondate desc
limit
1
) as date
) < cast('2022-05-30' as date)
and cast(
(
select
"IEMtgZapP2s"
from
analytics_event_gr3uWZVzPQT
where
analytics_event_gr3uWZVzPQT.pi = ax.pi
and "IEMtgZapP2s" is not null
and ps = 'XO45JBGJcXJ'
order by
executiondate desc
limit
1
) as date
) >= cast('2022-05-23' as date)
and (uidlevel1 = 'Plmg8ikyfrK')
and (
coalesce(
(
select
"QEbYS2QOXLf"
from
analytics_event_gr3uWZVzPQT
where
analytics_event_gr3uWZVzPQT.pi = ax.pi
and "QEbYS2QOXLf" is not null
and ps = 'XO45JBGJcXJ'
order by
executiondate desc
limit
1
):: text,
''
) = 'FIN_CASE_CLASS_CONFIRM_LAB'
)
limit 1;
-- limit 200001;

The EXPLAIN result for postgresql 11 is here:
https://explain.depesz.com/s/3QfC

The same query on postgresql 15 is here:
https://explain.depesz.com/s/BzpA#html

Whereas the first example takes 23s, the pg15 one takes 243s (this time
difference is even more stark when you remove BUFFERS from the explain).
During execution the pg15 query consumes 100% of a CPU core throughout
indicating it is probably cpu bound rather than IO.

The plan selected in both cases seems to be exactly the same. But pg15
seems to make a lot of work of the final aggregation step. Anecdotally I
understand that the same difference is there with pg13 and 14. The only
significant factor I could think of relating to new behaviour in pg13 is
the new hash_mem_multiplier configuration and it its relation to work_mem
availbale for hash tables. I have attempted to turn up both
hash_mem_multilier and work_mem to ridiculous values and I see no change
whatsoever on pg15.

I also removed the LIMIT and tested again with no significant difference:
https://explain.depesz.com/s/K9Lq

Does anyone have a theory of why pg15 should behave so differently to pg11
here? Better still, any suggestions for configuration that might make pg15
behave more like pg10. I am really dreading the prospect of stepping our
many live implementations back to pg11 :-(.

Regards
Bob

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Michael Christofides 2023-05-31 10:11:29 Re: Unaccounted regression from postgresql 11 in later versions
Previous Message Merlin Moncure 2023-05-30 20:56:36 Re: Fsync IO issue