RE: Performance issue after migration from 9.4 to 15

From: Christian Schröder <Christian(dot)Schroeder(at)solvians(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: RE: Performance issue after migration from 9.4 to 15
Date: 2023-04-11 14:41:05
Message-ID: FR2P281MB3341D33FC0BC62B53484614C8A9A9@FR2P281MB3341.DEUP281.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi all,
Thanks for all your tips!
I have meanwhile run "ANALYZE" on all tables. Either that was as important as everybody tells me 😉 or it is just a coincidence (e.g., fewer other queries on the database). In any case, the performance has indeed improved considerably. I will observe the situation tomorrow in the morning, when there is typically more load on the system.
I also noticed that, after analyzing, the sequential scan node shows fewer columns in the "output" section (https://explain.dalibo.com/plan/b8bfa5a3d2dc33bc#plan/node/18) I am not sure if this has an impact on the performance of this particular operation.
I will also follow the suggestion and increase the number of parallel workers. The new machine has enough cores, so this should be possible.

Best,
Christian

-----Original Message-----
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Tuesday, April 11, 2023 4:09 PM
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Christian Schröder <Christian(dot)Schroeder(at)solvians(dot)com>; pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Performance issue after migration from 9.4 to 15

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> On Tue, 11 Apr 2023 at 23:03, Christian Schröder
> <Christian(dot)Schroeder(at)solvians(dot)com> wrote:
>> Particularly interesting are the sequential scans. In the old plan, we have node #21, which took 32 seconds. Almost all of the time goes into actual I/O. In the new plan, the corresponding node is #34. It took 55 seconds, but less than 1 second was actual I/O (because most of the data was already in memory). Why did this step still take about twice the time?

> Perhaps your 15 server is under more load than 9.4 due to all
> concurrent plans being slower from bad statistics? Load averages might
> be a good indicator. (I assume the server is busy due to the "Workers
> Launched": 0)

I think the extra time is due to useless overhead from trying and failing to parallelize: the leader has to do all the work, but there's probably overhead added anyway. 9.4 of course knew nothing of parallelism.

My guess is that the OP is trying to run with a large number of backends and has not raised the max number of parallel workers to match. It does look like the stats might need updating (since 9.4's rowcount estimate is OK and 15's less so) but that is not why we see "Workers
planned: 2, Workers launched: 0". Either provision enough parallel workers to fix that, or disable parallelism.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Evgeny Morozov 2023-04-11 16:44:54 Re: "PANIC: could not open critical system index 2662" - twice
Previous Message Tom Lane 2023-04-11 14:08:55 Re: Performance issue after migration from 9.4 to 15