Extremely slow to establish connection when user has a high number of roles

From: Michal Charemza <michal(at)charemza(dot)name>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Extremely slow to establish connection when user has a high number of roles
Date: 2024-04-20 11:55:12
Message-ID: CAJe2WWhxzbt_uszVYnyfw4Y+OdeHXdue0GD+Od5uk15XD_FL5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

We're running PostgreSQL as essentially a data warehouse, and we have a few
thousand roles, which are used to grant permissions on a table-by-table
basis to a few thousand users, so a user would typically have say between 1
and 2 thousand roles. There is also quite a lot of "churn" in terms of
tables being created/removed, and permissions changed.

The issue is that we're hitting a strange performance problem on
connection. Sometimes it can take ~25 to 40 seconds just to connect,
although it's often way quicker. There seems to be no middle ground - never
have I seen a connection take between 0.5 and 25 seconds for example. We
suspect it's related to the number of roles the connecting user has
(including via other roles), because if we remove all roles but one from
the connecting user (the one that grants connection permissions),
connecting is always virtually instantaneous.

The closest issue that I can find that's similar is
https://www.postgresql.org/message-id/flat/CAGvXd3OSMbJQwOSc-Tq-Ro1CAz%3DvggErdSG7pv2s6vmmTOLJSg%40mail.gmail.com,
which reports that GRANT role is slow with a high number of roles - but in
our case, it's connecting that's the problem, before (as far as we can
tell) even one query is run. The database is busy, say up to 60-80% on a 16
VCPU machine - even if it's a "good amount" below 100%, the issue occurs.

Is there anything we can do to investigate (or hopefully fix!) the issue?

Thanks,

Michal

------

A description of what you are trying to achieve and what results you
expect.:
We would like to connect to the database - expect it to connect in less
than 1 second, but sometimes 25 - 40s.

PostgreSQL version number you are running:
PostgreSQL 14.10 on aarch64-unknown-linux-gnu, compiled by
aarch64-unknown-linux-gnu-gcc (GCC) 9.5.0, 64-bit

How you installed PostgreSQL:
Via AWS/Amazon Aurora

Changes made to the settings in the postgresql.conf file
In attached CSV file

Operating system and version:
Unknown

What program you're using to connect to PostgreSQL:
Python + SQLAlchemy, psql, or also via Amazon Quicksight (Unsure which
client they use under the hood, but it surfaces connection timeout errors,
which we suspect is due to the issue described above)

Is there anything relevant or unusual in the PostgreSQL server logs?:
No

For questions about any kind of error:
N/A

Attachment Content-Type Size
server_configuration.csv text/csv 6.0 KB

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tomas Vondra 2024-04-20 12:57:20 Re: Extremely slow to establish connection when user has a high number of roles
Previous Message Dimitrios Apostolou 2024-04-12 11:12:55 Re: Performance implications of 8K pread()s