Re: scalability bottlenecks with (many) partitions (and more)

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: scalability bottlenecks with (many) partitions (and more)
Date: 2024-09-05 17:33:42
Message-ID: 7e809984-0544-4518-ab00-ca2f0f91ab58@vondra.me
Lists: pgsql-hackers

On 9/4/24 13:15, Tomas Vondra wrote:
> On 9/4/24 11:29, Jakub Wartak wrote:
>> Hi Tomas!
>>
>> ...
>>
>> My $0.02 cents: the originating case that triggered those patches,
>> actually started with LWLock/lock_manager waits being the top#1. The
>> operator can cross check (join) that with a group by pg_locks.fastpath
>> (='f'), count(*). So, IMHO we have good observability in this case
>> (rare thing to say!)
>>
>
> That's a good point. So if you had to give some instructions to users
> what to measure / monitor, and how to adjust the GUC based on that, what
> would your instructions be?
>

After thinking about this a bit more, I'm actually wondering if this
source of information is sufficient. Firstly, it's just a snapshot of a
single point in time, and it's not quite trivial to get a summary for a
longer time period (people would have to sample it often enough, etc.).
Doable, but much less convenient than cumulative counters.

But for the sampling, doesn't this produce skewed data? Imagine you have
a workload with very short queries (which is when fast-path matters), so
you're likely to see the backend while it's obtaining the locks. If the
fast-path locks are much faster to acquire (kinda the whole point), we're
more likely to see the backend while it's obtaining the regular locks.

Let's say the backend needs 1000 locks, and 500 of those fit into the
fast-path array. We're likely to see the 500 fast-path locks already
acquired, and a random fraction of the 500 non-fast-path locks. So in
the end you'll see backends that appear to need 500 fast-path locks and,
on average, 250 regular locks. That doesn't seem terrible, but I guess
the difference can be made even larger.
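
To put rough numbers on the skew, here is a minimal sketch in Python. It
is only a toy model, not anything measured: it assumes locks are acquired
one after another, that a sample lands at a uniformly random instant
during the acquisition phase, and that a regular lock costs 100x as much
as a fast-path lock. The function name and both cost values are
hypothetical, picked for illustration.

```python
def expected_observed(n_fast, n_regular, t_fast, t_regular):
    """Expected (fast-path, regular) lock counts seen by one sample.

    Toy model: the backend first takes n_fast fast-path locks (t_fast
    each), then n_regular regular locks (t_regular each). The sample
    lands at a uniformly random instant within that acquisition window.
    """
    time_fast = n_fast * t_fast
    time_regular = n_regular * t_regular
    total = time_fast + time_regular

    # Probability that the sample lands in each phase.
    p_fast_phase = time_fast / total
    p_regular_phase = time_regular / total

    # Within a phase, the number of locks already held is uniform over
    # 0 .. n-1, so its mean is (n - 1) / 2.
    fast_seen = (p_fast_phase * (n_fast - 1) / 2
                 + p_regular_phase * n_fast)
    regular_seen = p_regular_phase * (n_regular - 1) / 2
    return fast_seen, regular_seen


if __name__ == "__main__":
    # Hypothetical costs: a regular lock is 100x slower to acquire.
    fast, regular = expected_observed(500, 500, t_fast=1.0, t_regular=100.0)
    print(f"fast-path seen: {fast:.1f}, regular seen: {regular:.1f}")
```

Under these assumptions a sample sees nearly all 500 fast-path locks but
only about half of the regular ones, which is the 500 vs. 250 picture
above. With equal costs the two counts would be under-reported much more
evenly, so the skew comes from the cost ratio.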

regards

--
Tomas Vondra
