From: | Paul Friedman <paul(dot)friedman(at)streetlightdata(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | RE: LWLocks by LockManager slowing large DB |
Date: | 2021-04-12 22:15:05 |
Message-ID: | fe22f61a175c68f2a52ae0f63eab57d5@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Thanks for the quick reply!
These queries take ~1hr and are the only thing running on the system (all
60 are launched at the same time and the tables/files are fully-primed into
memory so iowaits are basically zero).
Yes, that’s the same query I’ve been running to analyze the locks and this
is the problem:
SELECT state, backend_type, wait_event_type, wait_event, count(*) FROM
pg_stat_activity GROUP BY 1, 2, 3, 4 ORDER BY count(*) DESC;
State backend_type wait_event_type wait_event count
active client backend LWLock LockManager 28
active client backend 21
autovacuum launcher Activity AutoVacuumMain 1
logical replication launcher Activity
LogicalLauncherMain 1
checkpointer Activity CheckpointerMain 1
idle client backend Client ClientRead 1
background writer Activity BgWriterMain 1
walwriter Activity WalWriterMain 1
Thanks again for any advice you have.
---Paul
Paul Friedman
CTO
677 Harrison St | San Francisco, CA 94107
M: (650) 270-7676
E-mail: paul(dot)friedman(at)streetlightdata(dot)com
-----Original Message-----
From: Andres Freund <andres(at)anarazel(dot)de>
Sent: Monday, April 12, 2021 2:58 PM
To: Paul Friedman <paul(dot)friedman(at)streetlightdata(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: LWLocks by LockManager slowing large DB
Hi,
On 2021-04-12 12:37:42 -0700, Paul Friedman wrote:
> Boiling the complex queries down to their simplest form, we test
> running 60 of this query simultaneously:
How long does one execution of these queries take (on average)? The likely
bottlenecks are very different between running 60 concurrent queries that
each complete in 0.1ms and ones that take > 1s.
Could you show the results for a query like SELECT state, backend_type,
wait_event_type, wait_event, count(*) FROM pg_stat_activity GROUP BY 1, 2,
3, 4 ORDER BY count(*) DESC; ?
Without knowing the proportion of LockManager wait events compared to the
rest it's hard to know what to make of it.
> Does anyone have any advice on how to alleviate LockManager’s LWlock
issue?
It'd be useful to get a perf profile for this. Both for cpu time and for
what ends up blocking on kernel-level locks. E.g. something like
# cpu time
perf record --call-graph dwarf -F 500 -a sleep 5 perf report --no-children
--sort comm,symbol
To send it to the list you can use something like perf report --no-children
--sort comm,symbol|head -n 500 > somefile
# kernel level blocking on locks
perf record --call-graph dwarf -e syscalls:sys_enter_futex -a sleep 3 perf
report --no-children --sort comm,symbol
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2021-04-12 22:22:21 | Re: LWLocks by LockManager slowing large DB |
Previous Message | Andres Freund | 2021-04-12 21:57:38 | Re: LWLocks by LockManager slowing large DB |