Re: weird issue with occasional stuck queries

From: Adam Scott <adam(dot)c(dot)scott(at)gmail(dot)com>
To: spiral <spiral(at)spiral(dot)sh>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: weird issue with occasional stuck queries
Date: 2022-04-02 19:09:15
Message-ID: CA+s62-M3bCtos=ocwNoyo0s8rEx0Q_Pw+BiURNXf55hdCaq=PA@mail.gmail.com
Lists: pgsql-general

The logs were helpful. It is worth reading the log lines around each
error, since they may carry more detail, such as the SQL statement
associated with the error.

Deadlocks indicate that the client code needs to be examined. See
https://www.cybertec-postgresql.com/en/postgresql-understanding-deadlocks/
for background on deadlocks. They slow things down and can cause SQL
statements to queue up, eventually bogging down the system.
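
As a rough illustration of the usual client-side fix (my sketch, not taken
from your code): deadlocks typically arise when two transactions touch the
same rows in opposite order, so applying updates in one consistent order
removes that cycle for those rows. The `accounts` table, its columns, and
the `ordered_updates` helper are all hypothetical.

```python
def ordered_updates(changes):
    """Return (sql, params) pairs sorted by row id.

    `changes` maps a row id to its new value. Sorting by id means every
    transaction built this way takes its row locks in the same order.
    """
    # Hypothetical table and columns, for illustration only.
    sql = "UPDATE accounts SET balance = %s WHERE id = %s"
    return [(sql, (value, row_id)) for row_id, value in sorted(changes.items())]
```

Each pair can then be passed to the database driver's execute call inside a
single transaction.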

It definitely looks like a locking issue, which is why you don't see high
CPU. IIRC you might instead see high system CPU usage, as opposed to
userspace CPU, if the kernel is getting overloaded. The `top` command will
help to show that.
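
If you would rather log the user/system split over time than watch `top`,
here is a minimal sketch that assumes a Linux host and reads the aggregate
`cpu` line of `/proc/stat`. The helper names `cpu_split` and
`read_cpu_line` are mine, not part of any tool.

```python
def cpu_split(before, after):
    """Return (user_pct, system_pct) between two 'cpu' lines of /proc/stat."""
    def ticks(line):
        # Skip the "cpu" label and keep the first eight counters:
        # user nice system idle iowait irq softirq steal.
        # The guest counters are left out because user already includes them.
        return [int(x) for x in line.split()[1:9]]

    delta = [b - a for a, b in zip(ticks(before), ticks(after))]
    total = sum(delta)
    if total <= 0:
        return 0.0, 0.0
    return 100.0 * delta[0] / total, 100.0 * delta[2] / total


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line, which is the first line of the file."""
    with open(path) as f:
        return f.readline()
```

Sampling `read_cpu_line()` twice a few seconds apart and passing both lines
to `cpu_split` gives roughly the `us` and `sy` figures that `top` reports.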

The disks could be saturated by write-ahead log (WAL) activity from all
the transactions. More about WAL here:
https://www.postgresql.org/docs/10/wal-internals.html
You could consider moving the WAL directory to a different disk using a
symbolic link (see the link above).
Anyway, these are the things I would look at.

Adam

On Sat, Apr 2, 2022 at 5:23 AM spiral <spiral(at)spiral(dot)sh> wrote:

> Hey,
>
> > That wait event according to documentation is "Waiting to access the
> > multixact member SLRU cache." SLRU = simple least-recently-used
> > cache
>
> I see, thanks!
>
> > if you are low on memory, it can slow down the allocation of
> > buffers. Do you have a query that is a "select for update" running
> > somewhere? If your disk is low on space `df -h` that might explain
> > the issue.
>
> - There aren't any queries that are running for longer than the selects
> shown earlier; definitely not "select for update" since I don't ever
> use that in my code.
> - Both disk and RAM utilization is relatively low.
>
> > Is there an ERROR: multixact something in your postgres log?
>
> There isn't, but while checking I saw some other concerning errors
> including "deadlock detected", "could not map dynamic shared memory
> segment" and "could not attach to dynamic shared area".
> (full logs here:
> https://paste.sr.ht/blob/9ced99b119c3fce1ecfd71e8554946e7845a44dd )
>
> > Another thing to look at is `iostat -x -y` and look at disk util %.
> > This is an indicator, but not definitive, of how much disk access is
> > going on. It may be your drives are just saturated although your
> > IOWait looks ok in your attachment.
>
> I didn't specifically look at that, but I did notice *very* high disk
> utilization in at least one instance of the stuck queries, as I
> mentioned previously. Why would the disks be getting saturated? The
> query count isn't noticeably higher than average, and the database
> is not autovacuuming, so not sure what could cause that.
>
> spiral
>
