Re: weird issue with occasional stuck queries

From: Adam Scott <adam(dot)c(dot)scott(at)gmail(dot)com>
To: spiral <spiral(at)spiral(dot)sh>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: weird issue with occasional stuck queries
Date: 2022-04-01 22:53:40
Message-ID: CA+s62-MExP8HqTsdddbsSXNLHBRD0ABR81fJQ8zJnDMeQyVMug@mail.gmail.com
Lists: pgsql-general

If you get a chance, the `top` output might be useful as well. For
instance, if you are low on memory, buffer allocation can slow down.
Another thing to check is `iostat -x -y` and its %util column. It is an
indicator, though not a definitive one, of how busy the disks are. Your
drives may simply be saturated, although the IOWait in your attachment
looks OK.

According to the documentation, that wait event means "Waiting to access
the multixact member SLRU cache." (SLRU = simple least-recently-used cache.)

Do you have a `SELECT ... FOR UPDATE` query running somewhere?
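
One way to look for such a query is to ask `pg_stat_activity` while the stall is happening. The sketch below is only an illustration: the function name `multixact_activity_sql` is made up, the `MultiXact%` pattern assumes the wait event names used by PostgreSQL 13 and later, and the `query` column only shows each session's current or most recent statement, so a row lock taken earlier in a long transaction will not show up.

```shell
# Hypothetical helper: prints a query listing sessions that are waiting on a
# MultiXact* wait event or whose current statement uses a row-locking clause.
multixact_activity_sql() {
    cat <<'SQL'
SELECT pid, state, wait_event_type, wait_event,
       now() - xact_start AS xact_age,
       left(query, 100) AS query
FROM pg_stat_activity
WHERE wait_event LIKE 'MultiXact%'
   OR query ~* 'for\s+(no\s+key\s+)?(update|share|key\s+share)'
ORDER BY xact_start;
SQL
}

# Usage (database name is a placeholder):
#   multixact_activity_sql | psql -X -d yourdb
```

Long-running transactions at the top of that list would be the first thing to look at.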

If your disk is low on space (check with `df -h`), that might explain the issue.
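
If you would rather not read the `df` output by eye, a small filter can flag the nearly full filesystems. This is a minimal sketch: the function name `low_space` and the default threshold of 90% are arbitrary choices, and it assumes the six-column POSIX layout of `df -P` with no spaces in device names or mount points.

```shell
# Hypothetical helper: reads `df -P` output on stdin and prints the mount point
# and usage of every filesystem at or above the given percentage (default 90).
low_space() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5                  # Capacity column, e.g. "95%"
            sub(/%/, "", use)
            if (use + 0 >= t) print $6 " " $5
        }'
}

# Usage:
#   df -P | low_space 90
```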

Is there an `ERROR:` line mentioning multixact in your postgres log?
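
A quick way to check is to grep the server log. The sketch below assumes the default log format, where the severity is followed by a colon. The function name `multixact_errors` and the path in the usage line are placeholders, since the real location depends on your `log_directory` setting and packaging.

```shell
# Hypothetical helper: prints log lines at WARNING level or above that mention
# multixact (case-insensitive). Pass one or more log file paths.
multixact_errors() {
    grep -iE '(WARNING|ERROR|FATAL|PANIC):.*multixact' "$@"
}

# Usage (path is an assumption, adjust to your installation):
#   multixact_errors /var/log/postgresql/postgresql-14-main.log
```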

Adam

On Fri, Apr 1, 2022 at 6:28 AM spiral <spiral(at)spiral(dot)sh> wrote:

> Hey,
>
> I'm having a weird issue where a few times a day, any query that hits a
> specific index (specifically a `unique` column index) gets stuck for
> anywhere between 1 and 15 minutes on a LWLock (mostly
> MultiXactOffsetSLRU - not sure what that is, I couldn't find anything
> about it except for a pgsql-hackers list thread that I didn't really
> understand).
> Checking netdata history, these stuck queries coincide with massive
> disk read; we average ~2MiB/s disk read and it got to 40MiB/s earlier
> today.
>
> These queries used to get stuck for ~15 minutes at worst, but I turned
> down the query timeout. I assume the numbers above would be worse if I
> let the queries run for as long as they need, but I don't have any logs
> from before that change and I don't really want to try that again as it
> would impact production.
>
> I asked on the IRC a few days ago and got the suggestion to increase
> shared_buffers, but that doesn't seem to have helped at all. I also
> tried deleting and recreating the index, but that seems to have changed
> nothing as well.
>
> Any suggestions are appreciated since I'm really not sure how to debug
> this further. I'm also attaching a couple screenshots that might be
> useful.
>
> spiral
>
