Re: Max connections reached without max connections reached

From: Rob Sargent <robjsargent(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Max connections reached without max connections reached
Date: 2021-12-01 22:52:01
Message-ID: 0fba028d-68da-349c-0996-e83f707824b7@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 12/1/21 3:29 PM, James Sewell wrote:
>
> Looking again into the back trace[1], it appeared that the backend is
> getting stuck while getting the subtransaction's parent information,
> and that can only happen if the snapshot is getting marked as
> overflowed.  So it seems that some of the scripts are using a lot of
> sub-transaction (> 64) within a transaction and in such case, we can
> not cache the sub-transaction information and the snapshot will be
> marked overflowed.  After that, for checking the visibility of any
> tuple we need to consult the SLRU to get TopMost transaction
> information and if the subtransaction SLRU cache is full then it will
> have to do the I/O.  So in this case it appeared that a lot of
> parallel scans/new connections are trying to read the pg_subtrans and
> due to limited cache size that is not even able to load the data in
> the cache and getting frequently blocked.  So I assume that when you
> say the startup is getting stuck forever, how long did you wait, maybe
> it is moving but getting stuck again for the next tuple, so to confirm
> that attach the process with gdb put breakpoint in XidInMVCCSnapshot()
> and check whether it is hitting that breakpoint again or not.  Also,
> check whether you are seeing "SubtransControlLock" in
> pg_stat_activity.  There is one nice blog[2] explaining this issue so
> you can refer to that as well.
>
>
> Ah. Snap. This makes a lot of sense. The developers have a nasty habit
> of using LOOP over rows which has never been stamped out. I bet they
> have applied this to something with a lot of rows.
>
> Is there a way of seeing which statements / backends / processes have
> greater than 64 stxns? I'd like to monitor this.
>
> - James
>
> ------------------------------------------------------------------------
> The contents of this email are confidential and may be subject to
> legal or professional privilege and copyright. No representation is
> made that this email is free of viruses or other defects. If you have
> received this communication in error, you may not copy or distribute
> any part of it or otherwise disclose its contents to anyone. Please
> advise the sender of your incorrect receipt of this correspondence.
Also be on the look-out for begin trans; begin trans; etc  I read
Kumar's report as nested transactions.  If your gang is doing a
transaction per row, they need a kick in the rear.  Anyone not closing
those needs a pink slip.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-12-02 00:28:48 Re: ssl_crl_file Certificate Revocation List doesn't work for postgresql 11
Previous Message James Sewell 2021-12-01 22:29:20 Re: Max connections reached without max connections reached