RE: recovery snapshot waiting for non-overflowed snapshot or until oldest active xid on standby is at least 4739126 (now 1422751)

From: "Li EF Zhang" <bjzhangl(at)cn(dot)ibm(dot)com>
To: magnus(at)hagander(dot)net
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: RE: recovery snapshot waiting for non-overflowed snapshot or until oldest active xid on standby is at least 4739126 (now 1422751)
Date: 2021-01-04 10:17:11
Message-ID: OF8EDAD2FB.237B76FC-ON00258653.00382FBA-00258653.00388196@notes.na.collabserv.com
Lists: pgsql-admin

<div class="socmaildefaultfont" dir="ltr" style="font-family:Arial, Helvetica, sans-serif;font-size:10pt" ><div dir="ltr" >Thanks Magnus!</div>
<div dir="ltr" >Is there an SQL query or some other method to find the running transactions that have more than 64 subtransactions?</div>
<div dir="ltr" >&nbsp;</div>
<blockquote data-history-content-modified="1" data-history-expanded="1" dir="ltr" style="border-left:solid #aaaaaa 2px; margin-left:5px; padding-left:5px; direction:ltr; margin-right:0px" >----- Original message -----<br>From: Magnus Hagander &lt;magnus(at)hagander(dot)net&gt;<br>To: Li EF Zhang &lt;bjzhangl(at)cn(dot)ibm(dot)com&gt;<br>Cc: pgsql-admin(at)lists(dot)postgresql(dot)org<br>Subject: [EXTERNAL] Re: recovery snapshot waiting for non-overflowed snapshot or until oldest active xid on standby is at least 4739126 (now 1422751)<br>Date: Sat, Jan 2, 2021 7:50 PM<br>&nbsp;
<div><font size="2" face="Default Monospace,Courier New,Courier,monospace" >On Sat, Jan 2, 2021 at 12:38 PM Li EF Zhang &lt;bjzhangl(at)cn(dot)ibm(dot)com&gt; wrote:<br>&gt;<br>&gt; We have a PostgreSQL database cluster with 3 nodes: one primary and two secondaries. The cluster is managed by Patroni. Each database runs in its own container. When we updated the image of one secondary container, the database in that container could not start. It reports "database system is starting up". We tried deleting the container and recovering it from the primary, but the same error was reported.<br>&gt;<br>&gt; I checked the db log and found this message:<br>&gt; DEBUG: &nbsp;recovery snapshot waiting for non-overflowed snapshot or until oldest active xid on standby is at least 4739126 (now 1422751)#0122020-12-15 07:40:08.513 UTC [146-5182] CONTEXT: &nbsp;WAL redo at 3/ED11B020 for Standby/RUNNING_XACTS: nextXid 4739482 latestCompletedXid 4739475 oldestRunningXid 1422751; 16 xacts: 2716862 2721890 4665244 2495592 2289138 2288820 2287653 1422751 4280517 2288510 2287620 3297674 1757103 4739326 3320989 2259670; subxid ovf<br>&gt;<br>&gt;<br>&gt; It seems the snapshot overflowed, which prevents the secondary from starting up. I am new to PostgreSQL and do not understand how this happened or how to fix it. Thanks!<br><br>AFAICT it looks like you have a very old running transaction on the<br>system (transaction id 1422751 is 3 million transactions ago, and<br>there are other transactions also very old). &nbsp;As mentioned on<br><a href="https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-ADMIN" target="_blank">https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-ADMIN</a>&nbsp;,<br>having very long running transactions in combination with many<br>subtransactions can cause it to take a long time for the system to<br>reach a consistent state. 
See also the caveat list at the bottom of<br>that page.<br><br>If those transactions are such that they're needed, you have no other<br>choice but to wait, I believe. But if (the more likely case, I'd say) they<br>represent hung or otherwise broken clients, then terminating those<br>sessions on the primary should help the process along.<br><br>--<br>&nbsp;Magnus Hagander<br>&nbsp;Me: <a href="https://www.hagander.net/" target="_blank">https://www.hagander.net/</a>&nbsp;<br>&nbsp;Work: <a href="https://www.redpill-linpro.com/" target="_blank">https://www.redpill-linpro.com/</a>&nbsp;</font><br>&nbsp;</div></blockquote>
<div dir="ltr" >&nbsp;</div></div><BR>

