Re: Error: server closed the connection unexpectedly

From: Sachin Kotwal <kotsachin(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Anil <anilkumar15(dot)mf(at)gmail(dot)com>, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Error: server closed the connection unexpectedly
Date: 2021-10-22 13:02:02
Message-ID: CA+N_YAdgsgeah1SPxH6hpPNm8v_0zRYytyMixBtmssbAFdmjTQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi All,

I am also facing a similar issue. In my case I am connecting to AWS Aurora
PostgreSQL.

When my psql session is idle for a while ( for 5 - 10 minutes ) , It gets
terminated and reconnect but query/transaction started by earlier session
remain in `idle in transaction` state and the only option left to clean it
is terminate that session with `pg_terminate_backend()`. Issue happens when
we are connecting from either psql or DBeaver client but not with the
application.

Here is example :
```
test=> begin;
BEGIN
test=*> LOCK TABLE test.dependency_queue IN EXCLUSIVE MODE;
LOCK TABLE
test=*>
test=*> rollback;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset:
Succeeded.
psql (13.4, server 11.9)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384,
bits: 256, compression: off)

test=> select now()-query_start, pid,state, usename, application_name,
client_addr, wait_event_type, query from pg_stat_activity where
state!='idle';
?column? | pid | state | usename |
application_name | client_addr | wait_event_type |
query

-----------------+-------+---------------------+----------+------------------------+--------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------
00:08:22.762573 | 20233 | active | postgres | PostgreSQL JDBC
Driver | 10.0.174.30 | Lock | LOCK TABLE test.dependency_queue
IN EXCLUSIVE MODE
00:09:28.101577 | 14830 | idle in transaction | postgres | psql
| 10.0.173.200 | Client | LOCK TABLE test.dependency_queue
IN EXCLUSIVE MODE;
00:00:00 | 10198 | active | postgres | psql
| 10.0.103.144 | | select now()-query_start,
pid,state, usename, application_name, client_addr, wait_event_type, query
from pg_stat_activity where state!='idle';
(3 rows)

test=> rollback;
WARNING: there is no transaction in progress
ROLLBACK
```

Nothing in the postgresql logs apart from below statements multiple times:
```
cat postgresql.log.2021-10-22-1200 | grep '10.0.173.200'
2021-10-22 12:00:09 UTC:10.0.173.200(4082):[unknown](at)[unknown]:[729]:LOG:
connection received: host=10.0.173.200 port=4082
2021-10-22 12:00:09 UTC:10.0.173.200(4082):[unknown](at)[unknown]:[729]:LOG:
incomplete startup packet
```

Now because the psql session was terminated incorrectly, another session
from the application trying to get explicit lock on the same table is
waiting for a long time, until we are not terminating the hanging session.

Below are timeout/keepalive settings on my PostgreSQL instance:
```
select name, setting, unit from pg_settings where name ilike '%timeout%';
name | setting | unit
-------------------------------------+----------+------
archive_timeout | 300 | s
authentication_timeout | 60 | s
checkpoint_timeout | 60 | s
deadlock_timeout | 1000 | ms
idle_in_transaction_session_timeout | 86400000 | ms
lock_timeout | 0 | ms
statement_timeout | 0 | ms
wal_receiver_timeout | 30000 | ms
wal_sender_timeout | 60000 | ms
(9 rows)

select name, setting, unit from pg_settings where name ilike '%keep%';
name | setting | unit
-------------------------+---------+------
tcp_keepalives_count | 9 |
tcp_keepalives_idle | 7200 | s
tcp_keepalives_interval | 75 | s
wal_keep_segments | 0 |
(4 rows)
```

macOS Catelina 10.15.7 timeout settings :
```
sysctl -A | grep keep

net.inet.tcp.keepidle: 7200000
net.inet.tcp.keepintvl: 75000
net.inet.tcp.keepinit: 75000
net.inet.tcp.keepcnt: 8
net.inet.tcp.always_keepalive: 0
net.inet.mptcp.keepalive: 840
net.link.ether.inet.keep_announcements: 1
net.key.natt_keepalive_interval: 20
net.inet6.ip6.keepfaith: 0
net.necp.pass_keepalives: 1
```

I can set ` lock_timeout` and ` idle_in_transaction_session_timeout` to
lower value so that the hanging session can get terminated automatically by
postgres but I would like to know root cause of the issue and fix it there
instead of adjusting postgresql timeout setting values.

Please let me know which setting on my client or server is not set properly
and because of that the issue is happening.

Regards,
Sachin

On Mon, Oct 11, 2021 at 1:38 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
wrote:

> On Mon, 2021-10-11 at 00:12 +0530, Anil wrote:
> > Please find the server log, if you can find something (due to some
> reasons not being able to get OS system log, will be shared soon).
>
> Nothing to see.
>
> Perhaps an ill-configured network component closes idle connection?
> In that case, try keepalive.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>
>
>

--

Thanks and Regards,
Sachin Kotwal

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ninad Shah 2021-10-22 15:12:32 Re: spannerdb migration to PostgreSQL
Previous Message Abhishek Bhola 2021-10-22 08:11:13 [Major version upgrade] pg_upgrade fails despite passing check mode