Idle backends outside a transaction holding onto large amounts of memory / swap space?

From: Tobias Gierke <tobias(dot)gierke(at)code-sourcery(dot)de>
To: PostgreSQL mailing lists <pgsql-performance(at)postgresql(dot)org>
Subject: Idle backends outside a transaction holding onto large amounts of memory / swap space?
Date: 2019-02-25 10:36:40
Message-ID: 1395b3e3-a76f-7ec0-a7b3-8e293dd7b75e@code-sourcery.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

Recently we started seeing the Linux OOM killer kicking in and killing
PostgreSQL processes on one of our development machines.

The PostgreSQL version we're using was compiled by us, is running on
CentOS 7 and is

PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-28), 64-bit

While looking at the machine I saw the following peculiar thing: Swap is
almost completely full while buff/cache still has ~3GB available.

root(at)demo:/etc/systemd/system # free -m
              total        used        free      shared buff/cache  
available
Mem:           7820        3932         770        1917 3116        1548
Swap:          4095        3627         468

Running the following one-liner shows that two PostgreSQL processes are
using most of the swap:

for proc in /proc/*;   do echo $proc ; cat $proc/smaps 2>/dev/null | awk
'/Swap/{swap+=$2}END{print swap "\tKB\t'`echo $proc|awk '{print $1}' `'"
}'; done | sort -n | awk '{total+=$1}/[0-9]/;END{print total "\tKB\tTotal"}'

1387496 KB      /proc/22788
1837872 KB      /proc/22789

I attached the memory mappings of these processes to the mail. Both
processes inside PostgreSQL show up as idle outside of any transaction
and belong to a JDBC (Java) connection pool.

voip=# select * from pg_stat_activity where pid in (22788,22789);
-[ RECORD 1 ]----+------------------------------
datid            | 16404
pid              | 22789
usesysid         | 10
usename          | postgres
client_addr      | 127.0.0.1
client_hostname  |
client_port      | 45649
backend_start    | 2019-02-25 00:17:15.246625+01
xact_start       |
query_start      | 2019-02-25 10:52:07.729096+01
state_change     | 2019-02-25 10:52:07.748077+01
wait_event_type  | Client
wait_event       | ClientRead
state            | idle
backend_xid      |
backend_xmin     |
query            | COMMIT
backend_type     | client backend
-[ RECORD 2 ]----+------------------------------
datid            | 16404
pid              | 22788
usesysid         | 10
usename          | postgres
client_addr      | 127.0.0.1
client_hostname  |
client_port      | 45648
backend_start    | 2019-02-25 00:17:15.24631+01
xact_start       |
query_start      | 2019-02-25 10:55:42.577158+01
state_change     | 2019-02-25 10:55:42.577218+01
wait_event_type  | Client
wait_event       | ClientRead
state            | idle
backend_xid      |
backend_xmin     |
query            | ROLLBACK
backend_type     | client backend
--------->8------------------>8------------------>8------------------>8---------

I attached the postgresql.conf we're using to this mail as well.

Is this expected behaviour ? Did we over-commission the machine in our
postgresql.conf ?

Thanks,
Tobias

Attachment Content-Type Size
22788_smaps.txt text/plain 72.4 KB
22789_smaps.txt text/plain 72.9 KB
postgresql.conf text/plain 687 bytes

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kim 2019-02-25 10:41:18 Query slow for new participants
Previous Message kimaidou 2019-02-25 09:44:45 Re: Aggregate and many LEFT JOIN