Re: why would postgres be throttling a streaming replication slot's sending?

From: Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: why would postgres be throttling a streaming replication slot's sending?
Date: 2018-12-13 08:56:27
Message-ID: 04d4fc3a-5f53-adc8-d9ff-3336efd2d762@matrix.gatewaynet.com
Lists: pgsql-general

On 12/12/18 3:45 p.m., Chris Withers wrote:
> On 11/12/2018 14:48, Achilleas Mantzios wrote:
>> On 11/12/18 4:00 p.m., Chris Withers wrote:
>>>
>>> I'm looking after a multi-tenant PG 9.4 cluster, and we've started getting alerts for the number of WALs on the server.
>>> It'd be great to understand what's generating all that WAL and what's likely to be causing any problems.
>>
>> Regarding your WALs in pg_wal, a good alert threshold could be anything more than, e.g., a 10% increase over wal_keep_segments with an upward trend. If this number keeps rising, chances are
>> something bad is happening.
>
> wal_keep_segments is 0 here, so I went hunting in pg_replication_slots and found that it's the barman slot that's behind on the WAL.
>

Then the number of WALs depends on the traffic generated since the previous checkpoint, the status of archiving, and IMO the status of replication slots. The doc at
https://www.postgresql.org/docs/11/runtime-config-replication.html doesn't specifically mention this last one, but it is documented here: https://www.postgresql.org/docs/10/wal-configuration.html , and
it has happened to me: if a replication client with a replication slot dies, WALs accumulate until it comes back up. (Seen on pgsql 10.)
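A quick way to spot such a stuck slot is to list each slot together with the WAL it is holding back. This is only a sketch, written against the 10+ function names (pg_current_wal_lsn, pg_wal_lsn_diff); on 9.4 the equivalents are pg_current_xlog_location and pg_xlog_location_diff:

```sql
-- List slots and how much WAL each one is pinning.
-- An inactive slot with a large retained_bytes value is the usual culprit.
SELECT slot_name,
       active,
       restart_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots
ORDER BY retained_bytes DESC;
```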

> strace'ing on the barman receiving side shows it waiting on the socket, so no problem on that side.
>
> on the sending side, it's a little confusing, since the server is basically idle, no cpu or disk activity, and yet strace'ing the sending process attached to the other end of the socket shows time
> being spent waiting on a poll which, while it includes the socket being sent to, also includes a bunch of pipes. I've attached a chunk of the output below in the hope that someone on this list could
> offer an explanation as to what might cause the WAL to be trickling over this port rather than catching up as fast as it can?

pg_stat_replication is what you should use for monitoring. IIRC you won't see some columns in the row for the barman connection.
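For example, something like the following gives one row per connected standby or WAL receiver, barman included (a sketch using the 10+ column names; on 9.4 they are sent_location, write_location, flush_location, replay_location):

```sql
-- State and LSN progress of every streaming client.
SELECT pid, application_name, state,
       sent_lsn, write_lsn, flush_lsn, replay_lsn
FROM pg_stat_replication;
```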
Also you could query pg_replication_slots as :

select pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn),
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
from pg_replication_slots
where active_pid = <the pid reported above by pg_stat_replication>;
 pg_wal_lsn_diff | pg_wal_lsn_diff
-----------------+-----------------
             312 |         1917736
(1 row)

In the first column you should see diff values (in bytes) close to zero. The second column is the length (window) of your replication slot in bytes; divided by 16MB, it indicates the number of extra
WALs retained because of the replication slot, on top of the other factors (archiving, checkpoints). In my system it never reaches the size of a single WAL.
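That conversion can be done directly in the query. A sketch, assuming the default wal_segment_size of 16MB (16777216 bytes) and the 10+ function names:

```sql
-- Rough count of extra WAL segments pinned by each slot.
SELECT slot_name,
       ceil(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
            / 16777216.0) AS approx_wal_segments
FROM pg_replication_slots;
```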

>
> cheers,
>
> Chris
>
> strace output for the streaming_barman slot servicing the barman replication slot:

IMO there is no need to go as deep as gdb and stack traces unless you are sure there is a problem further down.

>
> --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=3577, si_uid=26} ---
> write(4, "\0", 1)                       = 1
> rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
> read(3, "\0", 16)                       = 1
> read(6, 0x7fffffffdcd7, 1)              = -1 EAGAIN (Resource temporarily unavailable)
> recvfrom(10, 0x7fffffffdcaf, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
> read(5, "\3\1\0\0\7\326\355\2\343\0\0\0@\n\0\0\0\17\1\230\313\4\0\0)\347\372l\0\0\0\0"..., 4272) = 4272
> sendto(10, "d\0\0\20\315w\0\0\4\313\230\1\17P\0\0\4\313\230\1 \0\0\2\37\321\343\345\306\20\3\1"..., 4302, 0, NULL, 0) = 4302
> read(3, 0x7fffffffdc90, 16)             = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=10, events=POLLIN}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}], 3, 26076) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
> --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=3577, si_uid=26} ---
> write(4, "\0", 1)                       = 1
> rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
> read(3, "\0", 16)                       = 1
> read(6, 0x7fffffffdcd7, 1)              = -1 EAGAIN (Resource temporarily unavailable)
> recvfrom(10, 0x7fffffffdcaf, 1, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
> read(5, "~\320\5\0\3\0\0\0\0 \1\230\313\4\0\0\3\0\0\0\0\0\0\0\323\316\314\0\0\0\0\0"..., 8192) = 8192
> sendto(10, "d\0\0 \35w\0\0\4\313\230\1 \0\0\0\4\313\230\1@\0\0\2\37\321\343\361\376\226~\320"..., 8222, 0, NULL, 0) = 8222
> read(3, 0x7fffffffdc90, 16)             = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=10, events=POLLIN}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}], 3, 25275) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
> --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=3577, si_uid=26} ---
>
> Interesting numbers:
> - process 3577 is the wal writer process
> - fd 10 is the socket, 3 and 6 are pipes
> - time is visibly spent sitting on that poll
> - the network is only seeing 2Mbit/sec, which is nothing
>
> cheers,
>
> Chris
>

--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt
