Re: UDP buffer drops / statistics collector

From: Tim Kane <tim(dot)kane(at)gmail(dot)com>
To: PostgreSQL mailing lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: UDP buffer drops / statistics collector
Date: 2017-04-18 13:53:48
Message-ID: CADVWZZK8eyAC9c=ha0tTbRQ9u0x5+L4aQdDp2hWsBLWzppbUkA@mail.gmail.com
Lists: pgsql-general

Okay, so I've run an strace on the collector process during a buffer drop
event.
I can see evidence of a recvfrom loop pulling in a *maximum* of 142kb.

While I had already increased rmem_max, it would appear the kernel is not
honouring it.
rmem_default is set to 124kb, which would explain the above read maxing out
just slightly beyond this (presuming a ring buffer filling up behind the
read).

I'm going to try increasing rmem_default and see if it has any positive
effect (and then investigate why the kernel doesn't seem to take rmem_max
into account).
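
For anyone wanting to check the same thing on their own host, here is a rough
sketch in Python. As far as I understand the Linux behaviour (socket(7)), a
socket that never asks for a specific SO_RCVBUF simply gets rmem_default, and
rmem_max only caps what an explicit setsockopt() request can obtain. That would
explain the above if the collector's socket does not request a larger buffer
itself, which I have not confirmed and am only assuming here. The function name
and the 4 MiB request are mine, purely for illustration.

```python
import socket


def probe_udp_rcvbuf(requested_bytes):
    """Return (default, after_request) SO_RCVBUF sizes for a fresh UDP socket.

    Hypothetical helper for illustration; it is not part of PostgreSQL.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # What a socket gets when it asks for nothing (rmem_default on Linux).
        default = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        # An explicit request; on Linux the kernel clamps it to rmem_max.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return default, effective


if __name__ == "__main__":
    default, effective = probe_udp_rcvbuf(4 * 1024 * 1024)
    print("default SO_RCVBUF:", default)
    print("after requesting 4 MiB:", effective)
```

Note that Linux doubles the value passed to setsockopt() to allow for
bookkeeping overhead, so the second number can come back as roughly twice the
clamped request rather than the request itself.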

On Tue, Apr 18, 2017 at 8:05 AM Tim Kane <tim(dot)kane(at)gmail(dot)com> wrote:

> Hi all,
>
> I'm seeing sporadic (but frequent) UDP buffer drops on a host that so far
> I've not been able to resolve.
>
> The drops are originating from postgres processes, and from what I know -
> the only UDP traffic generated by postgres should be consumed by the
> statistics collector - but for whatever reason, it's failing to read the
> packets quickly enough.
>
> Interestingly, I'm seeing these drops occur even when the system is idle,
> but every 15 minutes or so (not consistently enough to isolate any
> particular activity) we'll see on the order of 90 packets dropped at a
> time.
>
> I'm running 9.6.2, but the issue was previously occurring on 9.2.4 (on the
> same hardware)
>
>
> If it's relevant.. there are two instances of postgres running (and
> consequently, 2 instances of the stats collector process) though 1 of those
> instances is most definitely idle for most of the day.
>
> In an effort to try to resolve the problem, I've increased (x2) the UDP
> recv buffer sizes on the host - but it seems to have had no effect.
>
> cat /proc/sys/net/core/rmem_max
> 1677216
>
> The following parameters are configured
>
> track_activities on
> track_counts on
> track_functions none
> track_io_timing off
>
>
> There are approximately 80-100 connections at any given time.
>
> It seems that the issue started a few weeks ago, around the time of a
> reboot on the given host... but it's difficult to know what (if anything)
> has changed, or why :-/
>
>
> Incidentally... the documentation doesn't seem to have any mention of UDP
> whatsoever. I'm going to use this as an opportunity to dive into the
> source - but perhaps it's worth improving the documentation around this?
>
> My next step is to try disabling track_activities and track_counts to see
> if they improve matters any, but I wouldn't expect these to generate enough
> data to flood the UDP buffers :-/
>
> Any ideas?
>
>
>
>
