Re: pgsql: Add kqueue(2) support to the WaitEventSet API.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date: 2020-02-20 23:55:43
Message-ID: 6610.1582242943@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

I wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
>> ... like coypu, where NUM_RESERVED_FDS is
>> the only thing ensuring we have some spare fds. I don't know the
>> history but it looks like NUM_RESERVED_FDS was deliberately or
>> accidentally tuned to be just enough to be able to run the regression
>> tests (the interesting ones being the ones that use sockets, like
>> postgres_fdw.sql), but with a new long lived kqueue() fd in the
>> picture, it might have to be increased to cover it, no?

> No. NUM_RESERVED_FDS was set decades ago, long before any of those tests
> existed, and it has never been changed AFAIK. It is a bit striking that
> we just started seeing it be insufficient with this patch. Maybe that's
> just happenstance, but I wonder whether there is a plain old FD leak
> involved in addition to the design issue? I'll take a closer look at
> exactly what's open when we hit the error.

Hmm ... looks like I'm wrong: we've been skating way too close to the edge
for awhile. Here's a breakdown of the open FDs in the backend at the time
of the failure, excluding stdin/stdout/stderr (which set_max_safe_fds
accounted for) and FDs pointing to database files:

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
postmaste 2657 postgres 3r FIFO 0,8 0t0 20902158 pipe postmaster_alive_fds[0]
postmaste 2657 postgres 4u REG 0,9 0 3877 [eventpoll] FeBeWaitSet's epoll_fd
postmaste 2657 postgres 7u unix 0xffff880878e21880 0t0 20902664 socket socket for a PGconn owned by postgres_fdw
postmaste 2657 postgres 8u IPv6 20902171 0t0 UDP localhost:40795->localhost:40795 pgStatSock
postmaste 2657 postgres 9u unix 0xffff880876903c00 0t0 20902605 /tmp/.s.PGSQL.5432 MyProcPort->sock
postmaste 2657 postgres 10r FIFO 0,8 0t0 20902606 pipe selfpipe_readfd
postmaste 2657 postgres 11w FIFO 0,8 0t0 20902606 pipe selfpipe_writefd
postmaste 2657 postgres 105u unix 0xffff880878e21180 0t0 20902647 socket socket for a PGconn owned by postgres_fdw
postmaste 2657 postgres 118u unix 0xffff8804772c88c0 0t0 20902650 socket socket for a PGconn owned by postgres_fdw

One thing to notice is that there are only nine, though NUM_RESERVED_FDS
should have allowed ten. That's because there are 116 open FDs pointing
at database files, though max_safe_fds is 115. I'm not sure whether
that's OK or an off-by-one error in fd.c's accounting. One of the 116
is pointing at a WAL segment, and I think we might not be sending that
through the normal VFD path, so it might be "expected".

But anyway, what this shows is that over time we've eaten enough of
the "reserved" FDs that only three are available for random usage like
postgres_fdw's, if the process's back is against the wall FD-wise.
The postgres_fdw regression test is using all three, meaning there's
exactly no daylight in that test.

Clearly, we gotta do something about that too. Maybe bumping up
NUM_RESERVED_FDS would be a good idea, but I feel like more-honest
accounting for permanently-eaten FDs would be a better idea. And
in any case we can't suppose that there's a fixed upper limit on
the number of postgres_fdw connections. I'm liking the idea I floated
earlier of letting postgres_fdw forcibly close the oldest LRU entry.

BTW, you don't need anything very exotic to provoke this error.
The results above are from a Linux box; I just did "ulimit -n 128"
before starting the postmaster.

regards, tom lane

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Michael Paquier 2020-02-21 03:08:13 pgsql: Doc: Fix instructions to control build environment with MSVC
Previous Message Tom Lane 2020-02-20 22:05:14 Re: pgsql: Add kqueue(2) support to the WaitEventSet API.

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2020-02-21 00:35:03 Re: Protocol problem with GSSAPI encryption?
Previous Message Thomas Munro 2020-02-20 23:40:06 Re: Improve heavyweight locks instead of building new lock managers?