Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup
Date: 2025-02-11 21:14:21
Message-ID: 7u7dbn6s2i6bf3hjzkbqaexj2bpoblqxwbkffbetl4rjv6dcom@s2uickjc5z53
Lists: pgsql-hackers

Hi,

On 2025-02-11 21:04:25 +0100, Tomas Vondra wrote:
> I agree the defaults may be pretty low for current systems, but do we
> want to get into the business of picking a value and overriding whatever
> value is set by the sysadmin? I don't think a high hard limit should be
> seen as an implicit permission to just set it as the soft limit.

As documented in the links sent by Jelte, that's *explicitly* the reasoning
for the difference between default soft and hard limits. For safety, programs
should opt into using higher FD limits, rather than being opted into them.

> Imagine you're a sysadmin / DBA who picks a low soft limit (for whatever
> reason - there may be other stuff running on the system, ...). And then
> postgres starts and just feels like bumping the soft limit. Sure, the
> sysadmin can lower the hard limit and then we'll respect that, but I don't
> recall any other tool requiring this approach, and it would definitely be
> quite surprising to me.

https://codesearch.debian.net/search?q=setrlimit&literal=1

In a quick skim I found that at least gimp, openjdk, libreoffice, gcc, samba,
dbus increase the soft limit to something closer to the hard limit. And that
was at page 62 out of 1269.

> I did run into bottlenecks due to "too few file descriptors" during
> recent experiments with partitioning, which made it pretty trivial to
> get into a situation where we start thrashing the VfdCache. I have a
> half-written draft of a blog post about that somewhere.
>
> But my conclusion was that it's damn difficult to even realize that's
> happening, especially if you don't have access to the OS / perf, etc. So my
> takeaway was we should improve that first, so that people have a chance to
> realize they have this issue, and can do the tuning. The improvements I
> thought about were:

Hm, that seems orthogonal to me. I'm on board with that suggestion, but I
don't see why that should stop us from having code to adjust the rlimit.

My suggestion would be to redefine max_files_per_process as the number of
files we try to be able to open in backends. I.e. set_max_safe_fds() would
first count the number of already open fds (since those will largely be
inherited by child processes) and then check if we can open up to
max_files_per_process files in addition, adjusting RLIMIT_NOFILE if
necessary.

That way we don't drastically increase the number of FDs we use in the default
configuration, but increasing the number only requires a config change; it
doesn't also require figuring out how to increase the limits in whatever
starts postgres, which, e.g. in cloud environments, typically won't be
possible.

And when using something like io_uring for AIO, it'd allow us to open
max_files_per_process files in addition to the files required for the io_uring
instances.
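
To make the max_files_per_process idea above a bit more concrete, here is a
rough, standalone sketch of the logic. It is not the actual set_max_safe_fds()
code: count_open_fds(), ensure_fd_headroom() and PROBE_CAP are hypothetical
names made up for illustration, and probing with fcntl(F_GETFD) is just one
simple way to count already open fds.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical cap so we never probe an absurd number of fd slots. */
#define PROBE_CAP 65536

/*
 * Count fds that are already open, by probing each slot below the soft
 * limit (or PROBE_CAP, whichever is smaller). A closed slot makes
 * fcntl(F_GETFD) fail with EBADF.
 */
static int
count_open_fds(rlim_t soft_limit)
{
    rlim_t  max_probe = soft_limit;
    int     used = 0;

    if (max_probe == RLIM_INFINITY || max_probe > PROBE_CAP)
        max_probe = PROBE_CAP;

    for (rlim_t fd = 0; fd < max_probe; fd++)
    {
        if (fcntl((int) fd, F_GETFD) != -1 || errno != EBADF)
            used++;
    }
    return used;
}

/*
 * Try to make sure "wanted" fds can be opened in addition to the ones
 * already open, raising the soft RLIMIT_NOFILE (never beyond the hard
 * limit) if necessary. Returns the number of additional fds we can
 * expect to open (<= wanted), or -1 on error.
 */
static int
ensure_fd_headroom(int wanted)
{
    struct rlimit rl;
    int     used;
    rlim_t  needed;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    used = count_open_fds(rl.rlim_cur);
    needed = (rlim_t) used + (rlim_t) wanted;

    if (rl.rlim_cur != RLIM_INFINITY && rl.rlim_cur < needed)
    {
        rlim_t  target = needed;

        /* The hard limit is the ceiling an unprivileged process can set. */
        if (rl.rlim_max != RLIM_INFINITY && target > rl.rlim_max)
            target = rl.rlim_max;

        rl.rlim_cur = target;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            return -1;
    }

    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= needed)
        return wanted;

    /* Hard limit too low: report whatever headroom is left. */
    return (int) (rl.rlim_cur - (rlim_t) used);
}
```

With something along these lines, the soft limit is only raised as far as the
configured value requires, and a sysadmin-imposed hard limit is still
respected.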

Greetings,

Andres Freund
