Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup
Date: 2025-02-11 22:48:38
Message-ID: c2b5298f-93f3-4b25-a7d5-13e209f4e7b5@vondra.me
Lists: pgsql-hackers

On 2/11/25 22:14, Andres Freund wrote:
> Hi,
>
> On 2025-02-11 21:04:25 +0100, Tomas Vondra wrote:
>> I agree the defaults may be pretty low for current systems, but do we
>> want to get into the business of picking a value and overriding whatever
>> value is set by the sysadmin? I don't think a high hard limit should be
>> seen as an implicit permission to just set it as the soft limit.
>
> As documented in the links sent by Jelte, that's *explicitly* the reasoning
> for the difference between default soft and hard limits. For safety, programs
> should opt into using higher FD limits, rather than being opted into them.
>

OK, I guess I was mistaken in how I understood the hard/soft limits.

>
>> Imagine you're a sysadmin / DBA who picks a low soft limit (for whatever
>> reason - there may be other stuff running on the system, ...). And then
>> postgres starts and just feels like bumping the soft limit. Sure, the
>> sysadmin can lower the hard limit and then we'll respect that, but I don't
>> recall any other tool requiring this approach, and it would definitely be
>> quite surprising to me.
>
> https://codesearch.debian.net/search?q=setrlimit&literal=1
>
> In a quick skim I found that at least gimp, openjdk, libreoffice, gcc, samba,
> dbus increase the soft limit to something closer to the hard limit. And that
> was at page 62 out of 1269.
>

Ack
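
For reference, the opt-in pattern those programs follow looks roughly like
the sketch below. It is only an illustration, not code from any of the
projects listed or from a proposed patch; the function name
raise_nofile_soft_limit() and the "cap" parameter are made up for the
example. Only getrlimit(), setrlimit() and RLIMIT_NOFILE are the real POSIX
interface.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft RLIMIT_NOFILE toward the hard limit,
 * but never above "cap" (an upper bound chosen by the caller).
 * The soft limit is never lowered and the hard limit is never touched.
 * Returns 0 on success, -1 if getrlimit() or setrlimit() fails.
 */
int
raise_nofile_soft_limit(rlim_t cap)
{
    struct rlimit rl;
    rlim_t      target;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    /* Aim for the hard limit, clamped to the caller's cap. */
    target = rl.rlim_max;
    if (target == RLIM_INFINITY || target > cap)
        target = cap;

    /* Nothing to do if the soft limit is already at least that high. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= target)
        return 0;

    rl.rlim_cur = target;
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

A sysadmin who wants to forbid this can still do so by lowering the hard
limit, which an unprivileged process cannot raise again.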

>
>
>> I did run into bottlenecks due to "too few file descriptors" during
>> recent experiments with partitioning, which made it pretty trivial to
>> get into a situation where we start thrashing the VfdCache. I have a
>> half-written draft of a blog post about that somewhere.
>>
>> But my conclusion was that it's damn difficult to even realize that's
>> happening, especially if you don't have access to the OS / perf, etc. So my
>> takeaway was we should improve that first, so that people have a chance to
>> realize they have this issue, and can do the tuning. The improvements I
>> thought about were:
>
> Hm, that seems something orthogonal to me. I'm on board with that suggestion,
> but I don't see why that should stop us from having code to adjust the rlimit.
>

Right, it is somewhat orthogonal.

I was mentioning that mostly in the context of tuning the parameters we
already have (because how would you know you have this problem /
what would be a good value to set). I'm not demanding that we do nothing
until the monitoring bits get implemented, but being able to monitor
this seems pretty useful even if we adjust the soft limit based on some
heuristic.

>
> My suggestion would be to redefine max_files_per_process as the number of
> files we try to be able to open in backends. I.e. set_max_safe_fds() would
> first count the number of already open fds (since those will largely be
> inherited by child processes) and then check if we can open up to
> max_files_per_process files in addition. Adjusting the RLIMIT_NOFILE if
> necessary.
>
> That way we don't increase the number of FDs we use in the default
> configuration drastically, but increasing the number only requires a config
> change, it doesn't require also figuring out how to increase the settings in
> whatever starts postgres. Which, e.g. in cloud environments, typically won't
> be possible.
>

Seems reasonable, +1 to this. But that just makes the monitoring bits
more important, because how would you know you need to increase the GUC?
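
To make sure I understand the proposal, here is a rough sketch of the logic
as I read it. This is not the actual fd.c code: count_open_fds() and
ensure_fd_headroom() are hypothetical names, the fcntl() probing is a
simplified stand-in for however we would really count inherited
descriptors, and the 65536 probe bound is an arbitrary assumption. The
"wanted" argument plays the role of max_files_per_process.

```c
#include <fcntl.h>
#include <sys/resource.h>

/* Hypothetical: count descriptors already open among slots [0, probe_max). */
int
count_open_fds(int probe_max)
{
    int         n = 0;

    for (int fd = 0; fd < probe_max; fd++)
    {
        /* F_GETFD fails with EBADF when the slot is not an open descriptor */
        if (fcntl(fd, F_GETFD) != -1)
            n++;
    }
    return n;
}

/*
 * Hypothetical: make sure "wanted" more files can be opened on top of the
 * ones already open, raising the soft limit (never past the hard limit)
 * only when that is necessary.
 * Returns how many additional files we can expect to open (<= wanted),
 * or -1 on error.
 */
long
ensure_fd_headroom(long wanted)
{
    struct rlimit rl;
    rlim_t      soft;
    rlim_t      needed;
    rlim_t      room;
    int         probe_max;
    int         in_use;

    if (wanted < 0 || getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    soft = rl.rlim_cur;
    if (soft == RLIM_INFINITY)
        return wanted;          /* no soft limit to adjust */

    /* Bound the probe loop; 65536 is an arbitrary assumption. */
    probe_max = (soft > 65536) ? 65536 : (int) soft;
    in_use = count_open_fds(probe_max);
    needed = (rlim_t) in_use + (rlim_t) wanted;

    if (needed > soft)
    {
        rlim_t      target = needed;

        /* We may only go as high as the hard limit allows. */
        if (rl.rlim_max != RLIM_INFINITY && target > rl.rlim_max)
            target = rl.rlim_max;

        if (target > soft)
        {
            rl.rlim_cur = target;
            if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
                return -1;
            soft = target;
        }
    }

    /* in_use never exceeds soft, so this subtraction cannot wrap. */
    room = soft - (rlim_t) in_use;
    return (room < (rlim_t) wanted) ? (long) room : wanted;
}
```

With this shape the default configuration leaves the limit alone whenever
it is already sufficient, and a return value smaller than "wanted" is
exactly the kind of thing I would like to see reported somewhere.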

regards

--
Tomas Vondra
