From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup
Date: 2025-02-11 22:33:45
Message-ID: a4c0388f-02f8-4e5a-9638-616aabf3f9e3@vondra.me
Lists: pgsql-hackers
On 2/11/25 21:18, Tom Lane wrote:
> Tomas Vondra <tomas(at)vondra(dot)me> writes:
>> I did run into bottlenecks due to "too few file descriptors" during
>> recent experiments with partitioning, which made it pretty trivial to
>> get into a situation where we start thrashing the VfdCache. I have a
>> half-written draft of a blog post about that somewhere.
>
>> But my conclusion was that it's damn difficult to even realize that's
>> happening, especially if you don't have access to the OS / perf, etc.
>
> Yeah. fd.c does its level best to keep going even with only a few FDs
> available, and it's hard to tell that you have a performance problem
> arising from that. (Although I recall old war stories about Postgres
> continuing to chug along just fine after it'd run the kernel out of
> FDs, while every other service on the system was crashing left and
> right, making it difficult e.g. even to log in. That scenario is why
> I'm resistant to pushing our allowed number of FDs to the moon...)
>
>> So
>> my takeaway was we should improve that first, so that people have a
>> chance to realize they have this issue, and can do the tuning. The
>> improvements I thought about were:
>
>> - track hits/misses for the VfdCache (and add a system view for that)
>
> I think what we actually would like to know is how often we have to
> close an open FD in order to make room to open a different file.
> Maybe that's the same thing you mean by "cache miss", but it doesn't
> seem like quite the right terminology. Anyway, +1 for adding some way
> to discover how often that's happening.
>
We can count evictions (i.e. closing one file so that we can open
another) too, but AFAICS that's essentially the same as counting
"misses" (having to open a file because it's not in the cache). Once the
cache is warmed up, every miss forces an eviction, so the two counts
should track each other closely.

Or am I missing something?
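
To make that concrete, here's a tiny standalone simulation (not fd.c
code; NFILES / CACHE_SIZE / ACCESSES are made-up numbers): an LRU cache
of CACHE_SIZE "open files" serving random accesses to NFILES distinct
files. Once the cache fills up, every miss has to evict something, so
the two counters end up differing only by the cache size:

#include <stdio.h>
#include <stdlib.h>

#define NFILES		1000
#define CACHE_SIZE	100
#define ACCESSES	1000000

int
main(void)
{
	int		slots[CACHE_SIZE];	/* open "files", index 0 = most recently used */
	int		nopen = 0;
	long	misses = 0;
	long	evictions = 0;

	srand(12345);

	for (long a = 0; a < ACCESSES; a++)
	{
		int		file = rand() % NFILES;
		int		pos = -1;

		/* is the file already "open"? */
		for (int i = 0; i < nopen; i++)
		{
			if (slots[i] == file)
			{
				pos = i;
				break;
			}
		}

		if (pos < 0)
		{
			misses++;
			if (nopen == CACHE_SIZE)
			{
				/* close the least-recently-used file to make room */
				evictions++;
				nopen--;
			}
			pos = nopen++;
		}

		/* move the accessed file to the most-recently-used slot */
		for (int i = pos; i > 0; i--)
			slots[i] = slots[i - 1];
		slots[0] = file;
	}

	printf("misses = %ld, evictions = %ld (difference = cache size)\n",
		   misses, evictions);
	return 0;
}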
>> - maybe have wait event for opening/closing file descriptors
>
> Not clear that that helps, at least for this specific issue.
>
I don't think Jelte described any specific issue, but the symptoms I've
observed were a query accessing a partitioned table with ~1000 relations
(partitions + indexes), thrashing the VFD cache with ~0% cache hits, and
the open/close calls taking a lot of time (~25% of CPU time). That would
be very visible as a wait event, I believe.
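
For a rough sense of where that ~25% comes from, here's a trivial
standalone microbenchmark (not from the thread; the file path and
iteration count are arbitrary) that times the open()/close() pair the
backend pays for every VFD cache miss:

#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define ITERATIONS	100000

int
main(void)
{
	struct timespec start, end;
	const char *path = "/tmp/vfd-bench.tmp";
	int			fd = open(path, O_CREAT | O_RDWR, 0600);

	if (fd < 0)
	{
		perror("open");
		return 1;
	}
	close(fd);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < ITERATIONS; i++)
	{
		fd = open(path, O_RDWR);
		if (fd < 0)
		{
			perror("open");
			return 1;
		}
		close(fd);
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	double		elapsed = (end.tv_sec - start.tv_sec) +
						  (end.tv_nsec - start.tv_nsec) / 1e9;

	printf("%d open/close pairs in %.3f s (%.2f us per pair)\n",
		   ITERATIONS, elapsed, elapsed / ITERATIONS * 1e6);

	unlink(path);
	return 0;
}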
>> - show max_safe_fds value somewhere, not just max_files_per_process
>> (which we may silently override and use a lower value)
>
> Maybe we should just assign max_safe_fds back to max_files_per_process
> after running set_max_safe_fds? The existence of two variables is a
> bit confusing anyhow. I vaguely recall that we had a reason for
> keeping them separate, but I can't think of the reasoning now.
>
That might work. I don't recall the reasons for keeping the two
variables separate, but I suppose there were some at the time.
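
FWIW a minimal sketch of what that might look like, assuming the
existing fd.c names and not a tested patch: at the end of
set_max_safe_fds(), once max_safe_fds is settled, write it back into
the GUC:

	/*
	 * Reflect the effective limit back into the user-visible GUC, so
	 * SHOW max_files_per_process reports the value actually in use.
	 */
	max_files_per_process = max_safe_fds;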
regards
--
Tomas Vondra