From: Andres Freund <andres(at)anarazel(dot)de>
To: Fabio Ugo Venchiarutti <f(dot)venchiarutti(at)ocado(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Michael Curry <curry(at)cs(dot)umd(dot)edu>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: perf tuning for 28 cores and 252GB RAM
Date: 2019-06-18 18:10:11
Message-ID: 20190618181011.hspcoxmi6mu4cq5h@alap3.anarazel.de
Lists: pgsql-general
Hi,
On 2019-06-18 17:13:20 +0100, Fabio Ugo Venchiarutti wrote:
> Does the backend mmap() data files when that's possible?
No. That doesn't allow us to control when data is written back to disk,
which is crucial for durability/consistency.
> I've heard the "use the page cache" suggestion before, from users and
> hackers alike, but I never quite heard a solid argument dismissing potential
> overhead-related ill effects of the seek() & read() syscalls if they're
> needed, especially on many random page fetches.
As of 12 we no longer issue a separate seek() for reads; we do a pread()
instead (though that's not a particularly meaningful performance
improvement). The read obviously has a cost, especially with syscalls
getting more and more expensive due to the mitigations for the Intel
vulnerabilities.
I'd say a bigger factor than the overhead of the read itself is that, with a
smaller s_b, many workloads will e.g. incur additional writes; that the
kernel has less information about when to discard data; that the kernel
pagecache has some scalability issues (partially due to its generality); and
double buffering.
> Given that shmem-based shared_buffers are bound to be mapped into the
> backend's address space anyway, why isn't that considered always
> preferable/cheaper?
See e.g. my point in my previous email in this thread about
drop/truncate.
> I'm aware that there are other benefits in counting on the page cache (eg:
> staying hot in the face of a backend restart), however I'm considering
> performance in steady state here.
There's also the issue that a large shared_buffers setting means that each
process's page table gets bigger, unless you configure huge_pages. Which one
definitely should, but that's an additional configuration step that requires
superuser access on most operating systems.
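On Linux that extra step might look like the following; the page count here
is an illustrative assumption and should be sized from your shared_buffers
(huge pages are 2MB by default on x86-64):

```
# As root: reserve 2MB huge pages for the postmaster to map shared_buffers.
# 4200 pages ~= 8.4GB, i.e. shared_buffers / 2MB plus a margin for overhead.
sysctl -w vm.nr_hugepages=4200

# postgresql.conf
shared_buffers = 8GB
huge_pages = on    # fail at startup rather than silently fall back ('try' is the default)
```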
Greetings,
Andres Freund