Re: Linux kernel impact on PostgreSQL performance

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Joshua Drake <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>
Subject: Re: Linux kernel impact on PostgreSQL performance
Date: 2014-01-13 20:15:16
Message-ID: CA+TgmoaB-drKnj5+ndzbStEiSLhegQYbgWLA4_SWv4tGyqMJrw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 13, 2014 at 1:51 PM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
> I notice, Josh, that you didn't mention the problems many people
> have run into with Transparent Huge Page defrag and with NUMA
> access.

Amen to that. Actually, I think NUMA can be (mostly?) fixed by
setting zone_reclaim_mode; is there some other problem besides that?

The other thing that comes to mind is the kernel's caching behavior.
We've talked a lot over the years about the difficulties of getting
the kernel to write data out when we want it to and to not write data
out when we don't want it to. When it writes data back to disk too
aggressively, we get lousy throughput because the same page can get
written more than once when caching it for longer would have allowed
write-combining. When it doesn't write data to disk aggressively
enough, we get huge latency spikes at checkpoint time when we call
fsync() and the kernel says "uh, what? you wanted that data *on the
disk*? sorry boss!" and then proceeds to destroy the world by starving
the rest of the system for I/O for many seconds or minutes at a time.
We've made some desultory attempts to use sync_file_range() to improve
things here, but I'm not sure that's really the right tool, and if it
is we don't know how to use it well enough to obtain consistent
positive results.

On a related note, there's also the problem of double-buffering. When
we read a page into shared_buffers, we leave a copy behind in the OS
buffers, and similarly on write-out. It's very unclear what to do
about this, since the kernel and PostgreSQL don't have intimate
knowledge of what each other are doing, but it would be nice to solve
somehow.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-01-13 20:17:38 Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Previous Message Stephen Frost 2014-01-13 20:06:59 Re: Planning time in explain/explain analyze