From: | Trond Myklebust <trondmy(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Joshua Drake <jd(at)commandprompt(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net> |
Subject: | Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance |
Date: | 2014-01-13 20:53:36 |
Message-ID: | D09B5388-497E-41A4-9F43-F608F3021114@gmail.com |
Lists: | pgsql-hackers |

On Jan 13, 2014, at 15:40, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-01-13 15:15:16 -0500, Robert Haas wrote:
>> On Mon, Jan 13, 2014 at 1:51 PM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>>> I notice, Josh, that you didn't mention the problems many people
>>> have run into with Transparent Huge Page defrag and with NUMA
>>> access.
>>
>> Amen to that. Actually, I think NUMA can be (mostly?) fixed by
>> setting zone_reclaim_mode; is there some other problem besides that?
>
> I think that fixes some of the worst instances, but I've seen machines
> spending horrible amounts of CPU (& BUS) time in page reclaim
> nonetheless. If I analyzed it correctly it's in RAM << working set
> workloads where RAM is pretty large and most of it is used as page
> cache. The kernel ends up spending a huge percentage of time finding and
> potentially defragmenting pages when looking for victim buffers.
>
>> On a related note, there's also the problem of double-buffering. When
>> we read a page into shared_buffers, we leave a copy behind in the OS
>> buffers, and similarly on write-out. It's very unclear what to do
>> about this, since the kernel and PostgreSQL don't have intimate
>> knowledge of what each other are doing, but it would be nice to solve
>> somehow.
>
> I've wondered before if there wouldn't be a chance for postgres to say
> "my dear OS, that the file range 0-8192 of file x contains y, no need to
> reread" and do that when we evict a page from s_b but I never dared to
> actually propose that to kernel people...

O_DIRECT was specifically designed to solve the problem of double buffering between applications and the kernel. Why are you not able to use that in these situations?

Cheers,
Trond