From: | Trond Myklebust <trondmy(at)gmail(dot)com> |
---|---|
To: | Hannu Krosing <hannu(at)2ndQuadrant(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Joshua Drake <jd(at)commandprompt(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net> |
Subject: | Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance |
Date: | 2014-01-14 00:48:56 |
Message-ID: | 190E6EA3-7B06-4315-9E4C-33FBEC961531@gmail.com |
Lists: | pgsql-hackers |
On Jan 13, 2014, at 19:03, Hannu Krosing <hannu(at)2ndQuadrant(dot)com> wrote:
> On 01/13/2014 09:53 PM, Trond Myklebust wrote:
>> On Jan 13, 2014, at 15:40, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>
>>> On 2014-01-13 15:15:16 -0500, Robert Haas wrote:
>>>> On Mon, Jan 13, 2014 at 1:51 PM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>>>>> I notice, Josh, that you didn't mention the problems many people
>>>>> have run into with Transparent Huge Page defrag and with NUMA
>>>>> access.
>>>> Amen to that. Actually, I think NUMA can be (mostly?) fixed by
>>>> setting zone_reclaim_mode; is there some other problem besides that?
>>> I think that fixes some of the worst instances, but I've seen machines
>>> spending horrible amounts of CPU (& BUS) time in page reclaim
>>> nonetheless. If I analyzed it correctly it's in RAM << working set
>>> workloads where RAM is pretty large and most of it is used as page
>>> cache. The kernel ends up spending a huge percentage of time finding and
>>> potentially defragmenting pages when looking for victim buffers.
>>>
>>>> On a related note, there's also the problem of double-buffering. When
>>>> we read a page into shared_buffers, we leave a copy behind in the OS
>>>> buffers, and similarly on write-out. It's very unclear what to do
>>>> about this, since the kernel and PostgreSQL don't have intimate
>>>> knowledge of what each other are doing, but it would be nice to solve
>>>> somehow.
>>> I've wondered before if there wouldn't be a chance for postgres to say
>>> "my dear OS, that the file range 0-8192 of file x contains y, no need to
>>> reread" and do that when we evict a page from s_b but I never dared to
>>> actually propose that to kernel people...
>> O_DIRECT was specifically designed to solve the problem of double buffering
>> between applications and the kernel. Why are you not able to use that in these situations?
> What is being asked for is the opposite of O_DIRECT: a write from a buffer
> inside PostgreSQL into the Linux *buffer cache*, while telling Linux that the
> data is the same as what is currently on disk, so it never needs to be
> written back.
I don’t understand. Are we talking about mmap()ed files here? Why would the kernel be trying to write back pages that aren’t dirty?
> This would avoid current double-buffering between postgresql and linux
> buffer caches while still making use of linux cache when possible.
>
> The use case is pages that PostgreSQL has moved into its buffer cache
> but has not modified. They will at some point be evicted from the
> PostgreSQL cache, but it is likely that they will be needed again soon,
> so what is required is "writing them back" to the original file, except
> that they should not really be written, or marked dirty to be written
> later, any further than the Linux cache, as they *already* are on the disk.
>
> It is probably ok to put them in the LRU position as they are "written"
> out from postgresql, though it may be better if we get some more control
> over where in the LRU order they would be placed. It may make sense to put
> them there based on when they were last read while residing inside the
> postgresql cache.
>
> Cheers
>
>
> --
> Hannu Krosing
> PostgreSQL Consultant
> Performance, Scalability and High Availability
> 2ndQuadrant Nordic OÜ
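
The distinction Hannu draws above, between an ordinary write and handing a clean page to the OS cache, can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyPageCache`, `insert_clean`, and the counters are hypothetical names invented here, the model is a plain LRU dictionary rather than anything resembling the real Linux page cache, and no kernel interface with this behaviour is implied to exist.

```python
from collections import OrderedDict


class ToyPageCache:
    """Toy LRU model of an OS page cache (hypothetical, not kernel behaviour)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # block number -> (data, dirty flag)
        self.disk_writes = 0        # how many pages had to go to disk

    def _insert(self, block, data, dirty):
        self.pages[block] = (data, dirty)
        self.pages.move_to_end(block)  # most recently used sits at the end
        while len(self.pages) > self.capacity:
            _, (_, was_dirty) = self.pages.popitem(last=False)
            if was_dirty:
                self.disk_writes += 1  # a dirty victim must be written back

    def write(self, block, data):
        """Ordinary write(): the cached page is marked dirty."""
        self._insert(block, data, dirty=True)

    def insert_clean(self, block, data):
        """Hypothetical hint: cache the page but treat it as already on disk."""
        self._insert(block, data, dirty=False)

    def flush(self):
        """Write back every dirty page, as a periodic flusher would."""
        for block, (data, dirty) in list(self.pages.items()):
            if dirty:
                self.disk_writes += 1
                self.pages[block] = (data, False)


PAGE = b"\0" * 8192  # one 8 kB page, matching the 0-8192 range in the thread

# Evicting three unmodified pages from the application cache with write():
with_write = ToyPageCache(capacity=4)
for block in range(3):
    with_write.write(block, PAGE)
with_write.flush()

# The same evictions using the hypothetical "already on disk" hint:
with_hint = ToyPageCache(capacity=4)
for block in range(3):
    with_hint.insert_clean(block, PAGE)
with_hint.flush()

print(with_write.disk_writes, with_hint.disk_writes)
```

In this model both caches end up holding the three pages, so a later read could be served from memory either way; the only difference is that the `write()` path pays for disk writes of data that never changed.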