From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | James Bottomley <James(dot)Bottomley(at)hansenpartnership(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Trond Myklebust <trondmy(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dave Chinner <david(at)fromorbit(dot)com>, Joshua Drake <jd(at)commandprompt(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net> |
Subject: | Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance |
Date: | 2014-01-14 20:09:14 |
Message-ID: | CA+TgmoYymP-TzDsewfu2RYTb0EpHF8GCqGWkE8QvevHayb9fVw@mail.gmail.com |
Lists: | pgsql-hackers |

On Tue, Jan 14, 2014 at 3:00 PM, James Bottomley
<James(dot)Bottomley(at)hansenpartnership(dot)com> wrote:
>> Doesn't sound exactly like what I had in mind. What I was suggesting
>> is an analogue of read() that, if it reads full pages of data to a
>> page-aligned address, shares the data with the buffer cache until it's
>> first written instead of actually copying the data.
>
> The only way to make this happen is mmap the file to the buffer and use
> MADV_WILLNEED.
>
>> The pages are
>> write-protected so that an attempt to write the address range causes a
>> page fault. In response to such a fault, the pages become anonymous
>> memory and the buffer cache no longer holds a reference to the page.
>
> OK, so here I thought of another madvise() call to switch the region to
> anonymous memory. A page fault works too, of course, it's just that one
> per page in the mapping will be expensive.

I don't think either of these ideas works for us. We start by
creating a chunk of shared memory that all processes (we do not use
threads) will have mapped at a common address, and we read() and
write() into that chunk.

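
As an illustration only, the arrangement described above can be sketched from user space roughly as follows. This is not PostgreSQL code: `demo_shared_pool`, `PAGE_SIZE`, `NUM_BUFFERS` and the temporary data file are hypothetical names chosen for the sketch, and it assumes a Unix-like system where `os.fork()` and `os.readv()` are available. The region is created once, before forking, so every process inherits the same mapping, and file data is copied into it with an ordinary read call.

```python
import mmap
import os
import tempfile

PAGE_SIZE = 8192      # assumed block size for the sketch
NUM_BUFFERS = 4       # hypothetical, deliberately tiny pool


def demo_shared_pool():
    """Have a child process read a file page into a shared buffer pool."""
    # Create the shared region before forking so that every process
    # inherits the same mapping at the same address.
    pool = mmap.mmap(-1, PAGE_SIZE * NUM_BUFFERS,
                     flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS)

    # Stand-in data file holding one page of recognizable bytes.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"A" * PAGE_SIZE)
        path = f.name

    try:
        pid = os.fork()
        if pid == 0:
            # Child: perform the I/O, copying file data into buffer slot 1.
            status = 1
            try:
                fd = os.open(path, os.O_RDONLY)
                slot = memoryview(pool)[PAGE_SIZE:2 * PAGE_SIZE]
                nread = os.readv(fd, [slot])
                os.close(fd)
                status = 0 if nread == PAGE_SIZE else 1
            finally:
                os._exit(status)

        os.waitpid(pid, 0)
        # Parent: the page read by the child is visible here because the
        # pool is shared memory, not a per-process copy.
        result = bytes(pool[PAGE_SIZE:PAGE_SIZE + 4])
        pool.close()
        return result
    finally:
        os.unlink(path)
```

Calling `demo_shared_pool()` in the parent returns the first bytes of the page that the child read, which is the property the buffer pool depends on: whichever process performs the I/O, the result lands in memory that all of them see.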
> Do you care about handling aliases ... what happens if someone else
> reads from the file, or will that never occur? The reason for asking is
> that it's much easier if someone else mmapping the file gets your
> anonymous memory than we create an alias in the page cache.

All reads and writes go through the buffer pool stored in shared
memory, but any of the processes that have that shared memory region
mapped could be responsible for any individual I/O request.
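
For comparison, the closest existing user-space analogue of the mmap-based suggestion quoted earlier is probably a private file mapping, and a rough sketch of its behavior may help with the alias question. This shows only today's `MAP_PRIVATE` semantics, not the proposed interface; `demo_private_mapping` and the temporary file are hypothetical, and `mmap.madvise()` assumes Python 3.8 or later on Linux. After the first write, the mapping and the file diverge, and the modified page belongs to the writing process alone.

```python
import mmap
import os
import tempfile


def demo_private_mapping():
    """Show that a write to a MAP_PRIVATE mapping never reaches the file."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"A" * mmap.PAGESIZE)
        path = f.name

    try:
        fd = os.open(path, os.O_RDONLY)
        # Private mapping: the page is backed by the page cache until the
        # first write to it.
        m = mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_PRIVATE,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        m.madvise(mmap.MADV_WILLNEED)   # hint: these pages are needed soon

        before = bytes(m[:4])
        m[:4] = b"BBBB"                 # write fault: process gets its own copy
        after = bytes(m[:4])
        via_read = os.pread(fd, 4, 0)   # an ordinary read still sees old data

        m.close()
        os.close(fd)
        return before, after, via_read
    finally:
        os.unlink(path)
```

Here the modified page is private to the process that wrote it, whereas the buffer pool described above needs every modification to be visible to all processes that have the shared region mapped.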
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company