From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | James Bottomley <James(dot)Bottomley(at)HansenPartnership(dot)com> |
Cc: | Hannu Krosing <hannu(at)2ndQuadrant(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Trond Myklebust <trondmy(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dave Chinner <david(at)fromorbit(dot)com>, Joshua Drake <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net> |
Subject: | Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance |
Date: | 2014-01-14 15:39:35 |
Message-ID: | 16757.1389713975@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
James Bottomley <James(dot)Bottomley(at)HansenPartnership(dot)com> writes:
> The current mechanism for coherency between a userspace cache and the
> in-kernel page cache is mmap ... that's the only way you get the same
> page in both currently.
Right.
> glibc used to have an implementation of read/write in terms of mmap, so
> it should be possible to insert it into your current implementation
> without a major rewrite. The problem I think this brings you is
> uncontrolled writeback: you don't want dirty pages to go to disk until
> you issue a write()
Exactly.
> I think we could fix this with another madvise():
> something like MADV_WILLUPDATE telling the page cache we expect to alter
> the pages again, so don't be aggressive about cleaning them.
"Don't be aggressive" isn't good enough. The prohibition on early write
has to be absolute, because writing a dirty page before we've done
whatever else we need to do results in a corrupt database. It has to
be treated like a write barrier.
> The problem is we can't give you absolute control of when pages are
> written back because that interface can be used to DoS the system: once
> we get too many dirty uncleanable pages, we'll thrash looking for memory
> and the system will livelock.
Understood, but that makes this direction a dead end. We can't use
it if the kernel might decide to write anyway.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Trond Myklebust | 2014-01-14 15:42:05 | Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance |
Previous Message | Pavel Stehule | 2014-01-14 15:39:33 | Re: plpgsql.consistent_into |