Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance

From: Jim Nasby <jim(at)nasby(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, James Bottomley <James(dot)Bottomley(at)HansenPartnership(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Joshua Drake <jd(at)commandprompt(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date: 2014-01-14 00:43:11
Message-ID: 52D4881F.4090208@nasby.net
Lists: pgsql-hackers

On 1/13/14, 4:44 PM, Andres Freund wrote:
>>> One major usecase is transplanting a page coming from postgres'
>>> buffers into the kernel's buffercache because the latter has a much
>>> better chance of properly allocating system resources across independent
>>> applications running.
>>
>> If you want to share pages between the application and the page cache,
>> the only known interface is mmap ... perhaps we can discuss how better
>> to improve mmap for you?
> I think purely using mmap() is pretty unlikely to work out - there's
> just too many constraints about when a page is allowed to be written out
> (e.g. it's interlocked with postgres' write ahead log). I also think
> that for many practical purposes using mmap() would result in an absurd
> number of mappings or mapping way too huge areas; e.g. large btree
> indexes are usually accessed in a quite fragmented manner.

Which brings up another interesting area^Wcan-of-worms: the database is implementing journaling on top of a filesystem that's probably also journaling. And it's going to get worse: a Seagate researcher presenting at RICon East last year said that the next generation (or maybe the one after that) of spinning rust will use "shingling", which means that the drive can't write randomly. So now the drive will ALSO have to journal. And of course SSDs already do this.

So now there are *three* pieces of software all doing the exact same thing, none of which is able to coordinate with the others.
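
A rough sketch of why the stacking hurts, under a deliberately pessimistic assumption that every layer writes the full payload once to its journal and once in place. The factor of 2 per layer and the function name are illustrative assumptions, not measurements; real filesystems often journal only metadata, so treat this as a worst case:

```python
def physical_bytes(logical_bytes, layers, journal_factor=2.0):
    """Bytes reaching the medium if each layer in `layers` journals
    everything handed to it (hypothetical worst-case model)."""
    amplified = float(logical_bytes)
    for _ in layers:
        # Each journaling layer multiplies what the layer below must write.
        amplified *= journal_factor
    return amplified


# The three layers named above; the labels are only for readability.
layers = ["database WAL", "filesystem journal", "drive/SSD translation layer"]

# One 8 kB database page under this model.
print(physical_bytes(8192, layers))
```

Because none of the layers can tell the others "this is already journaled", the factors multiply rather than collapse into one.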
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
