Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Dave Chinner <david(at)fromorbit(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Josh Berkus <josh(at)agliodbs(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Joshua Drake <jd(at)commandprompt(dot)com>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date: 2014-01-15 22:29:40
Message-ID: CAMkU=1xNTWDyz-5Ghx_cNajEN3kcA3on8W7ZCiRhVRLddEPssA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 15, 2014 at 7:12 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
> > On 01/15/2014 07:50 AM, Dave Chinner wrote:
> >> FWIW [and I know you're probably sick of hearing this by now], but
> >> the blk-io throttling works almost perfectly with applications that
> >> use direct IO.....
>
> > For checkpoint writes, direct I/O actually would be reasonable.
> > Bypassing the OS cache is a good thing in that case - we don't want the
> > written pages to evict other pages from the OS cache, as we already have
> > them in the PostgreSQL buffer cache.
>
> But in exchange for that, we'd have to deal with selecting an order to
> write pages that's appropriate depending on the filesystem layout,
> other things happening in the system, etc etc. We don't want to build
> an I/O scheduler, IMO, but we'd have to.
>
> > Writing one page at a time with O_DIRECT from a single process might be
> > quite slow, so we'd probably need to use writev() or asynchronous I/O to
> > work around that.
>
> Yeah, and if the system has multiple spindles, we'd need to be issuing
> multiple O_DIRECT writes concurrently, no?
>

writev() effectively does that, doesn't it? But the buffers all have to go
to the same file descriptor, so that could be a problem. I think we need
something like sorted checkpoints sooner or later, anyway.
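
To make the sorted-checkpoint idea concrete, here is a minimal sketch in
Python (not PostgreSQL code). It sorts dirty pages by file and block number,
merges adjacent blocks into runs, and writes each run to its single file
descriptor with one writev() call. The names contiguous_runs, write_run and
BLOCK_SIZE are hypothetical, chosen only for illustration; 8192 is
PostgreSQL's default block size. O_DIRECT is left out because it would need
aligned buffers.

```python
import os

BLOCK_SIZE = 8192  # assumption: PostgreSQL's default block size


def contiguous_runs(dirty_pages):
    """Sort (file_id, block_no) pairs and merge adjacent blocks into runs.

    Returns a list of (file_id, first_block, block_count) tuples.
    """
    runs = []
    for file_id, block_no in sorted(set(dirty_pages)):
        if runs and runs[-1][0] == file_id and runs[-1][1] + runs[-1][2] == block_no:
            # This block directly follows the previous run in the same file.
            last = runs[-1]
            runs[-1] = (file_id, last[1], last[2] + 1)
        else:
            runs.append((file_id, block_no, 1))
    return runs


def write_run(fd, first_block, pages):
    """Write consecutive pages to one descriptor with a single writev() call."""
    os.lseek(fd, first_block * BLOCK_SIZE, os.SEEK_SET)
    written = os.writev(fd, pages)
    expected = sum(len(p) for p in pages)
    if written != expected:
        # A real implementation would retry the remainder instead of failing.
        raise OSError(f"short write: {written} of {expected} bytes")
    return written
```

Each run still maps to exactly one file, so pages spread across many files
produce many separate calls, which is the limitation noted above.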

> What we'd really like for checkpointing is to hand the kernel a boatload
> (several GB) of dirty pages and say "how about you push all this to disk
> over the next few minutes, in whatever way seems optimal given the storage
> hardware and system situation. Let us know when you're done."

And most importantly, "Also, please don't freeze up everything else in the
process."
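
No such kernel interface is described in this thread, so the closest thing
available is to pace the writes from userspace. The sketch below is only an
approximation of the request quoted above, under stated assumptions:
paced_write and its parameters are hypothetical names, and write_page stands
in for whatever actually issues the write. It spreads the pages evenly over a
target duration so that other I/O gets a chance in between.

```python
import time


def paced_write(pages, duration_s, write_page,
                clock=time.monotonic, sleep=time.sleep):
    """Write pages one at a time, spread evenly over duration_s seconds.

    write_page is a caller-supplied callable (hypothetical) that performs
    the actual write. clock and sleep are injectable for testing.
    """
    start = clock()
    total = len(pages)
    for i, page in enumerate(pages):
        write_page(page)
        # Page i should be finished no earlier than this point in time.
        target = start + duration_s * (i + 1) / total
        delay = target - clock()
        if delay > 0:
            sleep(delay)
```

This only limits how fast pages are handed to the kernel. It says nothing
about when the kernel flushes them, so the "let us know when you're done"
part would still need an fsync() at the end.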

Cheers,

Jeff
