From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Sean Chittenden <sean(at)chittenden(dot)org>, "Jim C(dot) Nasby" <jim(at)nasby(dot)net>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: O_DIRECT in freebsd |
Date: | 2003-06-23 02:43:36 |
Message-ID: | 200306230243.h5N2haa15347@candle.pha.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Basically, I think we need free-behind rather than O_DIRECT.
>
> There are two separate issues here --- one is what's happening in our
> own cache, and one is what's happening in the kernel disk cache.
> Implementing our own free-behind code would help in our own cache but
> does nothing for the kernel cache.
Right.
> My thought on this is that for large seqscans we could think about
> doing reads through a file descriptor that's opened with O_DIRECT.
> But writes should never go through O_DIRECT. In some scenarios this
> would mean having two FDs open for the same relation file. This'd
> require moderately extensive changes to the smgr-related APIs, but
> it doesn't seem totally out of the question. I'd kinda like to see
> some experimental evidence that it's worth doing though. Anyone
> care to make a quick-hack prototype and do some measurements?
True, it is a cost/benefit issue. My assumption was that once we have
free-behind in the PostgreSQL shared buffer cache, the kernel cache
issues would be minimal, but I am willing to be found wrong.
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2003-06-23 02:45:06 | Re: O_DIRECT in freebsd |
Previous Message | Weiping He | 2003-06-23 02:41:42 | Re: a problem with index and user define type |