From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Helge Bahmann <bahmann(at)math(dot)tu-freiberg(dot)de> |
Cc: | Janardhana Reddy <jana-reddy(at)mediaring(dot)com(dot)sg>, pgsql-patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: WAL Performance Improvements |
Date: | 2002-02-26 16:48:53 |
Message-ID: | 428.1014742133@sss.pgh.pa.us |
Lists: | pgsql-hackers pgsql-patches |
Helge Bahmann <bahmann(at)math(dot)tu-freiberg(dot)de> writes:
> This is not to say that your WAL optimization is worthless, but the
> benchmark you gave is certainly wrong.
There is actually good reason to think that the change would be a net
loss in many scenarios. The problem is that if you write a partial
filesystem block, the kernel must first read in the old contents of
the block, then overlay the data you've specified to write onto the
appropriate part of the buffer. That disk read takes time --- and
what's worse, it's physical I/O that will be done while the process
requesting the write is holding the WALWriteLock. (AFAIK the kernel
will not absorb the user data until it's got a buffer to dump it
into; anyone want to dig into kernel sources and confirm that?)
On the other hand, when you write a full block, there's no need to
read the old block contents. The user data will just be copied to
a freshly-allocated kernel disk buffer. This is why I suggested
that the first write of a WAL block should write the entire block.
We can hope that subsequent writes of just part of the block will find
the block still in kernel disk buffers, and so avoid a read operation.
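
The first-write rule described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not PostgreSQL's actual xlog code: `WAL_BLCKSZ`, `WriteSpan`, and `plan_block_write` are hypothetical names, and the 8192-byte block size is assumed. The function only decides which byte range to hand to the kernel; it performs no I/O itself.

```c
#include <stddef.h>
#include <string.h>

#define WAL_BLCKSZ 8192         /* assumed WAL block size (hypothetical name) */

/* Byte range within one WAL block that should be passed to write(). */
typedef struct WriteSpan
{
    size_t      offset;         /* start of the range within the block */
    size_t      length;         /* number of bytes to write */
} WriteSpan;

/*
 * Plan the write for a single WAL block buffer.
 *
 * flushed: bytes of this block already written out (0 = never written)
 * filled:  bytes of valid data now in the buffer
 * Caller must ensure flushed <= filled <= WAL_BLCKSZ.
 *
 * On the first write, the unused tail is zeroed and the whole block is
 * written, so the kernel has no reason to read the old block contents.
 * Later writes cover only the newly added bytes, in the hope that the
 * block is still sitting in the kernel's buffers.
 */
static WriteSpan
plan_block_write(char *block, size_t flushed, size_t filled)
{
    WriteSpan   span;

    if (flushed == 0)
    {
        /* zero-fill the part of the block that holds no WAL data yet */
        memset(block + filled, 0, WAL_BLCKSZ - filled);
        span.offset = 0;
        span.length = WAL_BLCKSZ;
    }
    else
    {
        span.offset = flushed;
        span.length = filled - flushed;
    }
    return span;
}
```

A caller would pass `block + span.offset` and `span.length` to its write call at the matching file position. Whether the later partial writes really avoid a read depends on the block staying cached, which this sketch cannot guarantee.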
AFAICS the only real win that can be gotten with a change like
Janardhana's would be to avoid writing multiple blocks in the case
where the filesystem block size is smaller than the xlog's BLCKSZ.
Tuning this correctly would require knowing the kernel's block size.
Anyone have ideas about a portable way to find that out?
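
One candidate, offered only as a sketch, is the POSIX `statvfs()` call, whose `f_bsize` field reports a file system block size. Whether that value matches the granularity at which a given kernel does its read-modify-write is an assumption that would need checking per platform. `WAL_BLCKSZ`, `probe_fs_block_size`, and `partial_write_length` are hypothetical names, and the 8192-byte block size is assumed.

```c
#include <stddef.h>
#include <sys/statvfs.h>

#define WAL_BLCKSZ 8192         /* assumed WAL block size (hypothetical name) */

/*
 * Ask the file system holding 'path' for its block size.
 * Returns -1 if statvfs() fails.  The result is only a hint: nothing
 * guarantees it equals the kernel's buffer granularity.
 */
static long
probe_fs_block_size(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) != 0)
        return -1;
    return (long) vfs.f_bsize;
}

/*
 * Given 'filled' valid bytes in a WAL block, return how many bytes to
 * write so that only whole file system blocks are touched.  If the
 * file system block is unknown (0) or at least as large as the WAL
 * block, the whole WAL block is written.
 */
static size_t
partial_write_length(size_t filled, size_t fs_blksz)
{
    size_t      len;

    if (fs_blksz == 0 || fs_blksz >= WAL_BLCKSZ)
        return WAL_BLCKSZ;

    /* round up to the next multiple of the file system block size */
    len = ((filled + fs_blksz - 1) / fs_blksz) * fs_blksz;
    return (len < WAL_BLCKSZ) ? len : WAL_BLCKSZ;
}
```

With a 1024-byte file system block, a WAL block holding 100 bytes of data would need only a 1024-byte write instead of 8192, and every file system block touched is written in full, so no read of old contents should be needed.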
regards, tom lane