From: | Keaton Adams <kadams(at)mxlogic(dot)com> |
---|---|
To: | "Jim C(dot) Nasby" <decibel(at)decibel(dot)org> |
Cc: | pgsql-admin(at)postgresql(dot)org |
Subject: | Re: WAL file utilization question |
Date: | 2007-05-16 16:49:04 |
Message-ID: | 1179334144.22514.14.camel@MXLRMT-208.corp.mxlogic.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Well, no, I'm not that paranoid where I expect checkpoints to be taking
place so often. I do find it interesting that controlling checkpoint
frequency is a factor of the number of WAL files available. In order to
get up to 60 second checkpoints I had to set checkpoint_segments to 25,
which resulted in the creation of 52 log files in data/pg_xlog. So for
120 second checkpoints I will need to have up to 104 log files in
data/pg_xlog available? That's nearly 2 GB of space for WAL logs, and
it only nets on average a 2 minute checkpoint. That would appear to be
the pattern (2 * checkpoint_segments at the busiest time):
# - Checkpoints -
#3 - 8 seconds - 3 log files
#6 - 16 seconds - 13 log files
#9 - 29 seconds - 19 log files
#20 - 57 seconds - 41 log files
#25 - > 60 sec - 52 log files
checkpoint_segments = 25 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 300 # range 30-3600, in seconds
#checkpoint_warning = 60 # in seconds, 0 is off
So, if I do set the checkpoint_segments parameter to 50, 75, or even 100
I will have a considerable number of WAL files in data/pg_xlog, but they
will be used more efficiently, so I will be archiving less? That's what
I understand from the comments received so far.
On PITR it makes sense, given the nature of the WAL file, that only full
pages can be used. So I won't be able to set full_page_writes to false
in this case.
-Keaton
On Wed, 2007-05-16 at 11:35 -0500, Jim C. Nasby wrote:
> On Wed, May 16, 2007 at 10:28:29AM -0600, Keaton Adams wrote:
> > On most database systems I am used to, there is a physical log and
> > separate logical log. The physical log holds any ???before images??? of data
> > pages that have been modified since the last checkpoint. The logical log
> > holds the individual rows of data that have been inserted, updated,
> > deleted as well as checkpoint records and the like. If the server
> > crashes, fast recovery is initiated where (1) any pages in the physical
> > log are put back in the data files (tablespaces/dbspaces) on disk to get
> > back to a state of physical consistency and (2) individual transactions
> > in the logical log since the last checkpoint are rolled forward / rolled
> > back to get to a point of logical consistency.
>
> PostgreSQL combines the two, though there has been recent discussion
> about changing that. There may be some improvements in this regard in
> 8.3 (I don't remember if the patches were accepted or not).
>
> > Even with full_page_writes set to false and checkpoints taking place
> > every 60 seconds or so, the amount of WAL log data generated per minute
> > seems to be significant.
>
> Wow, do you really want to be checkpointing every 60 seconds? That's
> going to greatly increase your WAL volume, as well as the size of WAL.
>
> > So my question is this: If I kick off a transaction that loads records
> > with a size of 100 bytes, does the insert for that record take 100 bytes
> > in the WAL file, or is the data archived in the WAL log in page size
> > (8k) portions? So with piggyback commits if I can only stack up 2K worth
> > of data before the next LogFlush, will 2K be written to the WAL file, or
> > will 8K be written each time regardless of the amount of actual
> > transaction data that is available to flush?
>
> (Generally) WAL only records differences. The exception is that the
> first time a page is modified after a checkpoint, the entire page is
> written out to WAL.
>
> > Since there is no separate physical log to keep track of dirty/modified
> > pages since the last checkpoint I would assume that the WAL log is
> > serving a dual purpose of being able to get back to the point of
> > physical and logical database consistency, but I need to know for
> > certain that there is not a way to reduce the amount of WAL data being
> > written for the amount of transaction information we are actually
> > writing to the database at any given point in time.
>
> The only way to do it right now is to reduce the frequency of your
> checkpoints. IIRC you can't actually disable full page writes if you're
> using PITR.
From | Date | Subject | |
---|---|---|---|
Next Message | Jim C. Nasby | 2007-05-16 16:51:43 | Re: WAL file utilization question |
Previous Message | Alvaro Herrera | 2007-05-16 16:35:20 | Re: WAL file utilization question |