Re: Need help understanding WAL and checkpoints

From: drew_hunt1976(at)yahoo(dot)co(dot)jp
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Need help understanding WAL and checkpoints
Date: 2013-02-07 10:16:21
Message-ID: 325825.13507.qm@web4505.mail.ogk.yahoo.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Albe

--- On Wed, 2013/2/6, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:

> drew_hunt wrote:
> > I'm trying to get my head around WAL and checkpoints and need to ask a couple of questions before I
> > get a headache.
> >
> > Firstly, I see the terms "WAL log", "WAL file" and "transaction log" all over the place - are these
> > the same thing (i.e. files in the pg_xlog directory)?
>
> Usually they mean the same thing.
> For exact definitions, read
> http://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL
> "WAL" consists of a stream of "WAL records" and is physically
> represented as "WAL segment files" (in pg_xlog).
>
> > I'm a bit confused by this paragraph in the docs:
> >
> > "Checkpoints are points in the sequence of transactions at which it is guaranteed that the heap and
> > index data files have been updated with all information written before the checkpoint. At checkpoint
> > time, all dirty data pages are flushed to disk and a special checkpoint record is written to the log
> > file. (The changes were previously flushed to the WAL files.)"
> >
> > ( see: http://www.postgresql.org/docs/current/static/wal-configuration.html )
> >
> > "a special checkpoint record is written to the log file."
> >     -> which log file is meant here?
>
> The WAL.
>
> > "The changes were previously flushed to the WAL files."
> >    -> does "previously" here mean "at a previous point in time" or "in previous PostgreSQL versions"?
>
> The former.
>
> >    -> at what point are changes flushed to WAL files?
>
> The change must be on disk in a WAL segment before the
> transaction can commit.
>
> > So say I perform an operation like :
> >
> >   UPDATE foo SET bar='baz'
> >
> > are the following assumptions correct?
> >
> > - The first time this changed data hits the disk, it is as an entry in the WAL log
> >
> > - At some point a checkpoint occurs, and the changed data is written to the actual data file from
> > system memory (the "dirty data pages"?)
> >
> > - the only time the actual data files will be updated from the WAL log (i.e. not from system memory)
> > will be after a crash, when the logs are replayed from the last checkpoint?
>
> All three are correct.

Many thanks for the answers, its cleared things up for me :)

- drew

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Anoop K 2013-02-07 10:19:38 Re: REINDEX deadlock - Postgresql -9.1
Previous Message Scott Marlowe 2013-02-07 10:07:41 Re: REINDEX deadlock - Postgresql -9.1