Re: question regarding full_page_write

From: Martín Marqués <martin(dot)marques(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: AI Rumman <rummandba(at)gmail(dot)com>, pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: question regarding full_page_write
Date: 2011-08-24 15:12:00
Message-ID: CABeG9Ls7d1pJ_VkxzGLxHro5EUTFos_+Xt-H3u_oF6UKAGqNMQ@mail.gmail.com
Lists: pgsql-general

On 22 August 2011 18:39, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> On 08/22/2011 05:07 PM, Martín Marqués wrote:
>>
>> My question regarding your answer is, why is it important for the
>> first page after a checkpoint and not on other page writes?
>>
>
> The first time a page is written after a checkpoint, when full_page_writes
> is on, the entire 8K page is written out to disk at that point.  The idea is
> that if the page is corrupted in any way by a partial write, you can restore
> it to a known good state again by using this version.  After that copy,
> though, additional modifications to the page only need to save the delta of
> what changed, at the row level.  If there's a crash, during recovery the
> full page image will be written, then the series of deltas, ending up with
> the same data as was intended.
>
> This whole mechanism resets again each time a checkpoint finishes, and the
> full page writes start all over again.  One of the main purposes of
> checkpoints is to move forward the pointer of how far back crash recovery
> needs to replay from.  Starting each new checkpoint over again, with a full
> copy of all the data modified going into the WAL, is part of that logic.
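
To check that I follow the mechanics described above, here is a toy Python
model of it. This is only a sketch of my reading of the explanation, not
PostgreSQL's actual code: the names ToyWal, log_change, recover and demo are
made up for illustration, and the "crash" is just a hand-made torn page.

```python
PAGE_SIZE = 8192  # the "8K" page from the explanation above


class ToyWal:
    """Hypothetical, heavily simplified write-ahead log."""

    def __init__(self):
        self.records = []    # (kind, page_no, payload) in log order
        self.imaged = set()  # pages that logged a full image since the last checkpoint

    def checkpoint(self):
        # After a checkpoint, replay starts from here and the
        # "first touch" tracking resets for every page.
        self.records.clear()
        self.imaged.clear()

    def log_change(self, buffers, page_no, offset, data):
        page = buffers[page_no]
        page[offset:offset + len(data)] = data
        if page_no not in self.imaged:
            # First change after a checkpoint: log the whole page.
            self.records.append(("full", page_no, bytes(page)))
            self.imaged.add(page_no)
        else:
            # Later changes: log only what changed.
            self.records.append(("delta", page_no, (offset, bytes(data))))


def recover(disk, wal):
    """Replay the log: restore full images, then apply deltas on top."""
    for kind, page_no, payload in wal.records:
        if kind == "full":
            disk[page_no] = bytearray(payload)
        else:
            offset, data = payload
            disk[page_no][offset:offset + len(data)] = data
    return disk


def demo():
    disk = {0: bytearray(PAGE_SIZE)}
    buffers = {0: bytearray(disk[0])}
    wal = ToyWal()
    wal.checkpoint()
    wal.log_change(buffers, 0, 100, b"first")    # logged as a full page
    wal.log_change(buffers, 0, 5000, b"second")  # logged as a delta

    # Simulated crash in the middle of writing the page to disk:
    # the first half is new, the second half is garbage.
    half = PAGE_SIZE // 2
    disk[0][:half] = buffers[0][:half]
    disk[0][half:] = b"\xff" * (PAGE_SIZE - half)

    recover(disk, wal)
    kinds = [kind for kind, _, _ in wal.records]
    return kinds, disk[0] == buffers[0]
```

Calling demo() returns the record kinds in log order and whether the
recovered page matches what was in memory before the simulated crash.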

Still something missing (for me :-)):

A checkpoint happens, logs get flushed to disk, WAL segments get
archived if archiving is on, and then they are recycled.

Now a new transaction gets committed, and at least one page has to go to
WAL (pages are 8 kB). full_page_writes guarantees that all 8 kB are
written or nothing. After this transaction comes another, but here you
don't need the guarantee of all 8 kB in WAL (you say because deltas are
enough).

Why aren't deltas good enough for the first 8 kB? Is there other
information in the first 8 kB that makes it more important?

--
Martín Marqués
select 'martin.marques' || '@' || 'gmail.com'
DBA, Programmer, Administrator
