Re: High checkpoint_segments

From: Venkat Balaji <venkat(dot)balaji(at)verse(dot)in>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Jay Levitt <jay(dot)levitt(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: High checkpoint_segments
Date: 2012-02-15 13:23:51
Message-ID: CAFrxt0jnGNy_8WM70w6g5rm14PgLn-5d3JaK_YHJp9FUwzBHcg@mail.gmail.com
Lists: pgsql-general

> > Data loss would be an issue when there is a server crash or a pg_xlog
> > crash etc. That many pg_xlog files (1000) would contribute to huge data
> > loss (data changes not synced to the base are not guaranteed). Of course,
> > this is not related to the current situation. Normally we calculate the
> > checkpoint completion time, IO pressure, CPU load and the threat of
> > data loss when we configure checkpoint_segments.
>
> So you're saying that by using a small number of checkpoint segments you
> limit the data loss when the WAL gets corrupted/lost? That's a bit like
> buying a Maserati and then not going faster than 10mph because you might
> crash at higher speeds ...
>

No, I am not saying that checkpoint_segments must be low. I was just
trying to explain the IO overhead of setting checkpoint_segments very high
(as high as 1000). A lower number of checkpoint segments will lead to more
frequent checkpoints, and therefore more IO, which is not good. Agreed.
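
As a purely illustrative sketch of the trade-off, assuming a 9.x-era
PostgreSQL where checkpoint_segments still exists (it was replaced by
max_wal_size in 9.5), the relevant postgresql.conf knobs look like the
following; the values are examples, not recommendations:

    # postgresql.conf -- illustrative values only
    checkpoint_segments = 32            # each WAL segment is 16 MB;
                                        # higher = fewer but larger checkpoints
    checkpoint_timeout = 5min           # force a checkpoint at least this often
    checkpoint_completion_target = 0.5  # spread checkpoint writes over this
                                        # fraction of the interval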

>
> The problem here is that the WAL is usually placed on more reliable drives
> (compared to the data files) or a RAID1 array, and as it's just writing
> data sequentially, the usage pattern is much less likely to cause
> data/drive corruption (compared to data files that need to handle a lot of
> random I/O, etc.).
>

Agreed.

> So while it is possible the WAL might get corrupted, the probability of
> data file corruption is much higher. And the corruption might easily happen
> silently during a checkpoint, so there won't be any WAL segments for it, no
> matter how many of them you keep ...
>

Agreed. When corruption occurs, it really does not matter how many WAL
segments are kept in pg_xlog. But at any point in time, if PG needs to
perform crash recovery, it will depend on the WAL segments still in
pg_xlog (more on this below).

> And by using a low number of checkpoint segments it actually gets worse,
> because it means more frequent checkpoints -> more I/O on the drives ->
> more wear-out of the drives, etc.
>

Completely agreed, as mentioned above. I choose checkpoint_segments and
checkpoint_timeout only after observing the checkpoint behavior.
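
For what it's worth, one way to observe it (standard facilities, nothing
exotic): set log_checkpoints = on and watch pg_stat_bgwriter; if
checkpoints_req keeps outpacing checkpoints_timed, checkpoint_segments is
too low for the write load:

    -- requested vs. timed checkpoints since stats were last reset
    SELECT checkpoints_timed,    -- triggered by checkpoint_timeout
           checkpoints_req,      -- triggered by filling checkpoint_segments
           buffers_checkpoint    -- buffers written out by checkpoints
      FROM pg_stat_bgwriter;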

> If you need to protect yourself against this, you need to keep a WAL
> archive (preferably on a separate machine) and/or a hot standby for
> failover.
>

WAL archiving is a different situation, wherein you back up the pg_xlog
files by enabling archiving. I was referring specifically to the situation
where pg_xlog is not archived, the data has not yet been synced to the base
files (by the bgwriter), and the system crashes. PG would then depend on
pg_xlog to recover and reach a consistent state; if pg_xlog is also
unavailable, there would be data loss, and how much depends on how much
data is present in the pg_xlog files.
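
For completeness, enabling archiving in that era looks roughly like the
sketch below (archive_mode needs a server restart; the cp destination is
just a placeholder for whatever reliable, preferably remote, storage you
use):

    # postgresql.conf -- sketch, not a production archive_command
    wal_level = archive          # or 'hot_standby' if you also run a standby
    archive_mode = on
    archive_command = 'cp %p /path/to/archive/%f'  # must return 0 only on success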

Thanks,
VB
