From: Venkat Balaji <venkat(dot)balaji(at)verse(dot)in>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Jay Levitt <jay(dot)levitt(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: High checkpoint_segments
Date: 2012-02-15 13:23:51
Message-ID: CAFrxt0jnGNy_8WM70w6g5rm14PgLn-5d3JaK_YHJp9FUwzBHcg@mail.gmail.com
Lists: pgsql-general
> > Data loss would be an issue when there is a server crash or pg_xlog
> > crash etc. That many pg_xlog files (1000) would contribute to a huge
> > data loss (data changes not yet synced to the base files are not
> > guaranteed). Of course, this is not related to the current situation.
> > Normally we weigh the checkpoint completion time, I/O pressure, CPU
> > load and the threat of data loss when we configure checkpoint_segments.
>
> So you're saying that by using a small number of checkpoint segments you
> limit the data loss when the WAL gets corrupted/lost? That's a bit like
> buying a Maserati and then not going faster than 10mph because you might
> crash at higher speeds ...
>
No. I am not saying that checkpoint_segments must be lower. I was just
trying to explain the I/O overhead of setting checkpoint_segments high (as
high as 1000). A lower number of checkpoint segments leads to more frequent
checkpoints, and hence more I/O, which is not good. Agreed.
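For illustration, the relevant postgresql.conf settings on a 9.1-era
server would look something like this (the values are only a sketch, not
a recommendation; the right numbers depend on the observed checkpoint
behavior and the I/O capacity of the drives):

    # postgresql.conf
    checkpoint_segments          = 32     # WAL segments between checkpoints
    checkpoint_timeout           = 15min  # force a checkpoint at least this often
    checkpoint_completion_target = 0.9    # spread checkpoint I/O over the interval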
>
> The problem here is that the WAL is usually placed on more reliable drives
> (compared to the data files) or a RAID1 array, and as it's just writing
> data sequentially, the usage pattern is much less likely to cause
> data/drive corruption (compared to data files that need to handle a lot of
> random I/O, etc.).
>
Agreed.
> So while it is possible the WAL might get corrupted, the probability of
> data file corruption is much higher. And the corruption might easily
> happen silently during a checkpoint, so there won't be any WAL segments no
> matter how many of them you keep ...
>
Agreed. When corruption occurs, it really does not matter how many WAL
segments are kept in pg_xlog. But at any point of time, if PG needs to
recover after a crash, it depends on the WAL in pg_xlog to reach a
consistent state.
> And by using a low number of checkpoint segments it actually gets worse,
> because it means more frequent checkpoints -> more I/O on the drives ->
> more wear on the drives etc.
>
Completely agreed. As mentioned above, I choose checkpoint_segments and
checkpoint_timeout once I observe the checkpoint behavior.
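To observe that behavior, a simple first step is to enable checkpoint
logging, along these lines:

    # postgresql.conf
    log_checkpoints = on   # logs each checkpoint with its trigger
                           # ("xlog" = segments filled, "time" = timeout),
                           # plus buffers written and sync time

If most entries in the log read "checkpoint starting: xlog" rather than
"checkpoint starting: time", checkpoint_segments is too low for the
write load.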
> If you need to protect yourself against this, you need to keep a WAL
> archive (preferably on a separate machine) and/or a hot standby for
> failover.
>
WAL archiving is a different situation, wherein you back up the pg_xlog
files by enabling archiving.
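For completeness, a minimal archiving setup would be something like the
following sketch (the /mnt/archive path is just a placeholder):

    # postgresql.conf (9.1-era)
    wal_level       = archive
    archive_mode    = on
    # copy each completed WAL segment, refusing to overwrite an existing file
    archive_command = 'test ! -f /mnt/archive/%f && cp %p /mnt/archive/%f'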
I was referring to an exclusive situation wherein the pg_xlogs are not
archived, the data has not yet been synced to the base files (by the
bgwriter), and the system crashes. PG would then depend on pg_xlog to
recover and reach a consistent state, and if pg_xlog is also not
available, there would be data loss, the extent of which depends on how
much data is present in the pg_xlog files.
Thanks,
VB