From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Redesigning checkpoint_segments |
Date: | 2013-06-05 18:35:32 |
Message-ID: | 51AF84F4.4000504@vmware.com |
Lists: | pgsql-hackers |
On 05.06.2013 21:16, Fujii Masao wrote:
> On Wed, Jun 5, 2013 at 9:16 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> I propose that we do something similar, but not exactly the same. Let's have
>> a setting, max_wal_size, to control the max. disk space reserved for WAL.
>> Once that's reached (or you get close enough, so that there are still some
>> segments left to consume while the checkpoint runs), a checkpoint is
>> triggered.
>
> What if max_wal_size is reached while the checkpoint is running? We should
> change the checkpoint from spread mode to fast mode?
The checkpoint spreading code already tracks if the checkpoint is "on
schedule", and it takes into account both checkpoint_timeout and
checkpoint_segments. Ie. if you consume segments faster than expected,
the checkpoint will speed up as well. Once checkpoint_segments is
reached, the checkpoint will complete ASAP, with no delays to spread it out.
This would still work the same with max_wal_size. A new checkpoint would
be started well before reaching max_wal_size, so that it has enough time
to complete. If the checkpoint "falls behind", it will hurry up until
it's back on schedule. If max_wal_size is reached anyway, it will
complete ASAP.
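
As a rough illustration of the "on schedule" logic described above, here is a simplified sketch. It is not the actual checkpointer code: the function name, parameters, and the use of megabytes for the WAL budget are assumptions made for the example. The idea is that the fraction of checkpoint work done is compared against both the fraction of time elapsed and the fraction of the WAL budget consumed, and the checkpoint stops sleeping between writes whenever it is behind on either.

```python
def is_on_schedule(progress, elapsed_secs, timeout_secs,
                   wal_used_mb, wal_budget_mb, completion_target=0.5):
    """Return True if the checkpoint may keep spreading out its writes.

    progress          -- fraction of checkpoint work done so far (0.0 to 1.0)
    elapsed_secs      -- time since the checkpoint started
    timeout_secs      -- checkpoint_timeout, in seconds
    wal_used_mb       -- WAL written since the checkpoint started (hypothetical unit)
    wal_budget_mb     -- WAL allowed before the next checkpoint is due (hypothetical)
    completion_target -- fraction of the interval the checkpoint should fit into
    """
    # Scale progress so the checkpoint aims to finish before the
    # interval (by time or by WAL volume) is fully used up.
    scaled = progress * completion_target

    # Behind on WAL: segments are being consumed faster than expected.
    if wal_used_mb / wal_budget_mb > scaled:
        return False

    # Behind on time: checkpoint_timeout is approaching.
    if elapsed_secs / timeout_secs > scaled:
        return False

    return True


# Half the work is done, little WAL or time has been used: keep spreading.
print(is_on_schedule(0.5, 60, 300, 100, 1000))
# Same progress, but WAL is being consumed quickly: hurry up.
print(is_on_schedule(0.5, 60, 300, 400, 1000))
```

Once the WAL budget is fully consumed, the WAL check fails for any checkpoint that still has work left, so the remaining writes are issued without delay. That corresponds to the "complete ASAP" case.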
> Or, if max_wal_size
> is hard limit, we should keep the allocation of new WAL file waiting until
> the checkpoint has finished and removed some old WAL files?
I was not thinking of making it a hard limit. It would be just like
checkpoint_segments from that point of view - if a checkpoint takes a
long time, max_wal_size might still be exceeded.
>> In this proposal, the number of segments preallocated is controlled
>> separately from max_wal_size, so that you can set max_wal_size high, without
>> actually consuming that much space in normal operation. It's just a
>> backstop, to avoid completely filling the disk, if there's a sudden burst of
>> activity. The number of segments preallocated is auto-tuned, based on the
>> number of segments used in previous checkpoint cycles.
>
> How is wal_keep_segments handled in your approach?
Hmm, haven't thought about that. I think a better unit to set
wal_keep_segments in would also be MB, not segments. Perhaps
max_wal_size should include WAL retained for wal_keep_segments, leaving
less room for checkpoints. Ie. when you set wal_keep_segments
higher, an xlog-based checkpoint would be triggered earlier, because the
old segments kept for replication would leave less room for new
segments. And setting wal_keep_segments higher than max_wal_size would
be an error.
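
A minimal sketch of how that interaction could look, assuming both settings are expressed in MB. The function and parameter names are hypothetical, and the plain subtraction is only one possible reading of "leaving less room"; nothing here is an existing PostgreSQL interface.

```python
def wal_room_for_checkpoints_mb(max_wal_size_mb, wal_keep_mb):
    """Return the WAL volume left for checkpoint cycles after retention.

    max_wal_size_mb -- proposed soft cap on total WAL disk usage
    wal_keep_mb     -- WAL retained for replication (wal_keep_segments, in MB)
    """
    # Retaining more WAL than the overall cap allows is rejected outright.
    if wal_keep_mb > max_wal_size_mb:
        raise ValueError("wal_keep_mb must not exceed max_wal_size_mb")

    # Assumption: retained WAL is simply subtracted from the cap, so a
    # higher retention setting triggers xlog-based checkpoints earlier.
    return max_wal_size_mb - wal_keep_mb


print(wal_room_for_checkpoints_mb(1024, 256))  # 768 MB left for new segments
```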
- Heikki