From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Venkata Balaji N <nag1010(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Redesigning checkpoint_segments
Date: 2015-02-03 21:18:34
Message-ID: 54D13B2A.6040808@agliodbs.com
Lists: pgsql-hackers

On 02/03/2015 07:50 AM, Robert Haas wrote:
> On Tue, Feb 3, 2015 at 10:44 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> That's the whole point of this patch. "max_checkpoint_segments = 10", or
>> "max_checkpoint_segments = 160 MB", means that the system will begin a
>> checkpoint so that when the checkpoint completes, and it truncates away or
>> recycles old WAL, the total size of pg_xlog is 160 MB.
>>
>> That's different from our current checkpoint_segments setting. With
>> checkpoint_segments, the documented formula for calculating the disk usage
>> is (2 + checkpoint_completion_target) * checkpoint_segments * 16 MB. That's
>> a lot less intuitive to set.
>
> Hmm, that's different from what I was thinking. We probably shouldn't
> call that max_checkpoint_segments, then. I got confused and thought
> you were just trying to decouple the number of segments that it takes
> to trigger a checkpoint from the number we keep preallocated.
Wait, what? Because the new setting is an actual soft maximum, we
*shouldn't* call it a maximum? Or are you saying something else?
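For readers following along, the difference between the two sizing models Heikki describes can be sketched with a quick calculation (a sketch with example values; the old formula is the one documented for checkpoint_segments):

```python
# Compare the old documented pg_xlog sizing formula with the proposed
# size-based setting, where the value itself is the post-checkpoint cap.

SEGMENT_MB = 16  # size of one WAL segment

def old_model_usage_mb(checkpoint_segments, checkpoint_completion_target=0.5):
    """Approximate pg_xlog size under the old checkpoint_segments setting:
    (2 + checkpoint_completion_target) * checkpoint_segments * 16 MB."""
    return (2 + checkpoint_completion_target) * checkpoint_segments * SEGMENT_MB

# With checkpoint_segments = 10 the old formula does NOT give 160 MB:
print(old_model_usage_mb(10))   # 400.0 (MB, with the default target of 0.5)

# Under the proposed setting, "max_checkpoint_segments = 160 MB" means
# pg_xlog is about 160 MB once a checkpoint truncates or recycles old WAL:
print(10 * SEGMENT_MB)          # 160 (MB)
```

This is what makes the new setting easier to reason about: the number the admin types is the number of megabytes they should expect on disk.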
On 02/03/2015 04:25 AM, Heikki Linnakangas wrote:
> On 02/02/2015 04:21 PM, Andres Freund wrote:
>> I think we need to increase checkpoint_timeout too - that's actually
>> just as important for the default experience from my pov. 5 minutes
>> often just unnecessarily generates FPWs en masse.
I have yet to see any serious benchmarking on checkpoint_timeout. It
does seem that for some workloads on some machines a longer timeout is
better, but I've also seen workloads where a longer timeout decreases
throughput or increases I/O. So absent some hard numbers, I'd be
opposed to changing the default.
>>
>>> I'll open the bidding at 1600MB (aka 100).
>>
>> Fine with me.
>
> I wouldn't object to raising it a little bit, but that's way too high.
> It's entirely possible to have a small database that generates a lot of
> WAL. A table that has only a few rows, but is updated very very
> frequently, for example. And checkpointing such a database is quick too,
> so frequent checkpoints are not a problem. You don't want to end up with
> 1.5 GB of WAL on a 100 MB database.
I suggest 192MB instead (12 segments). That almost doubles our current
real default without requiring an amount of disk space that might
surprise some users.
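For reference, here is how the candidate defaults in this thread translate into segment counts under the proposed size-based setting (a sketch assuming the standard 16 MB segment size):

```python
SEGMENT_MB = 16  # standard WAL segment size

def segments_for(size_mb):
    """Number of 16 MB WAL segments that fit under the proposed size cap."""
    return size_mb // SEGMENT_MB

print(segments_for(192))    # 12  (the 192MB suggestion)
print(segments_for(1600))   # 100 (the 1600MB opening bid upthread)
```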
In practice, checkpoint_segments is impossible to automatically tune
correctly. So let's be conservative.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com