From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgresql latency & bgwriter not doing its job
Date: 2014-08-26 09:34:36
Message-ID: alpine.DEB.2.10.1408261108450.7535@sto
Lists: pgsql-hackers
> Uh. I'm not surprised you're facing utterly horrible performance with
> this. Did you try using a *large* checkpoint_segments setting? To
> achieve high performance
I do not seek "high performance" per se, I seek "lower maximum latency".
I think that the current settings and parameters are designed for high
throughput, but they do not make it possible to control latency, even
under a small load.
> you likely will have to make checkpoint_timeout *longer* and increase
> checkpoint_segments until *all* checkpoints are started because of
> "time".
Well, as I want to test a *small* load in a *reasonable* time, I did not
enlarge the number of segments; otherwise it would take ages.
If I put a "checkpoint_timeout = 1min" and "checkpoint_completion_target =
0.9" so that the checkpoints are triggered by the timeout,
LOG: checkpoint starting: time
LOG: checkpoint complete: wrote 4476 buffers (27.3%); 0 transaction log
file(s) added, 0 removed, 0 recycled; write=53.645 s, sync=5.127 s,
total=58.927 s; sync files=12, longest=2.890 s, average=0.427 s
...
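For reference, this run boils down to something like the postgresql.conf
settings below (log_checkpoints is implied by the messages above, the rest
is what I described):
  # sketch of the settings used for the 1-minute-checkpoint run
  checkpoint_timeout = 1min             # frequent, time-triggered checkpoints
  checkpoint_completion_target = 0.9    # spread writes over ~90% of the interval
  log_checkpoints = on                  # produces the "checkpoint starting/complete" lines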
The result is basically the same (well, 18% of transactions lost, but the
results do not seem stable from one run to the next), only there are more
checkpoints.
I fail to understand how multiplying both the segments and the time would
solve the latency problem. If I set 30 segments then it takes 20 minutes
to fill them, and if I set the timeout to 15min then I'll have to wait 15
minutes just to run a test.
> There's three reasons:
> a) if checkpoint_timeout + completion_target is large and the checkpoint
> isn't executed prematurely, most of the dirty data has been written out
> by the kernel's background flush processes.
Why would they be written by the kernel if bgwriter has not sent them??
> b) The amount of WAL written with less frequent checkpoints is often
> *significantly* lower because fewer full page writes need to be
> done. I've seen production reduction of *more* than a factor of 4.
Sure, I understand that, but ISTM that this test does not exercise this
issue: the load is small, so full-page writes do not matter much.
> c) If checkpoints are infrequent enough, the penalty of them causing
> problems, especially if not using ext4, plays less of a role overall.
I think that what you suggest would only delay the issue, not solve it.
I'll try to run a long test.
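For the record, the kind of small throttled run I have in mind would be
something like the following (the pgbench invocation and its parameters
are only illustrative and to be adjusted):
  pgbench -i -s 100 bench                  # hypothetical scale
  pgbench -c 4 -R 100 -T 3600 -P 10 bench  # ~100 tps for one hour, progress every 10s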
--
Fabien.