From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
Cc: | Ants Aasma <ants(at)cybertec(dot)at>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Vacuum, Freeze and Analyze: the big picture |
Date: | 2013-06-03 21:35:33 |
Message-ID: | CA+TgmoYCd8Gve126F_vppPi-eEz6Wo+WGKB6=k1S9AGyabCxWg@mail.gmail.com |
Lists: | pgsql-hackers

On Mon, Jun 3, 2013 at 3:48 PM, Martijn van Oosterhout
<kleptog(at)svana(dot)org> wrote:
> On Mon, Jun 03, 2013 at 11:27:57AM +0300, Ants Aasma wrote:
>> > I can't rule that out. Personally, I've always attributed it to the
>> > fact that it's (a) long and (b) I/O-intensive. But it's not
>> > impossible there could also be bugs lurking.
>>
>> It could be related to the OS. I have no evidence for or against, but
>> it's possible that OS write-out routines defeat the careful cost based
>> throttling that PostgreSQL does by periodically dumping a large
>> portion of dirty pages into the write queue at once. That does nasty
>> things to query latencies as evidenced by the work on checkpoint
>> spreading.
>
> In other contexts I've run into issues relating to large continuous
> writes stalling. The issue is basically that the Linux kernel allows
> (by default) writes to pile up until they reach 5% of physical memory
> before deciding that the sucker who wrote the last block becomes
> responsible for writing the whole lot out. At full speed of course.
> Depending on the amount of memory and the I/O speed of your disks this
> could take a while, and interfere with other processes.
>
> This leads to extremely bursty I/O behaviour.
>
> The solution, as usual, is to make it more aggressive, so the
> kernel background writer triggers at 1% memory.
>
> I'm not saying that's the problem here, but it is an example of a
> situation where the write queue can become very large very quickly.

Yeah. IMHO, the Linux kernel's behavior around the write queue is
flagrantly insane. The threshold for background writing really seems
like it ought to be zero. I can see why it makes sense to postpone
writing back dirty data if we're otherwise starved for I/O. But it
seems like the kernel is disposed to cache large amounts of dirty data
for an unbounded period of time even if the I/O system is completely
idle, and it's difficult to imagine what class of user would find that
behavior desirable.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
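
To put rough numbers on the thresholds discussed in this message, here is a minimal sketch that is not part of the original email. It assumes the 5% and 1% figures correspond to the Linux sysctls vm.dirty_ratio / vm.dirty_background_ratio (exposed under /proc/sys/vm), and it simplifies by applying the percentage to total RAM, whereas the kernel applies it to available memory. The function names, the 64 GiB machine, and the 100 MB/s disk are all hypothetical values chosen for illustration.

```python
from pathlib import Path


def dirty_threshold_bytes(total_ram_bytes, ratio_percent):
    """Approximate amount of dirty page cache allowed at a given ratio.

    Simplification: uses total RAM; the kernel uses available memory.
    """
    return int(total_ram_bytes * ratio_percent / 100)


def flush_seconds(dirty_bytes, disk_write_bytes_per_sec):
    """Time to write out the backlog at a sustained sequential rate."""
    return dirty_bytes / disk_write_bytes_per_sec


def read_vm_setting(name, proc_root="/proc/sys/vm"):
    """Return the integer value of a vm sysctl, or None if unreadable."""
    try:
        return int((Path(proc_root) / name).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    ram = 64 * 2**30        # hypothetical 64 GiB machine
    disk_rate = 100 * 10**6  # hypothetical 100 MB/s sustained writes

    # Compare the 5% and 1% thresholds mentioned in the thread.
    for ratio in (5, 1):
        backlog = dirty_threshold_bytes(ram, ratio)
        secs = flush_seconds(backlog, disk_rate)
        print(f"{ratio}% of RAM: {backlog / 2**20:.0f} MiB dirty, "
              f"about {secs:.0f} s to flush")

    # Show what this host is configured with, if the files are readable.
    for name in ("dirty_background_ratio", "dirty_ratio"):
        print(name, "=", read_vm_setting(name))
```

Under these assumptions, lowering the threshold shrinks the backlog that can accumulate before write-out starts, which is the reason the thread suggests a smaller value reduces bursts; actual behaviour also depends on settings such as the dirty-page expiry interval and on the I/O scheduler.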