Re: Turning off HOT/Cleanup sometimes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Turning off HOT/Cleanup sometimes
Date: 2015-04-15 16:39:51
Message-ID: CA+Tgmob=2nPpXiAMQ2kxaeNTrDiPRJOt3PdcGFP_tc+1ot2muQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 15, 2015 at 8:42 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> I won't take responsibility for paying my neighbor's tax bill, but I
>> might take responsibility for picking up his mail while he's on
>> holiday.
>
> That makes it sound like this is an occasional, non-annoying thing.
>
> It's more like, whoever fetches the mail needs to fetch it for
> everybody. So we are slowing down one person disproportionately, while
> others fly through without penalty. There is no argument that one
> workload necessarily needs to perform that on behalf of the other
> workload.

Sure there is. It's called a tragedy of the commons - everybody acts
in their own selfish interest (it's not *my* responsibility to limit
grazing on public land, or prune this page that I'm not modifying) and
as a result some resource that everybody cares about (grass,
system-wide I/O) gets trashed to everyone's detriment. Purely selfish
behavior can only be justified here if we assume that the selfish
actor intends to participate in the system only once: I'm going to run
one big reporting query which must run as fast as possible, and then
I'm getting on a space ship to Mars. So if my refusal to do any
pruning during that reporting query causes lots of extra I/O on the
system ten minutes from now, I don't care, because I'll have left the
playing field forever at that point.

As Heikki points out, any HOT pruning operation generates WAL and has
a CPU cost. However, pruning a page that is currently dirty
*decreases* the total volume of writes to the data files, whereas
pruning a page that is currently clean *increases* the total volume of
writes to the data files. In the first case, if we prune the page
right now while it's still dirty, we can't possibly cause any
additional data-file writes, and we may save one, because it's
possible that someone else would later have pruned it when it was
clean and there was no other reason to dirty it. In the second case,
if we prune the page that is currently clean, it will become dirty.
That will cost us no additional I/O if the page is again modified
before it's written out, but otherwise it costs an additional data
file write. I think there's a big difference between those two cases.
Sure, from the narrow point of view of how much work it takes this
scan to process this page, it's always better not to prune. But if
you make the more realistic assumption that you will keep on issuing
queries on the system, then what you're doing to the overall system
I/O load is pretty important.

By the way, was anything ever done about this:

http://www.postgresql.org/message-id/20140929091343.GA4716@alap3.anarazel.de

That's just a workload that is 5/6th pgbench -S and 1/6th pgbench,
which is in no way an unrealistic workload, and showed a significant
regression with an earlier version of the patch. You seem very eager
to commit this patch after four months of inactivity, but I think this
is a pretty massive behavior change that deserves careful scrutiny
before it goes in. If we push something that changes longstanding
behavior and can't even be turned off, and it regresses behavior for a
use case that common, our users are going to come after us with
pitchforks. That's not to say some people won't be happy, but in my
experience it takes a lot of happy users to make up for getting
stabbed with even one pitchfork.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Qingqing Zhou 2015-04-15 16:52:52 Assert there is no duplicated exit callbacks
Previous Message Alvaro Herrera 2015-04-15 16:23:38 Re: [COMMITTERS] pgsql: Move pg_upgrade from contrib/ to src/bin/