Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> It seems like we ought to distinguish heap cleanup activities from
> user-visible semantics (IOW, users shouldn't care if a HOT cleanup
> has to be done over after restart, so if the transaction only
> wrote such records there's no need to flush). This'd require more
> process-global state than we keep now, I'm afraid.

That makes sense, and seems like the right long-term fix. It seems
like a boolean might do it; the trick would be setting it (or not)
in all the right places.

> Another approach we could take (also nontrivial) is to prevent
> select-only queries from doing HOT cleanups. You said upthread
> that there were alleged performance benefits from aggressive
> cleanup, but IMO that can charitably be described as unproven.
> The real reason it happens is that we didn't see a simple way for
> page fetches to know soon enough whether a tuple update would be
> likely to happen later, so they just do cleanups unconditionally.

Hmm. One trivial change could be to skip it when the top level
transaction is declared to be READ ONLY. At least that would give
people a way to work around it for now. Of course, that can't be
back-patched before 9.1 because subtransactions could override READ
ONLY before that.

-Kevin