Re: termination of backend waiting for sync rep generates a junk log message

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: termination of backend waiting for sync rep generates a junk log message
Date: 2011-10-23 22:34:47
Message-ID: 29610.1319409287@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Well, there is a general problem that anything which throws an ERROR
> too late in the commit path is Evil; and sync rep makes that worse to
> the extent that it adds more stuff late in the commit path, but it
> didn't invent the problem.

BTW, it strikes me that if we want to do something about that, it ought
to be possible; but it has to be built into error handling, not a
localized hack for sync rep.

Consider a design along these lines: we invent a global flag that gets
set at some appropriate point in RecordTransactionCommit (probably right
where we exit the commit critical section) and is not cleared until we
send a suitable message to the client --- I think either
command-complete or an error message would qualify, but that would have
to be analyzed more carefully than I've done so far. If elog.c is told
to send an error message while this flag is set, then it does something
special to inform the client that this was a post-commit error and the
xact is in fact committed.
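
To make the flag idea concrete, here is a minimal standalone sketch in C. It is a toy model, not PostgreSQL source: the flag name xact_committed_unreported, the ClientMessage struct, and all three functions are hypothetical stand-ins for RecordTransactionCommit, the elog.c send-to-client path, and the command-complete path. Exactly when the flag should be cleared is the part that would still need the careful analysis mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical global flag: true from the end of the commit critical
 * section until the client has been sent a message about the outcome.
 */
static bool xact_committed_unreported = false;

/* Toy model of a protocol message; not the real wire format. */
typedef struct ClientMessage
{
    char kind;          /* 'C' = command complete, 'E' = error */
    bool post_commit;   /* hypothetical marker: error raised after commit */
    char text[128];
} ClientMessage;

/* Stand-in for the point where RecordTransactionCommit leaves its critical section. */
void record_transaction_commit(void)
{
    /* ... commit record is durable at this point ... */
    xact_committed_unreported = true;
}

/* Stand-in for the elog.c path that reports an ERROR to the client. */
ClientMessage send_error_to_client(const char *text)
{
    ClientMessage msg;

    assert(text != NULL);
    msg.kind = 'E';
    /* The "something special": tag the error if the commit already happened. */
    msg.post_commit = xact_committed_unreported;
    snprintf(msg.text, sizeof(msg.text), "%s", text);

    /* The client has now been told, so the flag can be cleared. */
    xact_committed_unreported = false;
    return msg;
}

/* Stand-in for sending command-complete, which also clears the flag. */
ClientMessage send_command_complete(void)
{
    ClientMessage msg;

    msg.kind = 'C';
    msg.post_commit = false;
    snprintf(msg.text, sizeof(msg.text), "%s", "COMMIT");
    xact_committed_unreported = false;
    return msg;
}
```

In this model an error sent before record_transaction_commit() carries post_commit = false, an error sent between the commit and the next client message carries post_commit = true, and any later error is untagged again because the flag was cleared when the first message went out.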

My inclination for the "something special" would be to add a new error
message field, but that could be difficult for clients to examine
depending on what sort of driver infrastructure they're dealing with.
You could also imagine emitting a separate NOTICE or WARNING message,
which is analogous to the current hack in SyncRepWaitForLSN, but seems
pretty ugly because it requires clients to re-associate that event with
the later error message. (But it might be worth doing anyway for human
users, even if we provide a different flag mechanism that is intended
for program consumption.) Or maybe we could override the SQLSTATE with
some special value. Or something else.
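
The three candidate signals can be compared side by side in a small toy model. Everything here is an assumption made for illustration: the Msg struct, the MarkStyle enum, the committed_field member, and especially the SQLSTATE value "XXC01", which is invented and not a real PostgreSQL code. The sketch only shows what a client would have to inspect in each case; note that the separate-WARNING style is the only one that produces two messages the client must re-associate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical ways of telling the client "this error happened after commit". */
typedef enum
{
    MARK_EXTRA_FIELD,       /* new error message field */
    MARK_SEPARATE_WARNING,  /* WARNING emitted ahead of the ERROR */
    MARK_SQLSTATE_OVERRIDE  /* ERROR's SQLSTATE replaced by a special value */
} MarkStyle;

/* Toy message; not the real protocol layout. */
typedef struct
{
    char severity[8];       /* "ERROR" or "WARNING" */
    char sqlstate[6];       /* five characters plus terminator */
    bool committed_field;   /* hypothetical new field */
    char text[96];
} Msg;

/* Invented value, for illustration only. */
#define HYPOTHETICAL_POST_COMMIT_SQLSTATE "XXC01"

static void fill(Msg *m, const char *sev, const char *state,
                 bool field, const char *text)
{
    snprintf(m->severity, sizeof(m->severity), "%s", sev);
    snprintf(m->sqlstate, sizeof(m->sqlstate), "%s", state);
    m->committed_field = field;
    snprintf(m->text, sizeof(m->text), "%s", text);
}

/* Writes one or two messages into out[] and returns how many were written. */
int report_post_commit_error(MarkStyle style, const char *sqlstate,
                             const char *text, Msg out[2])
{
    assert(sqlstate != NULL && text != NULL);

    switch (style)
    {
        case MARK_EXTRA_FIELD:
            fill(&out[0], "ERROR", sqlstate, true, text);
            return 1;
        case MARK_SEPARATE_WARNING:
            /* The client must tie this WARNING to the ERROR that follows. */
            fill(&out[0], "WARNING", "01000", false,
                 "transaction was already committed");
            fill(&out[1], "ERROR", sqlstate, false, text);
            return 2;
        case MARK_SQLSTATE_OVERRIDE:
            /* The original SQLSTATE is lost to the client in this style. */
            fill(&out[0], "ERROR", HYPOTHETICAL_POST_COMMIT_SQLSTATE,
                 false, text);
            return 1;
    }
    return 0;
}
```

One visible trade-off in this model: the override style discards the original SQLSTATE, while the extra-field style keeps it but depends on the driver exposing the new field.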

Given infrastructure like this, it would be reasonable for
SyncRepWaitForLSN to just throw an ERROR if it gets an interrupt,
instead of trying to kluge its own solution.

regards, tom lane
