Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Mon, 2010-10-18 at 13:26 -0500, Kevin Grittner wrote:
>>> 3. Limited shared memory space to hold information about
>>> committed transactions that are still "interesting".
>>> It's a challenging problem, however, and the current solution is
>>> less than ideal.
>>
>> I'd go further than that and say that it clearly needs to be
>> fixed.
>
> OK, this will remain an open issue then.
This seems to me to be by far the biggest problem with the patch.
I've been working through various ideas, and I think I see the light
at the end of the tunnel. I'm posting the ideas for a reality
check. I'm hoping for comments.
It seems clear to me that we need to attack this from two
directions:
(1) Mitigation, by more aggressively identifying transactions which
are no longer interesting so they can be cleaned up.
(2) Graceful degradation, by somehow summarizing information as we
approach the hard limit, so that we incrementally increase the
probability of false positives rather than resorting to either of
the two "simple" solutions (refusing new serializable transactions
or canceling the oldest running serializable transactions).
(Suggestions for other ways to attack this are welcome.)
It seemed to me likely that investigating (1) would help to clarify
how to do (2), so here's what I've got in the mitigation department:
(1A) A committed transaction TX which hasn't written data can be
cleaned up once no active non-read-only transaction overlaps both TX
and a committed transaction which wrote data and committed too soon
to overlap TX.
(1B) Predicate locks and information about rw-conflicts *in* for a
committed transaction can be released when there are no overlapping
non-read-only transactions active. Except for transactions
described in (1A), the fact that the transaction was a serializable
transaction with a rw-conflict *out* is significant, but nothing
else about the transaction matters, and we don't need to know
details of the conflict or maintain a reverse pointer.
(1C) A committing transaction which has written data can clear its
rw-conflict *out* pointer if it points to an active transaction which
does not overlap a committed transaction which wrote data.
(Obviously, the corresponding pointer in the other direction can
also be cleared.)
(1D) A committed transaction with no rw-conflict out cannot become a
pivot in a dangerous structure, because the transaction on the "out"
side of a pivot must commit first for there to be a problem.
(Obviously, two committed transactions cannot develop a new
conflict.) Since a read-only transaction can only participate in a
dangerous structure through a conflict with a pivot, transactions in
this state can be ignored when a read-only transaction is checking
for conflicts.
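
As a reality check on (1B) and (1D), here is a minimal sketch of what
would be kept for a committed transaction after cleanup. It is Python
purely for brevity; the real code would be C in predicate.c. Every
name here (CommittedSxact, release_detail, matters_to_read_only) is
hypothetical and not taken from the patch, and the special case in
(1A) is deliberately not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class CommittedSxact:
    """Hypothetical record for a committed serializable transaction."""
    xid: int
    wrote_data: bool
    has_conflict_out: bool                               # the one fact (1B) says we must keep
    predicate_locks: set = field(default_factory=set)    # lock targets
    conflicts_in: set = field(default_factory=set)       # xids of readers


def release_detail(sxact, overlapping_rw_active):
    """(1B): once no overlapping non-read-only transaction is active,
    drop the predicate locks and the rw-conflict *in* detail.
    Only the has_conflict_out flag survives."""
    if overlapping_rw_active == 0:
        sxact.predicate_locks.clear()
        sxact.conflicts_in.clear()
        return True
    return False


def matters_to_read_only(sxact):
    """(1D): a committed transaction with no rw-conflict out cannot
    become a pivot, so a read-only transaction can ignore it."""
    return sxact.has_conflict_out
```

The point of the sketch is only that, after (1B) applies, a single
boolean per committed transaction is enough to answer the question a
read-only transaction needs to ask under (1D).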
That's all I've been able to come up with so far. Let's see how
that plays with the SSI worst case -- a long-running transaction
concurrent with many faster transactions.
(2A) If the long-running transaction is read-only, it looks pretty
good. We can clear concurrent transactions on a reasonable schedule
and just maintain a list of committed serializable transactions with
rw-conflicts out which wrote data and have xids above the
serializable global xmin. We can probably make room for such a list
of xids somehow -- I could even see potentially using the SLRU
mechanism for this without killing performance -- the long-running
read-only transaction would just need to look up a particular xid in
the list whenever it read a non-visible tuple. If the xid is in the
list, the long-running transaction must roll back with a
serialization failure. Normally the list should be short enough to
stay in RAM.
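
To make the lookup in (2A) concrete, here is a rough sketch, again in
Python only for brevity. The class and method names are hypothetical,
a plain in-memory set stands in for whatever shared memory or SLRU
structure would really hold the list, xid wraparound is ignored, and
treating "above the serializable global xmin" as >= is an assumption.

```python
class SerializationFailure(Exception):
    """Raised when the reader must roll back (hypothetical name)."""


class CommittedConflictOutList:
    """Xids of committed serializable transactions which wrote data
    and had a rw-conflict out. Illustrative only."""

    def __init__(self):
        self._xids = set()

    def record_commit(self, xid, wrote_data, had_conflict_out):
        # Only writers with a rw-conflict out need to be remembered.
        if wrote_data and had_conflict_out:
            self._xids.add(xid)

    def truncate_below(self, global_xmin):
        # Entries older than the serializable global xmin are no
        # longer interesting. Assumes xids compare as plain integers.
        self._xids = {x for x in self._xids if x >= global_xmin}

    def check_nonvisible_tuple(self, writer_xid):
        # Called by the long-running read-only transaction whenever it
        # reads a tuple it cannot see; writer_xid wrote that tuple.
        if writer_xid in self._xids:
            raise SerializationFailure(
                f"read of tuple written by xid {writer_xid}, "
                "which committed with a rw-conflict out")
```

A reader would call check_nonvisible_tuple() for each non-visible
tuple it encounters and treat the exception as a serialization
failure.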
(2B) It gets more complicated if the long-running transaction can
also write. This is both because the predicate lock information
must be maintained and associated with something, and because the
long-running transaction can become a pivot with an old short-lived
transaction on the rw-conflict *in* side. The rw-conflicts *out*
can be handled the same as for a long-running read-only transaction
(described above).
I think that, with the mitigation above in place, the performance
impact will be minimal if we apply the technique Heikki described
here to the oldest committed transactions when the list becomes full:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg01477.php
I don't think this change will affect predicate.h or any of the code
which uses its API. The only thing which uses the internal
structures outside of predicate.c is the code to show the SIReadLock
entries in the pg_locks view. I'm not sure how we should show locks
for which we no longer have an associated virtual transaction ID.
Does anyone have thoughts on that? Just leave virtualtransaction
NULL, and leave the rest as-is?
-Kevin