From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | lazy snapshots? |
Date: | 2010-10-21 01:34:16 |
Message-ID: | AANLkTi=0PiDTLMu6pWaPMb-4L3NbOm_Y6mppib_vp4ra@mail.gmail.com |
Lists: | pgsql-hackers |
I had the following idea for an optimization. Feel free to tell me I'm nuts.
Would it be possible to postpone the operation of taking a snapshot
until we encounter an in-doubt tuple - that is, a tuple whose XMIN or
XMAX is committed but not all-visible? It seems to me that there are
many transactions that probably never look at any recently-modified
data, and that the overhead (and contention) of scanning the ProcArray
could be avoided for such transactions. At the time when we currently
take a snapshot, we could instead record an estimate of the oldest XID
still running; I'll call this value the threshold XID. Ideally, this
would be something we could read from shared memory in O(1) time.
Subsequently, when we examine XMIN or XMAX, we may find that it's
aborted (in which case we don't need a snapshot to decide what to do)
or that the XID we're examining precedes the threshold XID (in which
case we don't need a snapshot to decide what to do) or that the XID
we're examining is our own (in which case we again don't need a
snapshot to decide what to do). If none of those conditions hold, we
take a snapshot. (Possibly, we could try rereading the threshold XID
from shared memory, because it might have advanced far enough to get
us out of the woods.)
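
A minimal sketch of the checks described above, written as a toy Python model rather than PostgreSQL code. The class `LazySnapshotReader`, the `take_snapshot` callback, and the `(xmax, running)` snapshot shape are all hypothetical names chosen for illustration. XIDs are compared as plain integers (wraparound is ignored), and our own XID is simply treated as visible (command IDs are ignored).

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional, Tuple

ABORTED, COMMITTED, IN_PROGRESS = "aborted", "committed", "in_progress"

# Hypothetical snapshot shape: (xmax, running). XIDs >= xmax, or listed in
# `running`, are treated as still in progress as of the snapshot.
Snapshot = Tuple[int, FrozenSet[int]]


@dataclass
class LazySnapshotReader:
    my_xid: int
    threshold_xid: int                       # estimate of the oldest running XID
    take_snapshot: Callable[[], Snapshot]    # stands in for the ProcArray scan
    snapshot: Optional[Snapshot] = None      # stays None until we are forced
    snapshots_taken: int = 0

    def xid_visible(self, xid: int, status: str) -> bool:
        """Return True if the effects of `xid` should be treated as visible."""
        if status == ABORTED:
            return False                     # aborted: no snapshot needed
        if xid == self.my_xid:
            return True                      # our own XID: no snapshot needed
        if xid < self.threshold_xid:
            # Finished before anything that is still running: no snapshot needed.
            return status == COMMITTED
        # In doubt: only now pay for the snapshot, and reuse it afterwards.
        if self.snapshot is None:
            self.snapshot = self.take_snapshot()
            self.snapshots_taken += 1
        xmax, running = self.snapshot
        return status == COMMITTED and xid < xmax and xid not in running


reader = LazySnapshotReader(
    my_xid=120,
    threshold_xid=100,
    take_snapshot=lambda: (125, frozenset({100, 110})),
)
reader.xid_visible(90, COMMITTED)    # old tuple: decided without a snapshot
reader.xid_visible(105, COMMITTED)   # in doubt: this call forces the snapshot
```

In this model a transaction that only ever touches tuples whose XIDs are aborted, its own, or older than the threshold never calls `take_snapshot` at all, which is the case the proposal is trying to make cheap.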
It's necessary to convince ourselves not only that this has some
performance benefit but that it's actually correct. It's easy to see
that, if we never take a snapshot, all the tuple visibility decisions
we make will be exactly identical to the ones that we would have made
with a snapshot; the choice of snapshot in that case is arbitrary.
But if we do eventually take a snapshot, we'll likely make different
tuple visibility decisions than we would have made had we taken the
snapshot earlier. However, the decisions that we make prior to taking
the snapshot will be consistent with the snapshot, and we will
certainly see the effects of all transactions that committed before we
started. We may also see the effects of some transactions that commit
after we started, but that is OK: it is just as if our whole
transaction had been started slightly later and then executed more
quickly thereafter. It would be bad if we saw the effect of
transaction A but not transaction B where transaction B committed
before transaction A, but the way snapshots are taken prevents that
regardless of exactly when we do it.
VACUUM can't remove any tuples with committed XMINs unless their XMAX
precedes our threshold XID, but I think that's not any worse under
this proposal than it is anyway. If we took a full snapshot instead
of just writing down a threshold XID, we'd have the same problem.
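
As a rough illustration of the VACUUM condition above, here is a toy rule for a tuple whose XMIN has committed. The function name and its parameters are hypothetical, XIDs are compared as plain integers (no wraparound handling), and each backend is assumed to advertise either its threshold XID or the equivalent horizon from a full snapshot.

```python
from typing import Iterable, Optional


def vacuum_can_remove(xmax: Optional[int],
                      backend_threshold_xids: Iterable[int]) -> bool:
    """Toy removal rule for a tuple with a committed XMIN.

    `xmax` is the XID of a committed deleter, or None if the tuple is live.
    """
    if xmax is None:
        return False                 # never deleted, so nothing to remove
    horizons = list(backend_threshold_xids)
    if not horizons:
        return True                  # no backend could still need the old version
    # Removable only if the deleter precedes every backend's threshold XID.
    return xmax < min(horizons)
```

The rule is the same whether a backend holds a full snapshot or only a threshold XID, which is why the proposal should not make VACUUM's job any harder.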
OK, that's it. Comments?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company