From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jonathan Katz <jkatz(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: Improving connection scalability: GetSnapshotData() |
Date: | 2020-04-08 13:24:13 |
Message-ID: | CA+TgmoaC9719CJH2RTAZC9xkebxmbf+zYJo9VgV4GJBwqA5xiw@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Apr 7, 2020 at 4:27 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> The main reason is that we want to be able to cheaply check the current
> state of the variables (mostly when checking a backend's own state). We
> can't access the "dense" ones without holding a lock, but we e.g. don't
> want to make ProcArrayEndTransactionInternal() take a lock just to check
> if vacuumFlags is set.
>
> It turns out to also be good for performance to have the copy for
> another reason: The "dense" arrays share cachelines with other
> backends. That's worth it because it lets GetSnapshotData(), by far
> the most frequent operation, touch fewer cache lines. But it also
> means that it's more likely that a backend's "dense" array entry isn't
> in a local cpu cache (it'll be pulled out of there when modified in
> another backend). In many cases we don't need the shared entry at commit
> time and the like, though; we just need to check whether it is set, and
> most of the time it won't be. The local entry makes that check cheap.
>
> Basically it makes sense to access the PGPROC variable when checking a
> single backend's data, especially when we have to look at the PGPROC for
> other reasons already. It makes sense to look at the "dense" arrays if
> we need to look at many / most entries, because we then benefit from the
> reduced indirection and better cross-process cacheability.
That's a good explanation. I think it should be in the comments or a
README somewhere.
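
To make the trade-off described above concrete, here is a minimal single-process sketch. It is not the PostgreSQL code: DemoProc, demo_dense_flags and the demo_* functions are hypothetical names, and the locking a real implementation would need around the shared array is left out entirely. It only shows the shape of the idea: a writer updates both the packed array and the per-backend copy, a backend checks its own state via the copy, and a scan over all backends walks the packed array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_BACKENDS 8

/* Hypothetical "dense" array: one flag byte per backend, packed together
 * so that a scan over all backends touches few cache lines. */
static uint8_t demo_dense_flags[DEMO_MAX_BACKENDS];

/* Hypothetical per-backend struct holding a mirror of its dense entry. */
typedef struct DemoProc
{
    int     dense_index;    /* slot of this backend in demo_dense_flags */
    uint8_t flags_copy;     /* copy of demo_dense_flags[dense_index] */
} DemoProc;

/* Writer: keep both copies in sync. Real code would hold a lock here;
 * this sketch is single-threaded and omits it. */
static void
demo_set_flags(DemoProc *proc, uint8_t flags)
{
    assert(proc->dense_index >= 0 && proc->dense_index < DEMO_MAX_BACKENDS);
    proc->flags_copy = flags;
    demo_dense_flags[proc->dense_index] = flags;
}

/* Single-backend check: reads only the backend's own struct, so it does
 * not touch the cache line shared with other backends' entries. */
static bool
demo_own_flags_set(const DemoProc *proc)
{
    return proc->flags_copy != 0;
}

/* Many-backend scan: walks the packed array with no per-backend pointer
 * chasing. */
static int
demo_count_flagged(int nbackends)
{
    int count = 0;

    assert(nbackends >= 0 && nbackends <= DEMO_MAX_BACKENDS);
    for (int i = 0; i < nbackends; i++)
    {
        if (demo_dense_flags[i] != 0)
            count++;
    }
    return count;
}
```

In this sketch, a commit-time check would call demo_own_flags_set() and only fall through to the shared entry in the uncommon case where the flag is set.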
> How about:
> /*
> * If the current xactCompletionCount is still the same as it was at the
> * time the snapshot was built, we can be sure that rebuilding the
> * contents of the snapshot the hard way would result in the same snapshot
> * contents:
> *
> * As explained in transam/README, the set of xids considered running by
> * GetSnapshotData() cannot change while ProcArrayLock is held. Snapshot
> * contents only depend on transactions with xids and xactCompletionCount
> * is incremented whenever a transaction with an xid finishes (while
> * holding ProcArrayLock exclusively). Thus the xactCompletionCount check
> * ensures we would detect if the snapshot would have changed.
> *
> * As the snapshot contents are the same as they were before, it is safe
> * to re-enter the snapshot's xmin into the PGPROC array. None of the rows
> * visible under the snapshot could already have been removed (that'd
> * require the set of running transactions to change) and it fulfills the
> * requirement that concurrent GetSnapshotData() calls yield the same
> * xmin.
> */
That's nice and clear.
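
For readers who want to see the check from the proposed comment in code form, the following is a rough, single-threaded sketch. Everything here is an assumption made for illustration: DemoSnapshot, demo_completion_count, demo_advertised_xmin and the demo_* functions are hypothetical, the "hard way" is reduced to copying one value, and the ProcArrayLock handling that the argument depends on is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a snapshot: only the fields the check needs. */
typedef struct DemoSnapshot
{
    bool     built;             /* has this snapshot ever been built? */
    uint32_t xmin;              /* oldest xid considered running */
    uint64_t completion_count;  /* counter value when it was built */
} DemoSnapshot;

/* Hypothetical shared state; real code would protect it with a lock. */
static uint64_t demo_completion_count = 0;
static uint32_t demo_oldest_running_xid = 100;
static uint32_t demo_advertised_xmin = 0;

/* A transaction with an xid finishes: bump the counter so that snapshots
 * built earlier are recognised as possibly stale. */
static void
demo_finish_xact_with_xid(uint32_t new_oldest_running_xid)
{
    demo_oldest_running_xid = new_oldest_running_xid;
    demo_completion_count++;
}

/* Returns true if the existing snapshot contents were reused, false if
 * the snapshot had to be rebuilt. */
static bool
demo_get_snapshot(DemoSnapshot *snap)
{
    if (snap->built && snap->completion_count == demo_completion_count)
    {
        /* No xid-bearing transaction finished since the build, so a
         * rebuild would give the same contents; re-advertise xmin only. */
        demo_advertised_xmin = snap->xmin;
        return true;
    }

    /* "The hard way", reduced here to reading one value. */
    snap->xmin = demo_oldest_running_xid;
    snap->completion_count = demo_completion_count;
    snap->built = true;
    demo_advertised_xmin = snap->xmin;
    return false;
}
```

In this sketch, two consecutive demo_get_snapshot() calls with no intervening demo_finish_xact_with_xid() reuse the first result, while any completion in between forces a rebuild.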
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company