Re: New docs chapter on Transaction Management and related changes

From: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Robert Treat <rob(at)xzilla(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Erik Rijkers <er(at)xs4all(dot)nl>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New docs chapter on Transaction Management and related changes
Date: 2022-11-10 11:31:25
Message-ID: CANbhV-Ex7eYvCY8_s-vTc-T_Vc4EzYtoe_vq2tQ9ebsPeFy_-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 7 Nov 2022 at 22:04, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
>
> On Sat, 2022-11-05 at 10:08 +0000, Simon Riggs wrote:
> > Agreed; new compilation patch attached, including mine and then
> > Robert's suggested rewordings.
>
> Thanks. There is clearly a lot of usefule information in this.
>
> Some comments:
>
> > --- a/doc/src/sgml/func.sgml
> > +++ b/doc/src/sgml/func.sgml
> > @@ -24673,7 +24673,10 @@ SELECT collation for ('foo' COLLATE "de_DE");
> > <para>
> > Returns the current transaction's ID. It will assign a new one if the
> > current transaction does not have one already (because it has not
> > - performed any database updates).
> > + performed any database updates); see <xref
> > + linkend="transaction-id"/> for details. If executed in a
> > + subtransaction this will return the top-level xid; see <xref
> > + linkend="subxacts"/> for details.
> > </para></entry>
> > </row>
>
> I would use a comma after "subtransaction", and I think it would be better to write
> "transaction ID" instead of "xid".
>
> > @@ -24690,6 +24693,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
> > ID is assigned yet. (It's best to use this variant if the transaction
> > might otherwise be read-only, to avoid unnecessary consumption of an
> > XID.)
> > + If executed in a subtransaction this will return the top-level xid.
> > </para></entry>
> > </row>
>
> Same as above.
>
> > @@ -24733,6 +24737,8 @@ SELECT collation for ('foo' COLLATE "de_DE");
> > <para>
> > Returns a current <firstterm>snapshot</firstterm>, a data structure
> > showing which transaction IDs are now in-progress.
> > + Only top-level xids are included in the snapshot; subxids are not
> > + shown; see <xref linkend="subxacts"/> for details.
> > </para></entry>
> > </row>
>
> Again, I would avoid "xid" and "subxid", or at least use "transaction ID (xid)"
> and similar.
>
> > --- a/doc/src/sgml/ref/release_savepoint.sgml
> > +++ b/doc/src/sgml/ref/release_savepoint.sgml
> > @@ -34,23 +34,16 @@ RELEASE [ SAVEPOINT ] <replaceable>savepoint_name</replaceable>
> > <title>Description</title>
> >
> > <para>
> > - <command>RELEASE SAVEPOINT</command> destroys a savepoint previously defined
> > - in the current transaction.
> > + <command>RELEASE SAVEPOINT</command> will subcommit the subtransaction
> > + established by the named savepoint, if one exists. This will release
> > + any resources held by the subtransaction. If there were any
> > + subtransactions of the named savepoint, these will also be subcommitted.
> > </para>
> >
> > <para>
>
> "Subtransactions of the named savepoint" is somewhat confusing; how about
> "subtransactions of the subtransaction established by the named savepoint"?
>
> If that is too long and explicit, perhaps "subtransactions of that subtransaction".
>
> > @@ -78,7 +71,7 @@ RELEASE [ SAVEPOINT ] <replaceable>savepoint_name</replaceable>
> >
> > <para>
> > It is not possible to release a savepoint when the transaction is in
> > - an aborted state.
> > + an aborted state, to do that use <xref linkend="sql-rollback-to"/>.
> > </para>
> >
> > <para>
>
> I think the following is more English:
> "It is not possible ... state; to do that, use <xref .../>."
>
> > --- a/doc/src/sgml/ref/rollback.sgml
> > +++ b/doc/src/sgml/ref/rollback.sgml
> > @@ -56,11 +56,14 @@ ROLLBACK [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ]
> > <term><literal>AND CHAIN</literal></term>
> > <listitem>
> > <para>
> > - If <literal>AND CHAIN</literal> is specified, a new transaction is
> > + If <literal>AND CHAIN</literal> is specified, a new unaborted transaction is
> > immediately started with the same transaction characteristics (see <xref
> > linkend="sql-set-transaction"/>) as the just finished one. Otherwise,
> > no new transaction is started.
>
> I don't think that is an improvement. "Unaborted" is an un-word. A new transaction
> is always "unaborted", isn't it?
>
> > --- a/doc/src/sgml/wal.sgml
> > +++ b/doc/src/sgml/wal.sgml
> > @@ -909,4 +910,36 @@
> > seem to be a problem in practice.
> > </para>
> > </sect1>
> > +
> > + <sect1 id="two-phase">
> > +
> > + <title>Two-Phase Transactions</title>
> > +
> > + <para>
> > + <productname>PostgreSQL</productname> supports a two-phase commit (2PC)
> [...]
> > + <filename>pg_twophase</filename> directory. Currently-prepared
> > + transactions can be inspected using <link
> > + linkend="view-pg-prepared-xacts"><structname>pg_prepared_xacts</structname></link>.
> > + </para>
> > + </sect1>
> > +
> > </chapter>
>
> I don't like "currently-prepared". How about:
> "Transaction that are currently prepared can be inspected..."
>
> This is clearly interesting information, but I don't think the WAL chapter is the right
> place for this. "pg_twophase" is already mentioned in "storage.sgml", and details about
> when exactly a prepared transaction is persisted may exceed the details level needed by
> the end user.
>
> I'd look for that information in the reference page for PREPARE TRANSACTION; perhaps
> that would be a better place. Or, even better, the new "xact.sgml" chapter.
>
> > --- /dev/null
> > +++ b/doc/src/sgml/xact.sgml
>
> + <title>Transaction Management</title>
>
> + The word transaction is often abbreviated as "xact".
>
> Should use <quote> here.
>
> > + <title>Transactions and Identifiers</title>
>
> > + <para>
> > + Once a transaction writes to the database, it is assigned a
> > + non-virtual <literal>TransactionId</literal> (or <type>xid</type>),
> > + e.g., <literal>278394</literal>. Xids are assigned sequentially
> > + using a global counter used by all databases within the
> > + <productname>PostgreSQL</productname> cluster. This property is used by
> > + the transaction system to order transactions by their first database
> > + write, i.e., lower-numbered xids started writing before higher-numbered
> > + xids. Of course, transactions might start in a different order.
> > + </para>
>
> "This property"? How about:
> "Because transaction IDs are assigned sequentially, the transaction system can
> use them to order transactions by their first database write"
>
> I would want some additional information here: why does the transaction system have
> to order transactions by their first database write?
>
> "Of course, transactions might start in a different order."
>
> Now that confuses me. Are you saying that BEGIN could be in a different order
> than the first database write? Perhaps like this:
>
> "Note that the order in which transactions perform their first database write
> might be different from the order in which the transactions started."
>
> > + The internal transaction ID type <type>xid</type> is 32-bits wide
>
> There should be no hyphen in "32 bits wide", just as in "3 years old".
>
> > + A 32-bit epoch is incremented during each
> > + wrap around.
>
> We usually call this "wraparound" without a space.
>
> > + There is also a 64-bit type <type>xid8</type> which
> > + includes this epoch and therefore does not wrap around during the
> > + life of an installation and can be converted to xid by casting.
>
> Running "and"s. Better:
>
> "There is also ... and does not wrap ... life of an installation.
> <type>xid8</type> can be converted to <type>xid</type> by casting."
>
> > + Xids are used as the
> > + basis for <productname>PostgreSQL</productname>'s <link
> > + linkend="mvcc">MVCC</link> concurrency mechanism, <link
> > + linkend="hot-standby">Hot Standby</link>, and Read Replica servers.
>
> What is the difference between a hot standby and a read replica? I think
> one of these terms is sufficient.
>
> > + In addition to <literal>vxid</literal> and <type>xid</type>,
> > + when a transaction is prepared for two-phase commit it
> > + is also identified by a Global Transaction Identifier
> > + (<acronym>GID</acronym>).
>
> Better:
>
> "In addition to <literal>vxid</literal> and <type>xid</type>,
> prepared transactions also have a Global Transaction Identifier
> (<acronym>GID</acronym>) that is assigned when the transaction is
> prepared for two-phase commit."
>
> > + <sect1 id="xact-locking">
> > +
> > + <title>Transactions and Locking</title>
> > +
> > + <para>
> > + Currently-executing transactions are shown in <link
> > + linkend="view-pg-locks"><structname>pg_locks</structname></link>
> > + in columns <structfield>virtualxid</structfield> and
> > + <structfield>transactionid</structfield>.
>
> Better:
>
> "The transaction IDs of currently executing transactions are shown in <link
> linkend="view-pg-locks"><structname>pg_locks</structname></link>
> in the columns <structfield>virtualxid</structfield> and
> <structfield>transactionid</structfield>."
>
> > + Lock waits on table-level locks are shown waiting for
> > + <structfield>virtualxid</structfield>, while lock waits on row-level
> > + locks are shown waiting for <structfield>transactionid</structfield>.
>
> That's not true. Transactions waiting for table-level locks are shown
> waiting for a "relation" lock in both "pg_stat_activity" and "pg_locks".

Agreed.

Lock waits on table-locks are shown waiting for a lock type of
<literal>relation</literal>,
while lock waits on row-level locks are shown waiting for a lock type
of <literal>transactionid</literal>.
Table-level locks require only a virtualxid when the lock is less than an
AccessExclusiveLock; in other cases an xid must be allocated.

--
Simon Riggs http://www.EnterpriseDB.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2022-11-10 11:56:25 Re: Avoid overhead open-close indexes (catalog updates)
Previous Message Alvaro Herrera 2022-11-10 11:17:57 Re: New docs chapter on Transaction Management and related changes