Re: High rate of transaction failure with the Serializable Isolation Level

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Reza Taheri <rtaheri(at)vmware(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: High rate of transaction failure with the Serializable Isolation Level
Date: 2014-07-24 14:02:59
Message-ID: 1406210579.67466.YahooMailNeo@web122301.mail.ne1.yahoo.com
Lists: pgsql-performance

Reza Taheri <rtaheri(at)vmware(dot)com> wrote:

> I am running into very high failure rates when I run with the
> Serializable Isolation Level. I have simplified our configuration
> to a single database with a constant workload, a TPC-E workload
> if you will, to focus on this problem. We are running with
> PGSQL 9.2.4

I don't remember any bug fixes that would be directly related to
what you describe in the last 15 months, but it might be better to
do any testing with fixes for known bugs:

http://www.postgresql.org/support/versioning/

> When we raise the Trade-Result transaction to
> SQL_TXN_SERIALIZABLE, we face a storm of conflicts. Out of
> 37,342 Trade-Result transactions, 15,707 hit an error, and have
> to be rolled back and retried one or more times. The total
> failure count (due to many transactions failing more than once)
> is 31,388.
>
> What is unusual is that the majority of the failures occur in a
> statement that should not have any isolation conflicts.

As already pointed out by Craig, statements don't have
serialization failures; transactions do.  In some cases a
transaction may become "doomed to fail" by the action of a
concurrent transaction, but the actual failure cannot occur until
the next statement is run on the connection with the doomed
transaction; it may have nothing to do with the statement itself.
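
Because the failure can surface on any statement, including COMMIT,
the usual approach is to retry the whole transaction rather than a
single statement. Here is a minimal sketch of such a retry loop. It
assumes your driver exposes the SQLSTATE of the error; the names
TransactionError, run_with_retry and make_flaky_transaction are
hypothetical and only simulate a driver so the example runs on its
own. The only PostgreSQL-specific fact used is that serialization
failures are reported with SQLSTATE 40001.

```python
import random
import time

SERIALIZATION_FAILURE = "40001"  # SQLSTATE PostgreSQL uses for serialization failures


class TransactionError(Exception):
    """Hypothetical stand-in for a driver exception that carries an SQLSTATE."""

    def __init__(self, sqlstate):
        super().__init__("transaction failed with SQLSTATE " + sqlstate)
        self.sqlstate = sqlstate


def run_with_retry(run_transaction, max_attempts=5, base_delay=0.0):
    """Call run_transaction() until it succeeds or max_attempts is reached.

    run_transaction must execute the whole transaction from BEGIN to COMMIT,
    because the failure can surface on any statement, including COMMIT.
    Returns (result, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_transaction(), attempt
        except TransactionError as exc:
            # Retry only serialization failures; re-raise anything else,
            # and give up once the attempt budget is spent.
            if exc.sqlstate != SERIALIZATION_FAILURE or attempt == max_attempts:
                raise
            # Randomized backoff so colliding transactions do not retry in lockstep.
            time.sleep(random.uniform(0, base_delay * attempt))


def make_flaky_transaction(failures_before_success):
    """Build a fake transaction that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def transaction():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransactionError(SERIALIZATION_FAILURE)
        return "committed"

    return transaction


result, attempts = run_with_retry(make_flaky_transaction(2))
print(result, attempts)  # committed 3
```

In a real application run_transaction would open the transaction,
run its statements and commit, and the except clause would catch
whatever exception type your driver raises.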

If you want to understand the theory of how SERIALIZABLE
transactions are implemented in PostgreSQL, these links may help:

http://vldb.org/pvldb/vol5/p1850_danrkports_vldb2012.pdf

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/storage/lmgr/README-SSI;hb=master

http://wiki.postgresql.org/wiki/Serializable

For a more practical set of examples about the differences in
using REPEATABLE READ and SERIALIZABLE transaction isolation levels
in PostgreSQL, see:

http://wiki.postgresql.org/wiki/SSI

If you are just interested in reducing the number of serialization
failures, see the suggestions near the end of this section of the
documentation:

http://www.postgresql.org/docs/9.2/interactive/transaction-iso.html#XACT-SERIALIZABLE

Any of these items (or perhaps a combination of them) may
ameliorate the problem.  Note that I have seen reports of cases
where max_pred_locks_per_transaction needed to be set to 20x the
default to reduce serialization failures to an acceptable level.
The default is intentionally set very low because so many people do
not use this isolation level, and this setting reserves shared
memory for purposes of tracking serializable transactions; the
space is wasted for those who don't choose to use them.
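
To make the 20x figure concrete, here is a small sketch that works
out the corresponding postgresql.conf line. It relies on the shipped
default of 64 for max_pred_locks_per_transaction; the helper name
pred_locks_setting is made up for illustration, and 20 is only the
multiplier from the reports mentioned above, not a recommendation
for your workload.

```python
DEFAULT_MAX_PRED_LOCKS_PER_TRANSACTION = 64  # PostgreSQL's shipped default


def pred_locks_setting(multiplier):
    """Return a postgresql.conf line scaling the default by the given multiplier."""
    value = DEFAULT_MAX_PRED_LOCKS_PER_TRANSACTION * multiplier
    return "max_pred_locks_per_transaction = {}".format(value)


print(pred_locks_setting(20))  # max_pred_locks_per_transaction = 1280
```

This parameter can only be set at server start, so a changed value
takes effect after a restart.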

A lot of work could still be done to reduce the rate of false
positives; it has largely gone undone so far because few people
have reported problems that could not be solved through tuning.
If you have such a case, it would be interesting to have all
relevant details, so that we can determine which of the many
possible enhancements are relevant to your case.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
