[OSSTEST PATCH 1/1] PostgreSQL db: Retry transactions on constraint failures

From: Ian Jackson <ian(dot)jackson(at)eu(dot)citrix(dot)com>
To: <xen-devel(at)lists(dot)xenproject(dot)org>
Cc: <pgsql-hackers(at)postgresql(dot)org>, Ian Jackson <ian(dot)jackson(at)eu(dot)citrix(dot)com>, Ian Jackson <Ian(dot)Jackson(at)eu(dot)citrix(dot)com>
Subject: [OSSTEST PATCH 1/1] PostgreSQL db: Retry transactions on constraint failures
Date: 2016-12-09 18:26:31
Message-ID: 1481307991-16971-2-git-send-email-ian.jackson@eu.citrix.com
Lists: pgsql-hackers

This is unfortunate but appears to be necessary.

Signed-off-by: Ian Jackson <Ian(dot)Jackson(at)eu(dot)citrix(dot)com>
CC: pgsql-hackers(at)postgresql(dot)org
---
Osstest/JobDB/Executive.pm | 45 ++++++++++++++++++++++++++++++++++++++++++++-
tcl/JobDB-Executive.tcl | 6 ++++--
2 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm
index 610549a..dc6d3c2 100644
--- a/Osstest/JobDB/Executive.pm
+++ b/Osstest/JobDB/Executive.pm
@@ -62,8 +62,51 @@ sub need_retry ($$$) {
my ($jd, $dbh,$committing) = @_;
return
($dbh_tests->err() // 0)==7 &&
- ($dbh_tests->state =~ m/^(?:40P01|40001)/);
+ ($dbh_tests->state =~ m/^(?:40P01|40001|23|40002)/);
# DEADLOCK DETECTED or SERIALIZATION FAILURE
+ # or any Integrity Constraint Violation including
+ # TRANSACTION_INTEGRITY_CONSTRAINT_VIOLATION.
+ #
+ # An Integrity Constraint Violation ought not to occur with
+ # serialisable transactions, so it is always a bug. These bugs
+ # should not be retried. However, there is a longstanding bug in
+ # PostgreSQL: SERIALIZABLE's guarantee of transaction
+ # serialisability only applies to successful transactions.
+ # Concurrent SERIALIZABLE transactions may generate "impossible"
+ # errors. For example, doing a SELECT to ensure that a row does
+ # not exist, and then inserting it, may produce a unique
+ # constraint violation.
+ #
+ # I have not been able to find out clearly which error codes may
+ # be spuriously generated. At the very least "23505
+ # UNIQUE_VIOLATION" is, but I'm not sure about others. I am
+ # making the (hopefully not unwarranted) assumption that this is
+ # the only class of spurious errors. (We don't have triggers.)
+ #
+ # The undesirable side effect is that a buggy transaction would be
+ # retried at intervals until the retry count is reached. But
+ # there seems no way to avoid this.
+ #
+ # This bug may have been fixed in very recent PostgreSQL (although
+ # a better promise still seems absent from the documentation, at
+ # the time of writing in December 2016). But we need to work with
+ # PostgreSQL back to at least 9.1. Perhaps in the future we can
+ # make this behaviour conditional on the pgsql bug being fixed.
+ #
+ # References:
+ #
+ # "WIP: Detecting SSI conflicts before reporting constraint violations"
+ # January 2016 - April 2016 on pgsql-hackers
+ # https://www.postgresql.org/message-id/flat/CAEepm%3D2_9PxSqnjp%3D8uo1XthkDVyOU9SO3%2BOLAgo6LASpAd5Bw%40mail.gmail.com
+ # (includes patch for PostgreSQL and its documentation)
+ #
+ # BUG #9301: INSERT WHERE NOT EXISTS on table with UNIQUE constraint in concurrent SERIALIZABLE transactions
+ # 2014, pgsql-bugs
+ # https://www.postgresql.org/message-id/flat/3F697CF1-2BB7-40D4-9D20-919D1A5D6D93%40apple.com
+ #
+ # "Working around spurious unique constraint errors due to SERIALIZABLE bug"
+ # 2009, pgsql-general
+ # https://www.postgresql.org/message-id/flat/D960CB61B694CF459DCFB4B0128514C203937E44%40exadv11.host.magwien.gv.at
}

sub current_flight ($) { #method
diff --git a/tcl/JobDB-Executive.tcl b/tcl/JobDB-Executive.tcl
index 62c63af..6b9bcb0 100644
--- a/tcl/JobDB-Executive.tcl
+++ b/tcl/JobDB-Executive.tcl
@@ -365,8 +365,10 @@ proc transaction {tables script {autoreconnect 0}} {
if {$rc} {
switch -glob $errorCode {
{OSSTEST-PSQL * 40P01} -
- {OSSTEST-PSQL * 40001} {
- # DEADLOCK DETECTED or SERIALIZATION FAILURE
+ {OSSTEST-PSQL * 40001} -
+ {OSSTEST-PSQL * 23*} -
+ {OSSTEST-PSQL * 40002} {
+ # See Osstest/JobDB/Executive.pm:need_retry
logputs stdout \
"transaction serialisation failure ($errorCode) ($result) retrying ..."
if {$dbopen} { db-execute ROLLBACK }
--
2.1.4
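
Editorial illustration (not part of the patch): the comment in need_retry describes two things that are easier to see in isolation, namely which SQLSTATE values the widened pattern accepts, and the side effect that a genuinely buggy transaction is retried until the retry count runs out. The sketch below is a standalone approximation under stated assumptions. The names sqlstate_is_retryable and run_with_retry are hypothetical and do not exist in osstest, the failure is simulated by dying with a SQLSTATE string rather than read from DBI's $dbh->state, and the real code also checks $dbh->err and sleeps between attempts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in osstest): apply the same pattern that the
# patched need_retry uses to a SQLSTATE string.
#   40P01 = deadlock_detected
#   40001 = serialization_failure
#   23... = any class 23 code (integrity constraint violation),
#           for example 23505 unique_violation
#   40002 = transaction_integrity_constraint_violation
sub sqlstate_is_retryable {
    my ($state) = @_;
    return 0 unless defined $state;
    return $state =~ m/^(?:40P01|40001|23|40002)/ ? 1 : 0;
}

# Hypothetical bounded retry wrapper.  $body is a code ref that dies with
# a SQLSTATE string on failure.  Returns (result, attempts used).
sub run_with_retry {
    my ($body, $max_attempts) = @_;
    for my $attempt (1 .. $max_attempts) {
        my $result;
        my $ok = eval { $result = $body->($attempt); 1 };
        return ($result, $attempt) if $ok;

        (my $state = $@) =~ s/\s+\z//;
        # Non-retryable errors propagate at once; retryable ones
        # propagate only when the attempt budget is exhausted.
        die "giving up after $attempt attempt(s): $state\n"
            unless sqlstate_is_retryable($state) && $attempt < $max_attempts;
    }
    return;
}

# Simulate two spurious unique violations followed by success.
my $calls = 0;
my ($value, $attempts) = run_with_retry(sub {
    $calls++;
    die "23505\n" if $calls < 3;
    return "committed";
}, 5);
print "$value after $attempts attempts\n";

# Simulate a genuine bug: the same violation on every attempt.
my $buggy_calls = 0;
my $gave_up = !eval {
    run_with_retry(sub { $buggy_calls++; die "23505\n" }, 4);
    1;
};
print "buggy transaction ran $buggy_calls times before giving up\n";
```

The second call shows the cost the commit message calls unfortunate: because a real constraint bug and a spurious one carry the same SQLSTATE, the wrapper cannot tell them apart and spends the whole attempt budget before reporting the error.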
