Re: Re: BUG #9301: INSERT WHERE NOT EXISTS on table with UNIQUE constraint in concurrent SERIALIZABLE transactions

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: David Johnston <polobo(at)yahoo(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Re: BUG #9301: INSERT WHERE NOT EXISTS on table with UNIQUE constraint in concurrent SERIALIZABLE transactions
Date: 2014-03-14 19:43:14
Message-ID: 1394826194.18022.YahooMailNeo@web122305.mail.ne1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

David Johnston <polobo(at)yahoo(dot)com> wrote:
> Tom Lane-2 wrote
>> mscott@ writes:

>>> -- The following seems to violate the fundamental guarantee of
>>> ISOLATION LEVEL SERIALIZABLE in that the two transactions below
>>> do not behave the same as if they were run serially.
>>
>> I must be missing something ... what about the duplicate-key
>> error is not what you would expect if the two transactions had
>> been run serially?
>>
>> BTW, the fundamental guarantee is not what you said above.  It
>> is that you will *either* get the same results as if the
>> transactions had been run in some unspecified serial order, *or*
>> you will get a serialization error.  There is certainly no
>> guarantee about which of these cases ensues.

> My reading is that if run serially the second function call
> results in a select returning zero records (exists returns true,
> then not-ed to a constant false in the where clause) and thus no
> insert is attempted and thus no duplicate key error - the record
> from the first transaction remains as-is and the second is
> effectively a no-op.
>
> So the fact the serializable versions fails is correct but it
> fails "incorrectly" since a serial execution could never fail
> with a duplicate key error.  Thus neither a possible unspecified
> serial order result (dup being an impossible one) nor a
> serialization error occurred which is contrary to the guarantee
> that you state.
>
> Now, the fact is the serialization case correctly fails but from
> an API perspective immediate retry makes no sense for duplicate
> key failure while it does for serialization failure.

This issue has been noted in the literature since at least 2007[1].
In the static code analysis of academic and financial systems at
Indian Institute of Technology Bombay which had been relying on
snapshot isolation for transactional integrity they initially
flagged many possible sources of data corruption from serialization
failure which turned out not to actually be possible due to primary
keys, unique indexes, or foreign keys.  After considering such
issues they were left with only two real risks in each of the
systems, and were only able to find actual data corruption of one
of those in each system.

If there were a cheap and reliable way to turn violations of these
constraints into serialization failures where appropriate, I agree
that it would be better to do so.  I have not been able to see any
way to do that, but am all ears if anyone else has an idea.  (Note
the constraints of cheap and reliable; we could allow the SSI
mechanism to generate SIReadLocks during constraint enforcement,
but it's not clear that it would always find a serialization error
before the constraint caused an error, and the overhead might be
significant.)

The actual guarantee provided is more like: "The behavior of any
set of successfully committed concurrent serializable transactions
will be consistent with some unspecified serial order of execution
of those transactions."  Constraints may terminate a transaction
before the serializable snaphsot isolation mechanism is able to
recognize a problem and cancel a transaction with a different
SQLSTATE, but the guarantee should prevent any serialization
anomalies from appearing in the database.  It does present a
challenge in terms of knowing whether a constraint validation like
"duplicate key" indicates a situation which can be automatically
re-tried.  If that is of paramount importance one could implement
the logic for preventing duplicate keys or enforcing foreign keys
in triggers using serializable transactions, but that would come
with a heavy price in both maintenance effort and performance.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] http://www.vldb.org/conf/2007/papers/industrial/p1263-jorwekar.pdf
Automating the Detection of Snapshot Isolation Anomalies
by Sudhir Jorwekar, Krithi Ramamritham, Alan Fekete, S. Sudarshan
VLDB ‘07, September 23-28, 2007, Vienna, Austria.

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message James L. Morton 2014-03-14 20:54:01 Re: BUG #9551: Hang in State "authentication" Prevents Vacuum from Freeing Dead Rows
Previous Message Alvaro Herrera 2014-03-14 15:33:04 Re: BUG #9555: pg_dump for tables with inheritance recreates the table with the wrong order of columns