Attached is the latest Serializable Snapshot Isolation (SSI) patch.
With Joe's testing and review, and with stress tests adapted from
those used by Florian for his patch, we were able to identify and
fix several bugs. Stability seems good now. We have many tests for
correct behavior which are all looking good. The only solid
benchmarks we have so far show no impact on isolation levels other
than SERIALIZABLE, and a 1.8% increase in run time for a saturation
run of small, read-only SERIALIZABLE transactions against a fully
cached database. Dan has been working on setting up some benchmarks
using DBT-2, but doesn't yet have results to publish. If we can get
more eyes on the code during this CF, I'm hoping we can get this
patch committed this round.

This patch is basically an implementation of the techniques
described in the 2008 paper by Cahill et al. and further developed
in Cahill's 2009 PhD thesis. The techniques needed to be
adapted somewhat because of differences between PostgreSQL and the
two databases used for prototype implementations for those papers
(Oracle Berkeley DB and InnoDB), and there are a few original ideas
from Dan and myself used to optimize the implementation. One reason
for hoping that this patch gets committed in this CF is that it will
leave time to try out some other, more speculative optimizations
before release.

Documentation is not included in this patch; I plan on submitting
that to a later CF as a separate patch. Changes should be almost
entirely within the Concurrency Control chapter. The current patch
has one new GUC which (if kept) will need to be documented, and one
of the potential optimizations could involve adding a new
transaction property which would then need documentation.

The premise of the patch is simple: that snapshot isolation comes so
close to supporting fully serializable transactions that S2PL is not
necessary -- the database engine can watch for rw-dependencies among
transactions, without introducing any blocking, and roll back
transactions as required to prevent serialization anomalies. This
eliminates the need for using the SELECT FOR SHARE or SELECT FOR
UPDATE clauses, the need for explicit locking, and the need for
additional updates to introduce conflict points.
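
To make the premise concrete, here is a toy Python sketch of
rw-dependency tracking. It is not code from the patch: the names
Txn, Tracker, and SerializationFailure are made up for illustration,
all transactions are assumed to be concurrent, commit handling is
left out, and the choice of which transaction to roll back is
arbitrary. It only shows the general idea that reads never block,
and that a transaction with both an incoming and an outgoing
rw-dependency is a candidate for rollback.

```python
from dataclasses import dataclass, field


class SerializationFailure(Exception):
    """Hypothetical error naming the transaction chosen for rollback."""


@dataclass(eq=False)  # compare transactions by identity, not by field values
class Txn:
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)
    in_conflict: bool = False   # a concurrent txn read data this txn wrote
    out_conflict: bool = False  # this txn read data a concurrent txn wrote


class Tracker:
    """Toy model: every transaction started here is treated as concurrent."""

    def __init__(self):
        self.active = []

    def begin(self, name):
        txn = Txn(name)
        self.active.append(txn)
        return txn

    def read(self, txn, key):
        txn.reads.add(key)  # remembered, but never blocks anyone
        for other in self.active:
            if other is not txn and key in other.writes:
                self._flag(reader=txn, writer=other)

    def write(self, txn, key):
        txn.writes.add(key)
        for other in self.active:
            if other is not txn and key in other.reads:
                self._flag(reader=other, writer=txn)

    def _flag(self, reader, writer):
        # rw-dependency: reader saw a version that writer replaces.
        reader.out_conflict = True
        writer.in_conflict = True
        for txn in (reader, writer):
            if txn.in_conflict and txn.out_conflict:
                self.active.remove(txn)
                raise SerializationFailure(txn.name)


def demo_write_skew():
    """Classic write skew: each txn reads both rows, then updates one."""
    tracker = Tracker()
    t1, t2 = tracker.begin("T1"), tracker.begin("T2")
    for txn in (t1, t2):
        tracker.read(txn, "alice_on_call")
        tracker.read(txn, "bob_on_call")
    tracker.write(t1, "alice_on_call")
    try:
        tracker.write(t2, "bob_on_call")
    except SerializationFailure as failure:
        return str(failure)  # name of the transaction rolled back
    return None
```

In this sketch neither transaction waits on the other; the anomaly is
caught only when the second write completes the pair of
rw-dependencies, and one transaction is rolled back so that the
application can retry it.
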

While block-level locking is included in this patch for btree and
GiST indexes, an index relation lock is still used for predicate
locks when a search is made through a GIN or hash index. Finer-grained
locking for these additional index types can be implemented
separately. Dan is looking at bringing btree indexes to finer
granularity, but wants to
have good benchmarks first, to confirm that the net impact is a gain
in performance.
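
As a rough illustration of why granularity matters, the following
Python sketch compares a block-level predicate lock with a
relation-level one. The Target type and the covers() function are
hypothetical and do not reflect the data structures in predicate.c;
the index names and page numbers are invented. The point is only that
a coarser lock is "hit" by more writes, which means more
rw-dependencies are flagged, some of them false positives.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    """Hypothetical lock target: an index, optionally narrowed to a page."""
    relation: str
    page: Optional[int] = None  # None means the whole relation


def covers(lock: Target, write: Target) -> bool:
    """Return True if a predicate lock on `lock` covers a write to `write`."""
    if lock.relation != write.relation:
        return False
    # A relation-level lock covers every page; a block-level lock
    # covers only the page it names.
    return lock.page is None or lock.page == write.page


# Block-level lock, as used for btree and GiST searches in this sketch.
btree_lock = Target("orders_pkey", page=7)
# Relation-level lock, as used for GIN and hash searches in this sketch.
hash_lock = Target("orders_hash_idx")

same_page = covers(btree_lock, Target("orders_pkey", page=7))
other_page = covers(btree_lock, Target("orders_pkey", page=9))
any_page = covers(hash_lock, Target("orders_hash_idx", page=9))
```

Here an insert on page 9 of the btree index does not touch the
block-level lock on page 7, while any insert into the hash index
touches the relation-level lock.
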

Most of the work is in the new predicate.h and predicate.c files,
which total 2,599 lines, over 39% of which are comment lines. There
are 1,626 lines in the new pg_dtester.py.in file, which uses Markus
Wanner's dtester software to implement a large number of correctness
tests. We added 79 lines to lockfuncs.c to include the new
SIReadLock entries in the pg_locks view. The rest of the patch
affects 286 lines (counting an updated line twice) across 25
existing PostgreSQL source files to implement the actual feature.

The code organization and naming issues mentioned here remain:
http://archives.postgresql.org/pgsql-hackers/2010-07/msg00383.php

-Kevin