For perspective: Since our current application framework is about
ten years old now, the Wisconsin State Courts System has put
together a "Long-Term Application Development Group" to review all
aspects of our development and production runtime environments. We
started by reviewing various aspects of these environments and
addressing the question: "Ignoring for the moment how we would get
there, where would we most like to be four years from now?"
One of the elements on the list was our database environment. We're
very happy with PostgreSQL except for one thing -- the lack of
support for serializable transaction isolation means that no
integrity rules can be relied upon except those implemented within
the database engine itself (such as foreign key constraints). Given
the complexity of the schema, the number of programmers, and the
number of queries -- any attempt to address the problem by looking
at particular interactions between particular queries and using
explicit locking would be hard to distinguish from completely
useless.
Serializable transaction isolation is attractive for shops with
active development by many programmers against a complex schema
because it guarantees data integrity with very little staff time --
if a transaction can be shown to always do the right thing when it
is run alone (before or after any other transaction), it will always
do the right thing in any mix of concurrent serializable
transactions. Period.
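To make that guarantee concrete, here is a minimal Python sketch (invented names; not PostgreSQL code) of the classic "write skew" anomaly: two transactions that each preserve an invariant when run alone, but jointly break it under plain snapshot isolation. This is exactly the kind of failure true serializable isolation rules out.

```python
# Hypothetical sketch (invented names): the "write skew" anomaly that
# snapshot isolation permits but serializable isolation forbids.
# Invariant: at least one of the two doctors must remain on call.

def run_under_snapshot_isolation():
    # Shared committed state.
    db = {"alice_on_call": True, "bob_on_call": True}

    # Both transactions take their snapshot before either commits.
    snap_t1 = dict(db)
    snap_t2 = dict(db)

    # T1: Alice goes off call because, per its snapshot, Bob is still on.
    if snap_t1["bob_on_call"]:
        db["alice_on_call"] = False  # disjoint write: no write-write conflict

    # T2: Bob goes off call because, per its snapshot, Alice is still on.
    if snap_t2["alice_on_call"]:
        db["bob_on_call"] = False

    return db

result = run_under_snapshot_isolation()
# Each transaction is correct run alone, yet the interleaving leaves
# nobody on call -- a result no serial order could produce.
print(result["alice_on_call"] or result["bob_on_call"])  # False: invariant broken
```

Under serializable isolation, one of the two transactions would instead fail with a serialization error and be retried, and the invariant would hold.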
I'm going to try to ignore any suggestions that no other database
product has implemented this or that it's not useful. While it
carries a performance cost which makes it untenable in some
situations, and isn't needed when you have only a handful of
programmers developing against only 100 tables, it has its place
and is supported in every major database product I know *except* for
PostgreSQL.
We would like to relax our portability mandate with regard to the
database layer, and use more PostgreSQL-specific features, but are
reluctant to do so while PostgreSQL remains virtually alone in
lacking support for actual serializable transactions. (Other MVCC
databases, like Oracle and recent versions of Microsoft SQL Server,
provide snapshot isolation but also support serializable
isolation.) Given the benefits of using the PostgreSQL-specific
features, it appears that we might be ahead by implementing support
for serializable transaction isolation so that we can relax
portability requirements and thereby avoid developing, in our
framework, our own (portable) implementation of features available
in "native" PostgreSQL.
Between staff and contractors we have over 20 programmers working
here. We would probably draw on that base for some of the work.
Given the nature of the work, we might be able to find a CS grad
student somewhere who would be interested in contributing. If there
were any contractors with PostgreSQL development experience
registered with the state's procurement agency[1], we would be very
likely to draw on them in the effort.
I would like to start new threads for any technical discussions --
what I want to discuss here is how to approach this in a way that
avoids the SE-PostgreSQL doldrums. I know from some responses to
previous discussions of the issue, both on and off list, that there
are other PostgreSQL users who want this, so at a minimum we would
make our patch set available to others; but it would certainly be a
benefit to both us (in avoiding re-basing the patch for every major
release, as well as having extra eyes on the code), and to others
(in having it "just work" on installation, without applying a
third-party patch set) if we can do this right.
One thing which strikes me is that there are at least three fairly
well documented techniques for implementing true serializable
transaction isolation under MVCC:
(1) Strict Two-Phase Locking (S2PL). This is the most widely used,
by far, and probably best understood by developers. It has the
disadvantage of creating so much blocking under some loads that it
isn't always feasible.
(2) Optimistic Concurrency Control (OCC). This generates almost no
overhead in some DBMSs under some read-mostly loads (although I'm
not sure whether that would be true in PostgreSQL), but degrades
more sharply than S2PL under many loads with higher levels of write
contention.
(3) Serializable Snapshot Isolation (SSI). This relies mostly on
snapshot isolation techniques, with detection of possible conflicts
with low overhead. It is a very new technique, appearing in the
literature less than two years ago, and only implemented so far in
two prototypes: Oracle Berkeley DB and InnoDB. Under many loads
performance is very close to Snapshot Isolation (SI), but the
dynamics aren't too well known yet, and there may be some lurking
pessimal cases not yet identified. Serialization errors are higher
than in SI or S2PL, so it should only be used where the software is
prepared to deal with serialization errors in a universal and
systematic fashion, rather than hand-coding recovery on a
query-by-query basis. The more sophisticated identification of
dangerous patterns of read-write dependencies among a set of
concurrent transactions means that the particular reason for any
serialization failure may be hard to explain to the uninitiated.
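As a rough illustration of technique (1), here is a minimal Python sketch (invented names; greatly simplified, with no wait queues or deadlock detection) of the strictness rule in S2PL: every lock acquired during a transaction is held until commit, so a conflicting request blocks until the holder finishes.

```python
# Hypothetical sketch of strict two-phase locking (invented, simplified):
# "S" = shared (read) lock, "X" = exclusive (write) lock, and all locks
# a transaction holds are released only at commit -- the source of the
# heavy blocking described above.

class S2PLManager:
    def __init__(self):
        self.locks = {}  # key -> (mode, {owning transactions})

    def lock(self, txn, key, mode):
        held = self.locks.get(key)
        if held is None:
            self.locks[key] = (mode, {txn})
            return True
        held_mode, owners = held
        if owners == {txn}:
            # Sole owner may re-lock or upgrade ("X" > "S" lexically).
            self.locks[key] = (max(mode, held_mode), {txn})
            return True
        if mode == "S" and held_mode == "S":
            owners.add(txn)  # shared locks are compatible
            return True
        return False  # conflict: a real lock manager would block here

    def commit(self, txn):
        # Strictness: every lock is released only now, at end of transaction.
        for key in list(self.locks):
            mode, owners = self.locks[key]
            owners.discard(txn)
            if not owners:
                del self.locks[key]

mgr = S2PLManager()
mgr.lock("T1", "row1", "X")                 # T1 writes row1
blocked = not mgr.lock("T2", "row1", "S")   # T2's read must wait
mgr.commit("T1")                            # T1 releases everything at commit
print(blocked, mgr.lock("T2", "row1", "S"))  # True True
```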
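For technique (2), a minimal Python sketch (invented names; simplified backward validation only) of OCC: transactions run without taking any locks, recording what they read, and at commit are validated against writes committed since they started -- any overlap forces a restart, which is where the degradation under write contention comes from.

```python
# Hypothetical OCC sketch (invented, simplified): no locks during
# execution; validation at commit time against the log of committed
# write sets, restarting the transaction on any read/write overlap.

class OCCManager:
    def __init__(self):
        self.commit_log = []  # list of (commit_ts, write_set)
        self.ts = 0

    def begin(self):
        self.ts += 1
        return {"start_ts": self.ts, "reads": set(), "writes": set()}

    def read(self, txn, key):
        txn["reads"].add(key)

    def write(self, txn, key):
        txn["writes"].add(key)

    def try_commit(self, txn):
        # Validation: a committed writer that started after us and touched
        # something we read invalidates our reads -> abort and retry.
        for commit_ts, writes in self.commit_log:
            if commit_ts > txn["start_ts"] and writes & txn["reads"]:
                return False
        self.ts += 1
        self.commit_log.append((self.ts, set(txn["writes"])))
        return True

mgr = OCCManager()
t1, t2 = mgr.begin(), mgr.begin()
mgr.read(t1, "balance"); mgr.write(t1, "balance")
mgr.read(t2, "balance"); mgr.write(t2, "balance")
print(mgr.try_commit(t1), mgr.try_commit(t2))  # True False: t2 must restart
```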
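And for technique (3), a minimal Python sketch of SSI's conflict detection (invented names; much simplified from the published design -- among other things, a real implementation would abort only one transaction, not necessarily both): reads leave SIREAD marks, a write against a concurrent reader's mark records a read-write antidependency, and a transaction with both an inbound and an outbound rw edge (the "pivot" of a dangerous structure) triggers a serialization failure.

```python
# Hypothetical SSI sketch (invented, greatly simplified): track, per
# transaction, whether it has an inbound and an outbound rw
# antidependency; a transaction with both is a pivot and is aborted.

class Txn:
    def __init__(self, name):
        self.name = name
        self.in_rw = False   # some concurrent txn wrote what we read
        self.out_rw = False  # we read what some concurrent txn then wrote past
        self.aborted = False

def mark_rw_edge(reader, writer):
    # Edge reader -rw-> writer; abort any txn that becomes a pivot.
    reader.out_rw = True
    writer.in_rw = True
    for t in (reader, writer):
        if t.in_rw and t.out_rw:
            t.aborted = True  # serialization failure: retry the transaction

siread = {}  # key -> set of concurrent readers (SIREAD locks)

def ssi_read(txn, key):
    siread.setdefault(key, set()).add(txn)

def ssi_write(txn, key):
    for reader in siread.get(key, set()):
        if reader is not txn:
            mark_rw_edge(reader, txn)

# The write-skew scenario: each txn reads both rows, then writes one.
t1, t2 = Txn("T1"), Txn("T2")
ssi_read(t1, "alice"); ssi_read(t1, "bob")
ssi_read(t2, "alice"); ssi_read(t2, "bob")
ssi_write(t1, "alice")   # records edge T2 -rw-> T1
ssi_write(t2, "bob")     # records edge T1 -rw-> T2: a dangerous structure
print(t1.aborted or t2.aborted)  # True: the anomaly is caught
```

Note that the detection is conservative: it can raise serialization failures for interleavings that would in fact have been safe, which is part of why SSI's error rate runs higher than SI or S2PL.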
Given that each of these would be the best choice for some
transaction mixes, it might make sense to allow some mapping of the
four ANSI transaction isolation levels to strategies for
implementation. At the risk of generating some backlash on this
particular point, it might allow a Read Committed implementation
which avoids some of the current anomalies, as a user-selectable
alternative to the current implementation. As a hypothetical
example, one might map the ANSI Read Uncommitted mode to what
PostgreSQL currently calls Read Committed, Read Committed to a
get-a-new-snapshot strategy, Repeatable Read to SI, and Serializable
to SSI. (Why do I feel like now would be a good time to duck?)
To give some idea of the scope of development, Michael Cahill added
SSI to InnoDB by modifying 250 lines of code and adding 450 lines of
code; however, InnoDB already had the S2PL option and the prototype
implementation isn't as sophisticated as I feel is necessary for real
production use (particularly regarding the granularity of SIREAD
locks). I'm assuming it would take more to reach real production
quality in PostgreSQL. My SWAG would be to multiply by two or three.
So, with all the above for perspective, assuming management approves
development of this for use in our shop, how would people recommend
that we proceed to maximize the chances that it eventually finds its
way into the PostgreSQL code base? Keep in mind that we're not
panicking here -- we're looking at a potential two to four year time
line.
-Kevin
[1] http://vendornet.state.wi.us/vendornet/vguide/index.asp