From: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> |
---|---|
To: | Jonathan Guthrie <jguthrie(at)brokersys(dot)com> |
Cc: | Richard Huxton <dev(at)archonet(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: I'm puzzled by a foreign key constraint problem |
Date: | 2008-11-04 19:40:58 |
Message-ID: | 4910A54A.6010106@postnewspapers.com.au |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Jonathan Guthrie wrote:
> It's possible, likely even. We use a connection pool to manage
> connections to the database and they're doled out as the system sees
> fit. However, at some point every update has to finish such that any
> view of the database will see that update as finished, right?
So that any /new/ snapshot of the state of the database sees it, yes.
If you have long-lived transactions at the SERIALIZABLE isolation level,
they won't (by design) see changes committed by other transactions after
the first statement in the SERIALIZABLE transaction has been issued.
See the manual for more information on concurrency, MVCC, and
transaction isolation.
> There are two reasons why I'm dissatisfied with that answer. First, my
> mandate is basically to create an interface layer for Postgres and then
> port the SQL Server stored procedures without changing how they work.
This is unlikely to be possible. The two databases are extremely
different in some important ways.
In particular, MS SQL Server uses a locking approach to concurrency,
whereas PostgreSQL uses an multi-version approach (more like Oracle).
You should usually be able to make the locking approach work in
PostgreSQL, but there *will* be differences you need to think about in
the way procedures interact when run concurrently. You may need to add
more explicit locking to correct for assumptions that're valid under MS
SQL Server but not under PostgreSQL, or adjust your logic to exploit
multi-versioning properly instead. You will get much better performance
if you adapt your code to the MVCC model instead of trying to stick to
using locking for concurrency control.
It doesn't help that PostgreSQL does not at present support true stored
procedures. There is no top-level CALLable procedure support; instead
PostgreSQL has very powerful functions. The most important difference
this makes is that you CAN NOT perform transaction control operations
(BEGIN, ROLLBACK, COMMIT) within any procedural function in PostgreSQL.
They are inherently wrapped in a transaction. You *can* RAISE EXCEPTION
from PL/PgSQL to trigger a rollback (unless the caller traps and handles
the exception), but there's no way to force a commit or begin a new and
distinct transaction.
OK, that's not absolutely 100% true. You can do it with dblink. You just
don't want to.
Anyway, if your MS SQL server stored procedures expect to be able to
BEGIN a transaction, do some work, COMMIT it, then BEGIN another and do
some more work before COMMITTING that second piece of work, you're going
to have to do some redesign.
> The second reason is because adding permissions doesn't just happen at
> project creation time.
That's fine. Nothing stops you from issuing something like:
BEGIN;
SELECT create_it(blah);
SELECT set_permissions(blah, perms);
COMMIT;
and later using:
SELECT set_permissions(blah, otherperms);
standalone or inside another, unrelated transaction.
The point is that if your initial create and the setting of the initial
permissions must succeed or fail together, they MUST be done within a
single transaction. That is, in fact, the fundamental point of database
transactions.
What you should avoid doing is:
TRANSACTION 1 TRANSACTION 2
BEGIN;
BEGIN;
SELECT create_it(blah);
SELECT set_permissions(blah, perms);
COMMIT;
COMMIT;
... because that just won't work. It sounds like you've got that right,
but you might be doing this:
TRANSACTION 1 TRANSACTION 2
BEGIN;
BEGIN;
SET transaction_isolation = SERIALIZABLE;
-- do something else that triggers
-- freezing of the transaction's snapshot,
-- even something like:
SELECT 1;
SELECT create_it(blah);
COMMIT;
SELECT set_permissions(blah, perms);
COMMIT;
... which will also fail.
> The software I work on is middleware for the
> actual client applications and the client can assign any user
> permissions to access the project just as soon as it knows the project's
> ID, which is one of the values returned by the project creation
> function.
When the project's creation function returns, the created project is not
yet visible to other transactions, even ones begun after the function
returns. It only becomes visible after the transaction in which the
create function was called COMMITs.
This is fine if your later manipulations of the permissions etc happen
within the same transaction as the initial create (as they should).
However, if you're trying to refer to the created record from another
transaction before the one that created the record has committed, it
won't yet be visible.
Furthermore, if the transaction trying to refer to the record created by
some other transaction is at the SERIALIZABLE isolation level, it won't
be able to see the created object at all if the first statement in the
transaction was issued before the record of interest was created. It's
likely that READ COMMITTED is what you want anyway, but you really
/must/ read the documentation on transaction isolation levels etc to
understand and be sure about what you need.
> If the issue is a difference in views because the requests
> come in on different connections, then there's a time window during
> which a valid and unanticipatable request from the client could fail if
> the request happens to use a connection to communicate with the database
> that is different from the one used to create the project.
It has nothing to do with which connection is used. It's all about the
transactions involved; a READ COMMITTED or SERIALIZABLE transaction
cannot see another's dirty (uncomitted) changes, and a SERIALIZABLE
transaction cannot see committed changes newer than its own start time.
> Anyway, while I agree that adding the logic to set permissions to the
> project create function seems the simplest approach to dealing with the
> issue, I'd really rather not change the logic until I've thoroughly
> explored all other options. I CAN guarantee that the other operations
> on a project definitely won't begin until the create is committed.
OK, then you should be having no issues if you're using the READ
COMMITTED isolation level, or if the transaction that performs those
other operations begins after the transaction that created the project
committed.
> So,
> is there any way of causing a commit to not return until all the views
> are consistent?
They're always internally consistent; a transaction's view of the
database state is never contaminated by uncommitted data, and changes
happen atomically between statements as far as the transaction is
concerned (if it's in READ COMMITTED mode) or never become visible at
all (in SERIALIZABLE mode).
Different concurrent transaction's views of the database state are not
DESIGNED to be consistent when compared with each other externally by
some side channel. Two READ COMMITTED transactions that have no dirty
changes will see the same view of the database, but two SERIALIZABLE
transactions started at different times won't. Nor will a SERIALIZABLE
transaction compared to a READ COMMITTED transaction when changes have
been committed after the SERIALIZABLE transaction's snapshot was taken.
And, of course, even two READ COMMITTED transactions don't see the same
view of the database state if either or both of them have made changes.
--
Craig Ringer
From | Date | Subject | |
---|---|---|---|
Next Message | Webb Sprague | 2008-11-04 19:46:54 | Re: Semi-customized queries? Subset of SQL? Accessing the parser? Injection? |
Previous Message | Steve Atkins | 2008-11-04 19:36:15 | Re: Semi-customized queries? Subset of SQL? Accessing the parser? Injection? |