Re: [HACKERS] CONSTRAINTS...

From: Vadim Mikheev <vadim(at)krs(dot)ru>
To: Jan Wieck <jwieck(at)debis(dot)com>
Cc: maillist(at)candle(dot)pha(dot)pa(dot)us, sferac(at)bo(dot)nettuno(dot)it, djackson(at)cpsgroup(dot)com, pgsql-hackers(at)hub(dot)org
Subject: Re: [HACKERS] CONSTRAINTS...
Date: 1999-01-15 05:58:58
Message-ID: 369ED921.D2B784BB@krs.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jan Wieck wrote:
>
> > Two things define data visibility: SnapShot & CommandId.
> > We would have to save them for deffered rules and restore them
> > before run rule actions. But there is one issue: for what
> > scans old visibility should be used? There are scans from
> > user query and there are scans added by rule action. Ok,
> > let's assume that for added scans current visibility will be used
> > - this is what we need for RI rules (actually, something more -
> > see below).
>
> I addressed that problem (different visibility required for
> scans in one command) also in my other mail.
>
> Anyway, I just checked what happens in the following case:
>
> T1: begin;
> T1: select ...
>
> T2: update ...
>
> T1: select ... (gets the same (old) values)
>
> That's the result as long as T1 doesn't run in READ COMMITTED
> mode. And that's fine, because it doesn't have to worry
> about concurrent transactions of others.
>
> So the only problem left is the different visability. I think
> it is possible to change the visibility code not to check
> against the global command counter. Instead it might look at
> a command counter value in the range table entry related to
> the scan node. So the rewrite system and tcop could place
> the correct values there during query rewrite/processing.

Why you talk about CommandID only? What about SnapShot data?
The difference between scans in SERIALIZABLE/READ COMMITTED
isolevels is that in SERIALIZABLE mode all queries use
the same SnapShot data (array of running xactions)
and in READ COMMITTED mode new SnapShot data to use is created
for each query.

CommandId defines visibility of self-changes.
SnapShot defines visibility of concurrent changes.

>
> The range table of a rules rewritten parsetree is a
> combination of the range tables from the original user query,
> applied view rules and the the rule itself. For deferred
> rules, only the those coming with the rule action itself must
> have the command counter at COMMIT. All others must get the

Actually, not at COMMIT but when SET CONSTRAINT IMMEDIATE
is called. COMMIT just imlicitly switches into immediate mode.

> command counter value that is there when the query that fired
> this rule get's executed.
>
> The deferred querytrees can first be held in a new list of
> the rewritten querytree for the original user statement. The
> rewrite system puts into the rangetable entries
> USE_CURRENT_CMDID or USE_COMMIT_CMDID depending on where they
> are coming from.
>
> Before tcop calls the executor, a new function in the rewrite
> system is called to set the actual values for the command
> counter to use into the rangetable entries for one query and
> it's deferred ones. Then it adds all the deferred queries to
> the global deferred list and runs the query itself.
>
> At commit time, when all the deferred queries have to get
> run, those RTE's in them having USE_COMMIT_CMDID are set to
> the command counter at commit before running the plans.
> Voila.
>
> > > And it's a problem I've came across just writing this note
> > > where MVCC already could have broken rewrite rule system
> > > semantics.
> >
> > How?
>
> Yes it did!
>
> If a transaction runs in READ COMMITTED mode, the scan for
> the rules actions (performed first) could have different
> results than that for the original query (performed last).
>
> For now I see only one solution. READ COMMITTED is forbidden
> for anything that invokes non-view rules. This check must be
> done in the tcop and SPI, because saved SPI plans can be run
> without invoking the rewrite system at any time. So the plan
> must remember somewhere if READ COMMITTED is allowed for it
> or not.

READ COMMITTED will be default mode when writers in this
mode will be supported...

The solution is to use the same SnapShot data for both
action' and original scans.

But there are other abilities for inconsistances in READ COMMITTED
mode:

create table t (x int);
create table tlog (xold int, xnew int);
insert into t values (1);
create rule r as on update to t do
insert into tlog values (old.x, new.x);

Now consider two concurrent

update t set x = 2 where x = 1;
and
update t set x = 3 where x = 1;

: two rows will be inserted into tlog - (1,2) and (1,3) -
by rule actions run BEFORE original queries, but actually
only one update will be succeeded - another one will see
not 1 in t.x after first update commit and so row will not
be updated by second update...

One approach is to force FOR UPDATE OF "OLD"-table
in action' INSERT INTO ... SELECT, another one is
to run rule action AFTER original query (wouldn't
it be possible having ability to directly set
scan visibilities?).

Actually, for non-locking RI implementation (using dirty read)
constraint checking must occure AFTER data changes are made
(i.e. deffered untill query end): this will allow PK deleters
notice that there are concurrent FK inserters, UK inserters see
that someone else tries to insert the same key, etc, wait for
concurrent COMMIT/ABORT and make appropriate things after that.

More letters will follow, but maybe after week-end only...

Vadim

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Charles Hornberger 1999-01-15 06:26:08 identical joins on identical tables produce different output
Previous Message Thomas G. Lockhart 1999-01-15 05:36:04 Time zones