Re: Proposal for a cascaded master-slave replication system

From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Proposal for a cascaded master-slave replication system
Date: 2003-11-11 20:38:53
Message-ID: m3oeviskrm.fsf@wolfe.cbbrowne.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

In the last exciting episode, JanWieck(at)Yahoo(dot)com (Jan Wieck) wrote:
> I look forward to your comments.

It is not evident from the paper what approach is taken to dealing
with the duplicate key conflicts.

The example:

UPDATE table SET col1 = 'temp' where col = 'A';
UPDATE table SET col1 = 'A' where col = 'B';
UPDATE table SET col1 = 'B' where col = 'temp';

I can think of several approaches to this:

1. The present eRserv code reads what is in the table at the time of
the 'snapshot', and so tries to pass on:

update table set col1 = 'B' where otherkey = 123;
update table set col1 = 'A' where otherkey = 456;

which breaks because at some point, col1 is not unique, irrespective
of what order we apply the changes in.

2. If the contents as at the time of the COMMIT are stored in the log
table, then we would do all three updates in the destination DB, in
order, as shown above.

Either we have to:
a) Store the updated fields in the replication tables somewhere, or
b) Make the third UPDATE wait for the updates to be stored in a
file somewhere.

3. The replication code requires that any given key only be updated
once in a 'snapshot', so that the updates may be unambiguously
partitioned:

UPDATE table SET col1 = 'temp' where col = 'A' ; -- and otherkey = 123
UPDATE table SET col1 = 'A' where col = 'B'; -- and otherkey = 456
-- Must partition here before hitting #123 again --
UPDATE table SET col1 = 'B' where col = 'temp'; -- and otherkey = 123

The third UPDATE may have to be held up until the "partition" is set
up, right?

4. I seem to recall a recent discussion about the possibility of
deferring the UNIQUE constraint 'til the END of a commit, with the
result that we could simplify to

update table set col1 = 'B' where otherkey = 123;
update table set col1 = 'A' where otherkey = 456;

and discover that the UNIQUE constraint was relaxed just long enough
for us to make the TWO changes that in the end combined to being
unique.

None of these look like they turn out totally happily, or am I missing
an approach?
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','ntlug.org').
http://www.ntlug.org/~cbbrowne/languages.html
"Java and C++ make you think that the new ideas are like the old ones.
Java is the most distressing thing to hit computing since MS-DOS."
-- Alan Kay

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rajesh Kumar Mallah 2003-11-11 20:40:40 Re: [off-topic] Bugtracker using PostgreSQL
Previous Message Richard Huxton 2003-11-11 20:13:02 Re: update slow

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2003-11-11 22:01:06 Re: Proposal for a cascaded master-slave replication system
Previous Message Rod Taylor 2003-11-11 19:57:37 Re: About the partial tarballs