Re: [HACKERS] database replication

From: Duane Currie <dcurrie(at)sandman(dot)acadiau(dot)ca>
To: DWalker(at)black-oak(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] database replication
Date: 1999-12-30 10:30:58
Message-ID: 199912301030.GAA18698@sandman.acadiau.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Guys,

Now for one of my REALLY rare posts.
Having done a little bit of distributed data systems, I figured I'd
pitch in a couple cents worth.

> 2) The replication system will need to add at least one field to each
> table in each database that needs to be re plicated. &nbsp;This
> field will be a date/time stamp which identifies the &quot; last
> update&quot; of the record. &nbsp;This field will be called PGR_TIME
> for la ck of a better name. &nbsp;Because this field will be used
> from within programs and triggers it can be longer so as to not
> mistake it for a user field.

I just started reading this thread, but I figured I'd throw in a couple
suggestions for distributed data control (a few idioms I've had to
deal with b4):
- Never use time (not reliable from system to system). Use
a version number of some sort that can stay consistent across
all replicas

This way, if a system's time is or goes out of wack, it doesn't
cause your database to disintegrate, and it's easier to track
conflicts (see below. If using time, the algorithm gets
nightmarish)

- On an insert, set to version 1

- On an update, version++

- On a delete, mark deleted, and add a delete stub somewhere for the
replicator process to deal with in sync'ing the databases.

- If two records have the same version but different data, there's
a conflict. A few choices:
1. Pick one as the correct one (yuck!! invisible data loss)
2. Store both copies, pick one as current, and alert
database owner of the conflict, so they can deal with
it "manually."
3. If possible, some conflicts can be merged. If a disjoint
set of fields were changed in each instance, these changes
may both be applied and the record merged. (Problem:
takes a lot more space. Requires a version number for
every field, or persistent storage of some old records.
However, this might help the "which fields changed" issue
you were talking about in #6)

- A unique id across all systems should exist (or something that
effectively simulates a unique id. Maybe a composition of the
originating oid (from the insert) and the originating database
(oid of the database's record?) might do it. Store this as
an extra field in every record.

(Two extra fieldss so far: 'unique id' and 'version')

I do like your approach: triggers and a separate process. (Maintainable!! :)

Anyway, just figured I'd throw in a few suggestions,
Duane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Theo Kramer 1999-12-30 11:16:33 Re: [HACKERS] Source code format vote
Previous Message Adriaan Joubert 1999-12-30 10:17:24 Re: [HACKERS] Source code format vote