Re: Replication identifiers, take 3

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Steve Singer <steve(at)ssinger(dot)info>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication identifiers, take 3
Date: 2014-10-04 20:13:30
Message-ID: 543054EA.50902@BlueTreble.com
Lists: pgsql-hackers

On 10/2/14, 7:28 AM, Robert Haas wrote:
> On Thu, Oct 2, 2014 at 4:49 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> An origin column in the table itself helps tremendously to debug issues with
>> the replication system. In many if not most scenarios, I think you'd want to
>> have that extra column, even if it's not strictly required.
> I like a lot of what you wrote here, but I strongly disagree with this
> part. A good replication solution shouldn't require changes to the
> objects being replicated.
I agree that asking users to modify objects is bad, but I also think that if you have records coming into one table from multiple sources, you will need to know which system they originated on.

Maybe some sort of "hidden" column would work here? That means users don't need to modify anything (including anything doing SELECT *), but the data is there.

As for space concerns, I think the answer there is to somehow normalize the identifiers themselves: store a compact id with each row and keep the mapping from id to source name in one place. That has the added benefit of allowing a rename of a source to propagate to all the data already replicated from that source.
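
To make the normalization idea concrete, here is a minimal sketch in Python. It is only an illustration, not PostgreSQL code and not anything proposed in this thread: the class OriginRegistry and its methods intern, name_of and rename are hypothetical names. Each row carries a small integer origin id, and the mapping from id to name lives in one place.

```python
from typing import Dict


class OriginRegistry:
    """Hypothetical lookup table mapping origin names to compact integer ids."""

    def __init__(self) -> None:
        self._id_by_name: Dict[str, int] = {}
        self._name_by_id: Dict[int, str] = {}
        self._next_id = 1

    def intern(self, name: str) -> int:
        """Return the id for `name`, allocating a new one on first use."""
        if name not in self._id_by_name:
            origin_id = self._next_id
            self._next_id += 1
            self._id_by_name[name] = origin_id
            self._name_by_id[origin_id] = name
        return self._id_by_name[name]

    def name_of(self, origin_id: int) -> str:
        """Resolve a stored id back to the current origin name."""
        return self._name_by_id[origin_id]

    def rename(self, old: str, new: str) -> None:
        """Rename an origin; ids stored with rows are left untouched."""
        if old not in self._id_by_name:
            raise KeyError(old)
        if new in self._id_by_name:
            raise ValueError("origin name already in use: " + new)
        origin_id = self._id_by_name.pop(old)
        self._id_by_name[new] = origin_id
        self._name_by_id[origin_id] = new


# Rows store only the small id, never the name itself.
registry = OriginRegistry()
node_a = registry.intern("node_a")
node_b = registry.intern("node_b")
rows = [(1, node_a), (2, node_b), (3, node_a)]

# Renaming touches the registry only; no row is rewritten.
registry.rename("node_a", "primary_eu")
resolved = [registry.name_of(origin_id) for _, origin_id in rows]
```

After the rename, every row that came from the old "node_a" resolves to the new name without any per-row update, which is the propagation benefit described above. The per-row cost is one small integer rather than a full identifier string.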
