Re: Catalog/Metadata consistency during changeset extraction from wal

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Andres Freund" <andres(at)2ndquadrant(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Florian Pflug" <fgp(at)phlo(dot)org>,<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Catalog/Metadata consistency during changeset extraction from wal
Date: 2012-06-25 16:42:41
Message-ID: 4FE84EB10200002500048AA2@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I bet for a lot of replication systems, the answer is "do a full
> resync". In other words, we either forbid the operation outright
> when the table is enabled for logical replication, or else we emit
> an LCR that says, in effect, "transaction 12345 monkeyed with the
> table, please resync". It strikes me that it's really the job of
> some higher-level control logic to decide what the "correct"
> behavior is in these cases; the decoding process doesn't really
> have enough information about what the user is trying to do to
> make a sensible decision anyway.

This is clearly going to depend on the topology. You would
definitely want to try to replicate the DDL for the case on which
Simon is focused (which seems to me to be essentially physical
replication of catalogs with logical replication of data changes
from any machine to all others). What you do about transactions in
flight is the hard part. You could try to suppress concurrent DML
of the same objects or have some complex matrix of rules for trying
to resolve the transactions in flight. I don't see how the latter
could ever be 100% accurate.

In our shop it is much easier. We always have one database which is
the only valid source for any tuple, although rows from many such
databases can be in one table, and one row might replicate to many
databases. Thus, we don't want automatic replication of DDL.

- When a column is going to be added to the source machines, we
first add it to the targets, with either a default or as
NULL-capable.

- When a column is going to be deleted from the source machines, we
make sure it is NULL-capable or has a default on the replicas.
We drop it from all replicas after it is gone from all sources.

- If a column is changing name or is changing to a fundamentally
different type we need to give the new column a new name, have
triggers to convert old to new (and vice versa) on the replicas,
and drop the old after all sources are updated.

- If a column is changing in a minor way, like its precision, we
make sure the replicas can accept either format until all sources
have been converted. We update the replicas to match the sources
after all sources are converted.

We most particularly *don't* want DDL to replicate automatically,
because the schema changes are deployed along with related software
changes, and we like to pilot any changes for at least a few days.
Depending on the release, the rollout may take a couple months, or
we may slam in out everywhere a few days after the first pilot
deployment.

So you could certainly punt all of this for any release as far as
Wisconsin Courts are concerned. We need to know table and column
names, before and after images, and some application-supplied
metadata.

I don't know that what we're looking for is any easier (although I
doubt that it's any harder), but I'm starting to wonder how much
mechanism they can really share. The 2Q code is geared toward page
format OIDs and data values for automated DDL distribution and
faster replication, while we're looking for something which works
between releases, architectures, and OSes. We keep coming back to
the idea of one mechanism because both WAL and a logical transaction
stream would have "after" tuples, although they need them in
different formats.

I think the need for truly logical replication is obvious, since so
many different people have developed trigger-based versions of that.
And it sure seems like 2Q has clients who are willing to pay for the
other.

Perhaps the first question is: Is there enough in common between
logical replication (and all the topologies that might be created
with that) and the proposal on the table (which seems to be based
around one particular topology with a vague notion of bolting
logical replication on to it after the fact) to try to resolve the
differences in one feature? Or should the "identical schema with
multiple identical copies" case be allowed to move forward more or
less in isolation, with logical replication having its own design if
and when someone wants to take it on? Two non-compromised features
might be cleaner -- I'm starting to feel like we're trying to design
a toaster which can also water your garden.

-Kevin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-06-25 17:09:38 Re: Catalog/Metadata consistency during changeset extraction from wal
Previous Message Tom Lane 2012-06-25 16:35:02 Re: [COMMITTERS] pgsql: Remove sanity test in XRecOffIsValid.