From: Peter Wilson <petew(at)yellowhawk(dot)co(dot)uk>
To: gordan(at)bobich(dot)net
Subject: Re: Replication Using Triggers
Date: 2008-01-18 16:34:07
Message-ID: 4790D4FF.2090604@yellowhawk.co.uk
Lists: pgsql-general
gordan(at)bobich(dot)net wrote:
>
>
> On Fri, 18 Jan 2008, Erik Jones wrote:
>
>>> This is what I have in mind:
>>>
>>> Have a plperl function that creates connections to all servers in the
>>> cluster (replication partners), and issues the supplied write query
>>> to them, possibly with a tag of some sort to indicate it is a
>>> replicated query (to prevent circular replication).
>>>
>>> Have a post execute trigger that calls the above replication function
>>> if the query was issued directly (as opposed to replicated), and
>>> passes it the query it just executed if it was successful.
Not sure here if you mean literally the SQL statement that was executed - in which
case you have all sorts of problems with sequences and volatile functions (random(),
now(), nextval()) returning different values on each node.
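A minimal sketch of why replaying the literal SQL text on each node diverges. The `Node` class and seeded generators here are illustrative stand-ins (not PostgreSQL internals): each node re-evaluates a volatile expression like random() against its own state, so the "same" statement stores different data.

```python
import random

class Node:
    """Toy database node that replays SQL statements locally."""
    def __init__(self, seed):
        # Stands in for the node's own random() state, which is
        # independent of every other node's state.
        self.rng = random.Random(seed)
        self.rows = []

    def execute(self, stmt):
        # Statement-based replication: every node re-evaluates the
        # volatile expression instead of receiving the master's value.
        if stmt == "INSERT INTO t VALUES (random())":
            self.rows.append(self.rng.random())

master, replica = Node(seed=1), Node(seed=2)
for node in (master, replica):
    node.execute("INSERT INTO t VALUES (random())")

print(master.rows[0] == replica.rows[0])  # False: the nodes silently diverged
```

The same problem hits nextval() whenever the nodes' sequence states differ, e.g. after a rolled-back transaction consumed values on only one node.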
>>>
>>> If the replication failed on any node, the whole thing gets rolled back.
>>
>
>>> This would effectively give star topology synchronous replication
>>> with very little effort, and no need for any external code. Am I
>>> missing something obvious that would prevent this from working? It
>>> would give replication capabilities better than MySQL's (which can
>>> only handle ring based multi-master replication) for the sake of
>>> about 100 lines of code. None of the functionality required is
>>> new to PostgreSQL, either.
But there are plenty of existing solutions that do this a lot better. Slony-I is the
most popular. My favourite is a spin on the old db_mirror that used to be part
of the Postgres distribution.
I can't talk about how Slony works, but db_mirror uses a very fast 'C' function
to capture changes in a set of simple replication tables. A replication process
then takes data from those tables and replicates it (using actual values, not the
SQL statement) to any number of other servers. If one of the servers is down,
the data remains in the replication tables until that node returns (or is removed).
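The db_mirror-style flow above can be sketched like this. All names here (`Peer`, `ReplicationQueue`, `capture`, `drain`) are illustrative, not db_mirror's actual API - the point is the shape: a trigger captures row *values* into per-node queues, and a separate replicator drains them, leaving rows queued while a node is down.

```python
from collections import deque

class Peer:
    """A downstream server; `up` simulates reachability."""
    def __init__(self, name):
        self.name, self.up, self.rows = name, True, []

    def apply(self, row):
        if not self.up:
            raise ConnectionError(self.name)
        self.rows.append(row)

class ReplicationQueue:
    def __init__(self, peers):
        # One pending queue per peer, like per-node entries in the
        # replication tables.
        self.pending = {p.name: deque() for p in peers}
        self.peers = {p.name: p for p in peers}

    def capture(self, row):
        # The trigger's job: record the actual values for every peer.
        for q in self.pending.values():
            q.append(row)

    def drain(self):
        # The replication process's job: push queued rows to each peer.
        for name, q in self.pending.items():
            peer = self.peers[name]
            while q:
                try:
                    peer.apply(q[0])
                except ConnectionError:
                    break  # peer down: keep its rows queued
                q.popleft()

a, b = Peer("a"), Peer("b")
queue = ReplicationQueue([a, b])
b.up = False                       # node "b" is unreachable
queue.capture({"id": 1, "val": "x"})
queue.drain()
print(a.rows)                      # row delivered to the healthy node
print(len(queue.pending["b"]))     # 1: row still queued for the down node
b.up = True                        # node returns...
queue.drain()
print(b.rows)                      # ...and catches up from the queue
```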
The problem with db_mirror was that the replication process was written in Perl.
This worked fine for simple tests but was ridiculously slow for replicating
tables holding big BYTEA structures. I re-wrote the replication code in 'C' and
it can replicate just about arbitrarily complex transactions is close to real-time.
You seem to be re-inventing the wheel, and the re-invention is not quite as
round as the existing wheel :-)
>>>
>>> Is there an existing implementation of this? Perhaps a perl program
>>> that creates the required triggers and stored procedures from looking
>>> at a schema?
>>
>> What you've described here would be pretty simple to implement.
>> However, I think you've greatly underestimated the performance issues
>> involved. If you need to push data to multiple databases before each
>> transaction commits I think you'll find that pretty slow.
>
> Only if transactions are used. I'm basing the requirements on "at least
> as good as MySQL", which this would meet without transactions. If
> transactions are wanted they could be enabled, otherwise it could just
> be fire and forget asynchronous replication a-la MySQL. Having a choice
> between transactions and speed is good. :-)
Synchronous replication tends to imply it works on all servers simultaneously or
not on any. If any server fails a transaction it's rolled back on all servers.
What you're describing sounds asynchronous to me.
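A sketch of the all-or-nothing behaviour that "synchronous" implies: stage the write on every node first, and if any node fails, roll back everywhere so nothing commits anywhere. The class and function names are hypothetical; real systems do this with two-phase commit rather than these toy methods.

```python
class SyncPeer:
    def __init__(self, name, healthy=True):
        self.name, self.healthy = name, healthy
        self.staged, self.committed = [], []

    def prepare(self, row):
        # Stage the write without making it visible yet.
        if not self.healthy:
            raise RuntimeError(f"{self.name} unavailable")
        self.staged.append(row)

    def commit(self):
        self.committed.extend(self.staged)
        self.staged.clear()

    def rollback(self):
        self.staged.clear()

def replicate_sync(peers, row):
    """Commit on every peer or on none."""
    try:
        for p in peers:
            p.prepare(row)       # stage on every node first
    except RuntimeError:
        for p in peers:
            p.rollback()         # any failure rolls back everywhere
        return False
    for p in peers:
        p.commit()
    return True

peers = [SyncPeer("a"), SyncPeer("b", healthy=False)]
ok = replicate_sync(peers, {"id": 1})
print(ok)                                   # False: one node was down
print(all(not p.committed for p in peers))  # True: nothing committed anywhere
```

Fire-and-forget per-node delivery, by contrast, lets healthy nodes commit while failed ones lag - which is asynchronous, whatever the trigger timing.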
>
> One thing I haven't quite thought of a good way to do with this approach
> is the equivalent of the useful (albeit deprecated) LOAD DATA FROM
> MASTER command, that gets the server in sync by dropping and re-loading
> all the tables from the master(s) (or rather, peers in a multi-master
> star replication), and enables it in the replication. It would be neater
> than requiring downtime or global write locks. But I guess that could
> wait until version 2. :)
That's one thing. The other problem that most trigger-based replication systems
have is propagating schema changes - because (I think) you can't
attach triggers to schema changes.
>
> Thanks.
> Gordan
>
Pete
--
Peter Wilson : http://www.whitebeam.org