From: | Hannu Krosing <hannu(at)skype(dot)net> |
---|---|
To: | Alexey Klyukin <alexk(at)commandprompt(dot)com> |
Cc: | "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Marko Kreen <markokr(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Some questions about mammoth replication |
Date: | 2007-10-12 10:47:44 |
Message-ID: | 1192186064.16408.7.camel@hannu-laptop |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Ühel kenal päeval, R, 2007-10-12 kell 12:39, kirjutas Alexey Klyukin:
> Hannu Krosing wrote:
>
> > > We have hooks in executor calling our own collecting functions, so we
> > > don't need the trigger machinery to launch replication.
> >
> > But where do you store the collected info - in your own replication_log
> > table, or do reuse data in WAL you extract it on master befor
> > replication to slave (or on slave after moving the WAL) ?
>
> We don't use either a log table in database or WAL. The data to
> replicate is stored in disk files, one per transaction.
Clever :)
How well does it scale ? That is, at what transaction rate can your
replication keep up with database ?
> As Joshua said,
> the WAL is used to ensure that only those transactions that are recorded
> as committed in WAL are sent to slaves.
How do you force correct commit order of applying the transactions ?
> >
> > > > Do you make use of snapshot data, to make sure, what parts of WAL log
> > > > are worth migrating to slaves , or do you just apply everything in WAL
> > > > in separate transactions and abort if you find out that original
> > > > transaction aborted ?
> > >
> > > We check if a data transaction is recorded in WAL before sending
> > > it to a slave. For an aborted transaction we just discard all data collected
> > > from that transaction.
> >
> > Do you duplicate postgresql's MVCC code for that, or will this happen
> > automatically via using MVCC itself for collected data ?
>
> Every transaction command that changes data in a replicated relation is
> stored on disk. PostgreSQL MVCC code is used on a slave in a natural way
> when transaction commands are replayed there.
Do you replay several transaction files in the same transaction on
slave ?
Can you replay several transaction files in parallel ?
> > How do you handle really large inserts/updates/deletes, which change say 10M
> > rows in one transaction ?
>
> We produce really large disk files ;). When a transaction commits - a
> special queue lock is acquired and transaction is enqueued to a sending
> queue.
> Since the locking mode for that lock is exclusive a commit of a
> very large transaction would delay commits of other transactions until
> the lock is held. We are working on minimizing the time of holding this
> lock in the new version of Replicator.
Why does it take longer to queue a large file ? dou you copy data from
one file to another ?
> > > > Do you extract / generate full sql DML queries from data in WAL logs, or
> > > > do you apply the changes at some lower level ?
> > >
> > > We replicate the binary data along with a command type. Only the data
> > > necessary to replay the command on a slave are replicated.
> >
> > Do you replay it as SQL insert/update/delete commands, or directly on
> > heap/indexes ?
>
> We replay the commands directly using heap/index functions on a slave.
Does that mean that the table structures will be exactly the same on
both master slave ? That is, do you replicate a physical table image
(maybe not including transaction ids on master) ?
Or you just use lower-level versions on INSERT/UPDATE/DELETE ?
---------------------
Hannu
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2007-10-12 11:00:54 | Re: First steps with 8.3 and autovacuum launcher |
Previous Message | Magnus Hagander | 2007-10-12 10:41:17 | Re: ECPG regression tests |