Re: Log-based replication?

From: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>
To: pgsql-general(at)postgresql(dot)org
Cc: Tobias Brox <tobias(at)nordicbet(dot)com>
Subject: Re: Log-based replication?
Date: 2006-10-21 04:41:23
Message-ID: 200610202141.23563.uwe@oss4u.com
Lists: pgsql-general


For your immediate needs I'd recommend Slony.
It's a reliable replication solution that works quite nicely.
It will sync the replica in near real time, and you won't have any
transactional problems either.

Uwe

On Friday 20 October 2006 21:22, Tobias Brox wrote:
> I would eventually like to:
>
> 1) find a way to (stress) test a mockup database server.
>
> 2) set up near-realtime replication of our production database server
> for pulling out statistics and reports. Minor bugs would be
> permissible in such a setup. We don't need a full-fledged data
> warehouse solution just yet.
>
> 3) set up a failover solution *)
>
> Now, why not just have the production database log all the queries,
> and feed them into a replication database? I guess this solution
> could be used for achieving any of the three things above. This idea
> is very trivial, but I never found any pointers while googling, so I
> assume there are some problems with this approach?
>
> Here are some things I can think of at the moment:
>
> - logging all queries at the production database will slow it down
> considerably (haven't done much testing here)
>
> - transactional model can easily be broken (postgres can log
> transaction starts, commits and rollbacks, and the transactions are
> also supposed to be serializable ... so I don't see the issue?)
>
> - the approach could be disregarded due to the resource consumption on
> the replica server.
>
> - some of the transactions aren't really serializable, or rely on
> the exact timestamp of the operation. **)
>
> - unlike the WAL, the log file doesn't get synced for every
> transaction, and this will cause delays and potentially data loss.
>
> ...anything else?
>
> The simplest thing would be to have one connection open towards the
> replica for every connection made towards the production database, run
> every query in order, and hope for the best - that should work well for
> problems 1 and 2 above.
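A minimal sketch of that naive per-connection replay idea. The tab-separated "<session>\t<sql>" log format and the use of sqlite3 as a stand-in replica are assumptions made for a runnable example; a real setup would parse PostgreSQL's statement log and connect to a PostgreSQL replica.

```python
# Hypothetical sketch: one replica connection per production connection,
# replaying every logged statement in order. Assumes a simplified
# "<session_id>\t<sql>" log line format (not PostgreSQL's actual format)
# and sqlite3 as a stand-in for the replica.
import sqlite3

# A named shared in-memory database, so several connections see one "replica".
REPLICA_URI = "file:replica_demo?mode=memory&cache=shared"

def replay(log_lines):
    # Keep the shared in-memory database alive for the caller.
    keeper = sqlite3.connect(REPLICA_URI, uri=True, isolation_level=None)
    sessions = {}  # one replica connection per production session
    for line in log_lines:
        sid, sql = line.split("\t", 1)
        if sid not in sessions:
            sessions[sid] = sqlite3.connect(REPLICA_URI, uri=True,
                                            isolation_level=None)
        # Replay every statement in log order and hope for the best.
        sessions[sid].execute(sql)
    return keeper
```

As the surrounding discussion notes, this gives no transactional guarantees at all; it simply repeats statements in the order they were logged.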
>
> Still, maybe better (for 2) to filter out only the queries altering
> tables and the transactions ending with a commit - and replay the
> transactions one by one, ordered by commit time. That would save quite
> a few CPU cycles on the replica database compared to the suggestion above.
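The filtered variant above can be sketched as follows: group logged statements into transactions, drop anything rolled back or containing no writes, and replay the survivors serially in commit order. The (session, timestamp, sql) event tuples are an assumed log shape, not anything PostgreSQL emits directly.

```python
# Sketch of commit-ordered replay. Only committed transactions that
# contain at least one write survive; they are replayed one by one,
# ordered by commit timestamp.
WRITES = ("insert", "update", "delete")

def committed_writes(events):
    open_txns = {}   # session -> statements of its currently open transaction
    committed = []   # (commit_timestamp, write statements)
    for session, ts, sql in events:
        head = sql.split(None, 1)[0].lower()
        if head == "begin":
            open_txns[session] = []
        elif head == "commit":
            stmts = [s for s in open_txns.pop(session, [])
                     if s.split(None, 1)[0].lower() in WRITES]
            if stmts:
                committed.append((ts, stmts))
        elif head == "rollback":
            open_txns.pop(session, None)  # discard rolled-back work
        elif session in open_txns:
            open_txns[session].append(sql)
    committed.sort(key=lambda pair: pair[0])  # replay order: by commit time
    return [stmts for _, stmts in committed]
```

Note that replaying in commit order is only safe when the transactions really are serializable, which is exactly the caveat raised in the list of concerns above.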
>
> I was administering a mysql database quite some years ago ... well, it
> was a simple hobby project and we didn't even need transactional
> operations. However, we did need point-in-time recovery. The mysql
> way of fixing this was to write a separate log containing only the
> queries that write to the database. This log was really nice
> to work with, and it could easily be used for replication as well.
>
>
>
> *) the boss has decided that a failover solution is important and should
> be prioritized in 2007 ... even to the point where he's willing to
> throw money at it. If anyone has experience with failover solutions
> built over a SAN, and can do consultancy services for us, please send
> me a private email ;-)
>
> **) We have had problems with transactions doing "if not exists in the
> database, then insert". Two such transactions can run concurrently
> and result in duplicates.
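The check-then-insert race in footnote **) can be illustrated with a deterministic toy: both "transactions" perform their existence check before either one inserts, so both insert. The in-memory list standing in for a table is purely illustrative.

```python
# Toy illustration of the check-then-insert race: T1 and T2 both check
# for the row before either has inserted, so both inserts proceed and
# the row is duplicated. (A unique constraint on the key is the usual
# way to make the database reject the second insert.)
def race_demo():
    table = []                   # stand-in for a table with no unique constraint
    t1_sees_row = "k" in table   # T1: SELECT ... WHERE key = 'k'
    t2_sees_row = "k" in table   # T2: same check, before T1 has inserted
    if not t1_sees_row:
        table.append("k")        # T1: INSERT
    if not t2_sees_row:
        table.append("k")        # T2: INSERT -- the duplicate
    return table
```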
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
UC

--
Open Source Solutions 4U, LLC
1618 Kelly St
Santa Rosa, CA 95401
United States
Phone: +1 707 568 3056
Cell: +1 650 302 2405
Fax: +1 707 568 6416
