From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Steve Singer <steve(at)ssinger(dot)info> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Replication Node Identifiers and crashsafe Apply Progress |
Date: | 2013-11-21 11:04:55 |
Message-ID: | 20131121110455.GH7240@alap2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2013-11-19 18:49:27 -0500, Steve Singer wrote:
> >But for that the receiving side needs to know up to where changes have
> >been applied. One relatively easy solution for that is that the
> >receiving side does something like:
> >UPDATE replication_progress SET lsn = '0/10000600' WHERE source_id = ...;
> >before the end of every replayed transaction. But that obviously will
> >quickly cause bloat.
>
> I don't see how this is going to cause any more bloat than what
> trigger-based slony does today with sl_confirm and I don't hear a lot of
> complaints about that being a big problem.
FWIW, bloat on slony's tables (including sl_confirm) is one of the major
reasons I've seen people move away from slony for production, and use it
only for upgrades.
It's only really a problem if you have longrunning transactions on the
standby, but that's a pretty major use-case of having replicas.
> This might be because slony doesn't do a commit on the replica for
> every transaction but groups the transactions together, logical slony
> will behave the same way where we would only commit on SYNC
> transactions.
But yes, the grouping of transactions certainly makes for a major
difference. I don't think we want to force solutions to commit
transactions in batches. Not the least because that obviously prohibits
using a standby as a synchronous replica.
> >* Do we want to allow setting (remote_lsn, remote_timestamp,
> > remote_node_id) via SQL? Currently the remote_node_id can be set as a
> > GUC, but the other's can't. They probably should be a function that
> > can be called instead of GUCs?
>
> A way of advancing the replication pointer via SQL would be nice, otherwise
> I'll just have to write my own C function that I will invoke via SQL (which
> sin't hard but everyone would need to do the same)
But don't you already essentially perform the actual inserts via C in
new slonys? That's mainly the reason I wasn't sure it's needed.
But then, providing a function to do that setup isn't hard.
> What does building up node_id key from (sysid, tlid, remote_dbid,
> local_dbid, name) get us over just mapping from an arbitrary name
> field to a 16 bit node_id ?
It avoids the need to manually assign ids to systems in many cases. I've
seen people complain about that a fair bit.
But it seems pretty clear that a more arbitrary identifier is preferred
so far, so I'll go for that.
Thanks for the comments,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2013-11-21 11:12:27 | Re: Replication Node Identifiers and crashsafe Apply Progress |
Previous Message | David Rowley | 2013-11-21 10:52:37 | b21de4e7b32f868a23bdc5507898d36cbe146164 seems to be two bricks shy of a load |