Re: Streaming replication status

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Greg Smith <greg(at)2ndQuadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication status
Date: 2010-01-12 20:49:01
Message-ID: 1263329342.19367.179665.camel@ebony
Lists: pgsql-hackers

On Tue, 2010-01-12 at 15:11 -0500, Bruce Momjian wrote:
> Stefan Kaltenbrunner wrote:
> > Simon Riggs wrote:
> > > On Tue, 2010-01-12 at 08:24 +0100, Stefan Kaltenbrunner wrote:
> > >> Fujii Masao wrote:
> > >>> On Tue, Jan 12, 2010 at 1:21 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> > >>>> I don't think anybody can deploy this feature without at least some very
> > >>>> basic monitoring here. I like the basic proposal you made back in September
> > >>>> for adding a pg_standbys_xlog_location to replace what you have to get from
> > >>>> ps right now:
> > >>>> http://archives.postgresql.org/pgsql-hackers/2009-09/msg00889.php
> > >>>>
> > >>>> That's basic, but enough that people could get by for a V1.
> > >>> Yeah, I have no objection to adding such a simple capability, which
> > >>> monitors the lag, to the first release. But I guess that, in addition to that,
> > >>> Simon wanted the capability to collect the statistical information about
> > >>> replication activity (e.g., a transfer time, a write time, replay time).
> > >>> So I'd like to postpone it.
> > >> yeah getting that would all be nice and handy but we have to remember
> > >> that this is really our first cut at integrated replication. Being able
> > >> to monitor lag is what is needed as a minimum, more advanced stuff can
> > >> and will emerge once we get some actual feedback from the field.
> > >
> > > Though there won't be any feedback from the field because there won't be
> > > any numbers to discuss. Just "it appears to be working". Then we will go
> > > into production and the problems will begin to be reported. We will be
> > > able to do nothing to resolve them because we won't know how many people
> > > are affected.
> >
> > field is also production usage in my pov, and I'm not sure how we would
> > know how many people are affected by some imaginary issue just because
> > there is a column that has some numbers in it.
> > All of the large features we added in the past got fine-tuned and
> > improved in the following releases, and I expect SR to be one of them
> > that will see a lot of improvement in 8.5+n.
> > Adding detailed monitoring of some random stuff (I don't think there was
> > a clear proposal of what kind of stuff you would like to see) while we
> > don't really know what the performance characteristics are might easily
> > lead to us providing a ton of data and nothing relevant :(
> > What I really think we should do for this first cut is to make it as
> > foolproof and easy to set up as possible and add the minimum required
> > monitoring knobs, without going overboard on too many stats.
>
> I totally agree. If SR isn't going to be useful without being
> feature-complete, we might as well just drop it for 8.5 right now.
>
> Let's get a reasonable feature set implemented and then come back in 8.6
> to improve it. For example, there is no need for a special
> 'replication' user (just use super-user), and monitoring should be
> minimal until we have field experience of exactly what monitoring we
> need.
>
> The final commit-fest is in 5 days --- this is not the time for design
> discussion and feature additions. If we wait for SR to be feature
> complete, with design discussions, etc, we will hopelessly delay 8.5 and
> people will get frustrated. I am not saying we can't talk about design,
> but none of this should be a requirement for 8.5.

We can't add monitoring until we know what the performance
characteristics are. Hmmm. And how will we know what the performance
characteristics are, I wonder?

Anyway, I'll leave it to you now.

--
Simon Riggs www.2ndQuadrant.com
