Re: Streaming replication status

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication status
Date: 2010-01-12 18:48:38
Message-ID: 4B4CC406.30207@kaltenbrunner.cc
Lists: pgsql-hackers

Simon Riggs wrote:
> On Tue, 2010-01-12 at 08:24 +0100, Stefan Kaltenbrunner wrote:
>> Fujii Masao wrote:
>>> On Tue, Jan 12, 2010 at 1:21 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>>>> I don't think anybody can deploy this feature without at least some very
>>>> basic monitoring here. I like the basic proposal you made back in September
>>>> for adding a pg_standbys_xlog_location to replace what you have to get from
>>>> ps right now:
>>>> http://archives.postgresql.org/pgsql-hackers/2009-09/msg00889.php
>>>>
>>>> That's basic, but enough that people could get by for a V1.
>>> Yeah, I have no objection to add such simple capability which monitors
>>> the lag into the first release. But I guess that, in addition to that,
>>> Simon wanted the capability to collect the statistical information about
>>> replication activity (e.g., a transfer time, a write time, replay time).
>>> So I'd like to postpone it.
>> yeah getting that would all be nice and handy but we have to remember
>> that this is really our first cut at integrated replication. Being able
>> to monitor lag is what is needed as a minimum, more advanced stuff can
>> and will emerge once we get some actual feedback from the field.
>
> Though there won't be any feedback from the field because there won't be
> any numbers to discuss. Just "it appears to be working". Then we will go
> into production and the problems will begin to be reported. We will be
> able to do nothing to resolve them because we won't know how many people
> are affected.

The field also includes production usage, in my view, and I'm not sure
how we would know how many people are affected by some imaginary issue
just because there is a column that has some numbers in it.
All of the large features we added in the past got fine-tuned and
improved in the following releases, and I expect SR to be one of those
that will see a lot of improvement in 8.5+n.
Adding detailed monitoring of some random stuff (I don't think there was
a clear proposal of what kind of stuff you would like to see) while we
don't really know what the performance characteristics are might easily
lead to us providing a ton of data and nothing relevant :(
What I really think we should do for this first cut is to make it as
foolproof and easy to set up as possible and add the minimum required
monitoring knobs, without going overboard on too many stats.

Stefan
