Re: Background Processes and reporting

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Oleg Bartunov <obartunov(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Vladimir Borodin <root(at)simply(dot)name>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Background Processes and reporting
Date: 2016-03-14 19:21:57
Message-ID: CA+TgmoYu4-TH7BVaY0eGSXq=zFc9K5DvJhfTs-8ceL8qXq25EA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 12, 2016 at 6:05 AM, Oleg Bartunov <obartunov(at)gmail(dot)com> wrote:
>> So?
>
> So, Robert already has experience with the subject; probably he has had a bad
> experience with the EDB implementation and he'd like to see something better in
> the community version. That's fair and I accept his position.

Bingo - though maybe "bad" experience is not quite as accurate as
"could be better".

> Wait monitoring is one of the popular requirements of Russian companies that
> migrated from Oracle. The overwhelming majority of them use Linux, so I suggest
> having a configure flag for including wait monitoring at compile time
> (default is no wait monitoring), or a GUC variable, which is also off by
> default, so we have zero to minimal overhead of monitoring. That way we'll
> satisfy many enterprises and help them to choose postgres, get feedback
> from production use, and have time to improve the feature.

So, right now we can only display the wait information in
pg_stat_activity. There are a couple of other things that somebody
might want to do:

1. Sample the wait state information across all backends in the
system. On a large, busy system, this figures to be quite cheap, and
the sampling interval could be configurable.

2. Count every instance of every wait event in every backend, and roll
that up either via shared memory or additional stats messages.

3. Like #2, but with timing information.

4. Like #2, but on a per-query basis, somehow integrated with
pg_stat_statements.

The challenge with any of these except #1 is that they are going to
produce a huge volume of data, and, whether you believe it or not, #3
is going to sometimes be crushingly slow. Really. I tend to think
that #1 might be better than #2 or #3, but I'm not unwilling to listen
to contrary arguments, especially if backed up by careful benchmarking
showing that the performance hit is negligible. I wanted to get the
stuff we now have committed first because I have found that it is
best to proceed with these kinds of problems incrementally, not
trying to solve too much in a single commit. Now
that we have the basics, we can build on it, adding more wait events
and possibly more recordkeeping for the ones we have already - but
anything that regresses performance for people not using the feature
is a dead end in my book, as is anything that introduces overall
stability risks.
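
As a rough illustration of option #1, sampling can be done from the
client side by polling pg_stat_activity at an interval. The sketch
below assumes the wait_event_type and wait_event columns exposed by
the committed work; the function name sample_waits, the fetch_rows
callable, and the default interval are hypothetical and are not part
of any patch discussed in this thread.

```python
import time
from collections import Counter
from typing import Callable, Iterable, Optional, Tuple

# Query to run against a server that exposes wait information in
# pg_stat_activity (assumption: columns wait_event_type and wait_event).
SAMPLE_QUERY = (
    "SELECT wait_event_type, wait_event "
    "FROM pg_stat_activity WHERE wait_event IS NOT NULL"
)

Row = Tuple[Optional[str], Optional[str]]


def sample_waits(
    fetch_rows: Callable[[str], Iterable[Row]],
    samples: int,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Counter:
    """Poll the wait state `samples` times and count what was seen.

    fetch_rows is a hypothetical callable that runs the query and
    returns (wait_event_type, wait_event) rows. The cost is one query
    per interval, regardless of how many waits occur in between.
    """
    counts: Counter = Counter()
    for i in range(samples):
        for wait_type, wait_event in fetch_rows(SAMPLE_QUERY):
            # Skip backends that are not waiting, in case the caller's
            # fetch_rows does not apply the WHERE clause itself.
            if wait_event is not None:
                counts[(wait_type, wait_event)] += 1
        # Do not sleep after the final sample.
        if i + 1 < samples:
            sleep(interval_s)
    return counts
```

With a driver such as psycopg2, fetch_rows could be a small wrapper
that executes the query on a cursor and returns cursor.fetchall().
The counts are only a statistical picture: waits shorter than the
interval can be missed entirely, which is the trade-off against #2
and #3.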

I think the way forward from here is that Postgres Pro should (a)
rework their implementation to work with what has already been
committed, (b) consider carefully whether they've done everything
possible to contain the performance loss, (c) benchmark it on several
different machines and workloads to see how much performance loss
there is, and (d) stop accusing me of acting in bad faith.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
