Re: RFC: replace pg_stat_activity.waiting with something more descriptive

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date: 2015-09-14 11:50:05
Message-ID: CAPpHfdt373MTh0NvOaTbExWCBbzBBoFKnHGonKPfLp9=fpZEXQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 14, 2015 at 2:12 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

> On Mon, Sep 14, 2015 at 2:25 PM, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
> wrote:
>
>> On Sat, Sep 12, 2015 at 2:05 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>> wrote:
>>
>>> On Thu, Aug 6, 2015 at 3:31 PM, Ildus Kurbangaliev <
>>> i(dot)kurbangaliev(at)postgrespro(dot)ru> wrote:
>>> >
>>> > On 08/05/2015 09:33 PM, Robert Haas wrote:
>>> >>
>>> >>
>>> >> You're missing the point. Those multi-byte fields have additional
>>> >> synchronization requirements, as I explained in some detail in my
>>> >> previous email. You can't just wave that away.
>>> >
>>> > I see that now. Thank you for the point.
>>> >
>>> > I've looked deeper and found that PgBackendStatus is not a suitable
>>> > place for keeping information about low-level waits. PgBackendStatus
>>> > is really used to track high-level information about a backend. This
>>> > is why auxiliary processes don't have a PgBackendStatus: they have no
>>> > such information to expose. But when it comes to low-level wait
>>> > events, auxiliary processes are as useful to monitor as backends are.
>>> > WAL writer, checkpointer, bgwriter etc. use LWLocks as well. It is
>>> > unclear why they couldn't be monitored.
>>> >
>>>
>>> I think the chances of background processes getting stuck on an LWLock
>>> are much lower than for backends, as they do their work periodically.
>>> As an example, WALWriter will take WALWriteLock to write the WAL, but
>>> there will never be much contention for WALWriter. With
>>> synchronous_commit = on, the backends themselves write the WAL, so
>>> WALWriter won't do much in that case. With synchronous_commit = off,
>>> backends won't write the WAL, so WALWriter won't face any contention
>>> unless some buffers have to be written by bgwriter or checkpoint for
>>> which WAL is not yet flushed, which I don't think would lead to any
>>> contention.
>>>
>>
>> Hmm, synchronous_commit is a per-session variable: some transactions could
>> run with synchronous_commit = on and others with synchronous_commit = off.
>> This is a very popular feature of PostgreSQL: achieving better performance
>> by making non-critical transactions asynchronous while leaving critical
>> transactions synchronous. Thus, contention for WALWriteLock between
>> backends and WALWriter could be real.
>>
>>
> I think it is difficult to say that this can lead to contention, due to
> the periodic nature of WALWriter, but I don't deny that there is a chance
> for background processes to have contention.
>

We can't know in advance whether there will be contention. This is why we
need monitoring.

>>> I am not denying that there could be some contention in rare
>>> scenarios for background processes, but I think tracking them is not as
>>> important as tracking the LWLocks for backends.
>>>
>>
>> I would be more careful about calling some scenarios rare. As DBMS
>> developers we should do our best to avoid contention for LWLocks: any
>> contention, not only between backends and background processes. One may
>> assume that high LWLock contention is a rare scenario in general. Since
>> we're here, though, we evidently don't think so.
>> You claim that there couldn't be contention for WALWriteLock between
>> backends and WALWriter. This is unclear to me: I think there could be.
>>
>
> I think there are more things that background processes could wait on
> than LWLocks, and I think they are important to track, but that could be
> done separately from tracking them for pg_stat_activity. For example, we
> have a pg_stat_bgwriter view; can't we think of tracking
> bgwriter/checkpointer wait information in that view, and similarly track
> other background processes in other views, if a related view exists, or
> create a new one to track all background processes?
>
>
>> Nobody opposes tracking wait events for backends and tracking them for
>> background processes. I think we need to track both in order to provide
>> full picture to DBA.
>>
>>
> Sure, that is good to do, but can't we do it separately in another patch?
> I think in this patch let's just work on wait_events for backends.
>

Yes, but I think we should have a design for tracking wait events for every
process before implementing this for backends only.

>>> Also, as we are planning to track the wait_event information in
>>> pg_stat_activity along with other backend information, it will not make
>>> sense to include information about backend processes in this variable,
>>> as pg_stat_activity just displays information about backend processes.
>>>
>>
>> I'm not objecting to tracking only backend information in
>> pg_stat_activity. I think we should also have some other way of tracking
>> wait events for background processes. We should think it through before
>> extending pg_stat_activity, to avoid design issues later.
>>
>>
> I think we can discuss it if you see any specific problems or want
> specific things to be clarified, but sorting out the complete design of
> waits monitoring before this patch could extend the scope of this patch
> beyond what is needed.
>

I think we need to sort out at least some part of this design: where to
store the current wait event information for every process, not only for
backends. Otherwise, we can't be sure we're moving towards waits monitoring
rather than away from it.
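
To make the storage question concrete, here is a minimal sketch in C. It is
not taken from any patch in this thread, and every name in it (ProcWaitSlot,
wait_slots, MAX_PROCS, the wait_class values, wait_start and friends) is
hypothetical. It assumes one slot per process, auxiliary processes included,
and packs the current wait event into a single 32-bit word so that a reader
never sees a half-updated value, which is one possible way to sidestep the
multi-byte synchronization concern raised upthread. A plain static array
stands in for shared memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical: slots for backends plus auxiliary processes. */
#define MAX_PROCS 128

/* Hypothetical wait classes; 0 means "not waiting". */
enum wait_class {
    WAIT_NONE = 0,
    WAIT_LWLOCK = 1,
    WAIT_HEAVYWEIGHT_LOCK = 2,
    WAIT_IO = 3
};

/* One slot per process: class in the high 8 bits, event id in the low 24. */
typedef struct ProcWaitSlot {
    _Atomic uint32_t wait_event;
} ProcWaitSlot;

/* Stand-in for a shared-memory array; zero-initialized, so nobody waits. */
static ProcWaitSlot wait_slots[MAX_PROCS];

static inline uint32_t wait_event_pack(enum wait_class cls, uint32_t event_id)
{
    return ((uint32_t) cls << 24) | (event_id & 0x00FFFFFFu);
}

static inline enum wait_class wait_event_class(uint32_t packed)
{
    return (enum wait_class) (packed >> 24);
}

static inline uint32_t wait_event_id(uint32_t packed)
{
    return packed & 0x00FFFFFFu;
}

/* Called by the process itself just before it starts to wait. */
static inline void wait_start(int slot, enum wait_class cls, uint32_t event_id)
{
    assert(slot >= 0 && slot < MAX_PROCS);
    atomic_store_explicit(&wait_slots[slot].wait_event,
                          wait_event_pack(cls, event_id),
                          memory_order_relaxed);
}

/* Called by the process itself once the wait is over. */
static inline void wait_end(int slot)
{
    assert(slot >= 0 && slot < MAX_PROCS);
    atomic_store_explicit(&wait_slots[slot].wait_event, 0,
                          memory_order_relaxed);
}

/* Called by a monitoring reader; a single atomic load, no locking. */
static inline uint32_t wait_read(int slot)
{
    assert(slot >= 0 && slot < MAX_PROCS);
    return atomic_load_explicit(&wait_slots[slot].wait_event,
                                memory_order_relaxed);
}
```

In this sketch a process would call wait_start() before blocking and
wait_end() afterwards, and a view could call wait_read() for every slot
regardless of whether the process is a backend. Relaxed ordering is used
because the value is only a monitoring hint; whether that is acceptable is
itself an assumption.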

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
