| From: | Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> |
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Joel Jacobson <joel(at)trustly(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: pg_stat_activity.waiting_start |
| Date: | 2017-01-08 01:04:04 |
| Message-ID: | a548f389-918b-1c29-391f-1f4edd017eba@BlueTreble.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 12/28/16 10:26 PM, Robert Haas wrote:
> On Wed, Dec 28, 2016 at 1:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> writes:
>>> On 12/28/16 11:25 AM, Tom Lane wrote:
>>>> The idea of just capturing the wait start for heavyweight locks, and
>>>> not other lock types, still seems superior to any of the alternatives
>>>> that have been suggested ...
>>
>>> Is some kind of alarm a viable option for the others? If setting the
>>> alarm is cheap,
>>
>> Setting an alarm is certain to require a gettimeofday and/or a kernel
>> call. It is far from cheap.
>
> If one were OK with a really-really-rough value for when the wait
> started, you could have a slot for a timestamp in PGPROC that is
> cleared by pgstat_report_wait_end() but not set by
> pgstat_report_wait_start(). Then you could have a background process
> that goes through and sets it for all processes advertising a wait
> event every second, or every 10 seconds, or every half second, or
> whatever you want. Of course this doesn't eliminate the overhead but
> it pushes it into the background which, if there are idle CPUs, is
> almost as good.
>
> I previously proposed something similar to this as a way of profiling
> waits and I believe you weren't very supportive of it, but I still
> think these kinds of ideas have some legs. We can neither take the
> approach that timestamp overhead is irrelevant nor the position that
> timestamps are expensive so let's not have any. Some third way has to
> be found.
I like that idea as a compromise. I've spent a lot of time looking at
systems with no process > 90% CPU and IO < 90% yet individual backends
being slow and wished I had some idea of what the backend was doing that
wasn't being accounted for.
BTW, instead of a flag I'd consider using a counter. That would allow
the background process to notice if backends were going into waits that
ultimately got resolved in between scans of PGPROC. That's obviously not
useful for pg_stat_activity, but it would be meaningful for anything
that was trying to accumulate wait time stats.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jim Nasby | 2017-01-08 01:09:22 | Re: pg_stat_activity.waiting_start |
| Previous Message | Jim Nasby | 2017-01-08 00:52:39 | Re: increasing the default WAL segment size |