From: | Satoshi Nagayasu <snaga(at)uptime(dot)jp> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Qi Huang <huangqiyx(at)hotmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com> |
Subject: | Re: pg_stat_lwlocks view - lwlocks statistics, round 2 |
Date: | 2012-10-15 16:19:37 |
Message-ID: | 507C3799.9010803@uptime.jp |
Lists: | pgsql-hackers |
2012/10/15 1:43, Tom Lane wrote:
> Satoshi Nagayasu <snaga(at)uptime(dot)jp> writes:
>> (2012/10/14 13:26), Fujii Masao wrote:
>>> The tracing lwlock usage seems to still cause a small performance
>>> overhead even if reporting is disabled. I believe some users would
>>> prefer to avoid such overhead even if pg_stat_lwlocks is not available.
>>> It should be up to a user to decide whether to trace lwlock usage, e.g.,
>>> by using trace_lwlock parameter, I think.
>
>> Frankly speaking, I do not agree with disabling performance
>> instrument to improve performance. DBA must *always* monitor
>> the performance metrics when having such heavy workload.
>
> This brings up a question that I don't think has been honestly
> considered, which is exactly whom a feature like this is targeted at.
> TBH I think it's of about zero use to DBAs (making the above argument
> bogus). It is potentially of use to developers, but a DBA is unlikely
> to be able to do anything about lwlock-level contention even if he has
> the knowledge to interpret the data.
Actually, I'm not a developer. I'm just a DBA, and I needed such an
instrument when I was asked to investigate strange WAL behavior
that produced unexpected, random commit delays under heavy workload.
The other patch I have submitted (the WAL dirty flush statistics patch)
comes from the same need.
https://commitfest.postgresql.org/action/patch_view?id=893
Unfortunately, since I didn't have such an instrument at that time,
I used SystemTap to investigate WAL behavior, including call counts and
wait times, but using SystemTap was really awful, and I thought
PostgreSQL needs some "built-in" instrumentation for DBAs.
I needed to determine the bottleneck around WAL, such as lock contention
and/or write performance of the device, but I couldn't find anything
without an instrument.
I accept that I'm focusing only on WAL-related lwlocks, and that this is
not enough for ordinary DBAs, but I still need it to understand PostgreSQL
behavior without having deep knowledge of and experience with WAL design
and implementation.
> So I feel it isn't something that should be turned on in production
> builds. I'd vote for enabling it by a non-default configure option,
> and making sure that it doesn't introduce any overhead when the option
> is off.
There is another way to eliminate the performance overhead.
As I tried in the first patch, instead of reporting through the pgstat
collector process, each backend could directly increment lwlock
counters allocated in shared memory.
http://archives.postgresql.org/message-id/4FE9A6F5.2080405@uptime.jp
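To illustrate the idea only (this is not the code from the patch, which is C inside the backend), here is a minimal Python sketch of counters kept in a shared memory array that every worker updates in place, with no collector process in between. All names (`create_counters`, `record_acquire`, `snapshot`, `NUM_LWLOCKS`) and the two-field layout are assumptions made up for this example.

```python
import multiprocessing as mp

NUM_LWLOCKS = 8   # hypothetical number of tracked locks
FIELDS = 2        # per lock: [acquire count, blocked count] (assumed layout)


def create_counters(num_locks=NUM_LWLOCKS):
    """Allocate zeroed unsigned 64-bit counters in shared memory."""
    # mp.Array lives in shared memory and carries its own re-entrant lock.
    return mp.Array("Q", num_locks * FIELDS)


def record_acquire(counters, lock_id, blocked):
    """Called by a worker right after it has acquired lock `lock_id`."""
    base = lock_id * FIELDS
    with counters.get_lock():          # guard the read-modify-write
        counters[base] += 1            # one more acquisition
        if blocked:
            counters[base + 1] += 1    # the worker had to wait for it


def snapshot(counters):
    """Return [(acquired, blocked), ...] as a reader such as a view would."""
    with counters.get_lock():
        values = counters[:]           # copy out under the lock
    return [(values[i], values[i + 1]) for i in range(0, len(values), FIELDS)]
```

A reader only takes a snapshot of the array, so nothing has to be sent to or received from another process on the hot path. In this sketch the array is meant to be created once and handed to each worker process at start.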
Here are further benchmark results, including my first patch.
[HEAD]
number of transactions actually processed: 3439971
tps = 57331.891602 (including connections establishing)
tps = 57340.932324 (excluding connections establishing)
[My first patch]
number of transactions actually processed: 3453745
tps = 57562.196971 (including connections establishing)
tps = 57569.197838 (excluding connections establishing)
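To put the two runs side by side, the relative difference can be computed from the figures above. This is only arithmetic on the reported numbers; the helper name `relative_change_pct` is made up for this example, and a single run per build says nothing about run-to-run variance.

```python
def relative_change_pct(baseline, candidate):
    """Percentage change of `candidate` relative to `baseline`."""
    return (candidate - baseline) / baseline * 100.0


# Figures copied from the pgbench output above.
head = {"transactions": 3439971,
        "tps_incl": 57331.891602,
        "tps_excl": 57340.932324}
first_patch = {"transactions": 3453745,
               "tps_incl": 57562.196971,
               "tps_excl": 57569.197838}

for key in head:
    change = relative_change_pct(head[key], first_patch[key])
    print(f"{key}: {change:+.2f}%")
```

All three figures come out roughly 0.4% higher for the first patch, a gap small enough that it may simply be noise between runs.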
Actually, I'm not sure why my first patch makes PostgreSQL faster, :D
but its performance seems better than that of my second patch.
I think it still needs some trick to keep the counters in "pgstat.stat"
across restarts, but it would be more acceptable in terms of
performance overhead.
Regards,
--
Satoshi Nagayasu <snaga(at)uptime(dot)jp>
Uptime Technologies, LLC. http://www.uptime.jp