Re: SLRU statistics

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SLRU statistics
Date: 2020-05-14 06:27:37
Message-ID: fd082235-c8dc-3d99-3703-1ba183dc252e@oss.nttdata.com
Lists: pgsql-hackers

On 2020/05/07 13:47, Fujii Masao wrote:
>
>
> On 2020/05/03 1:59, Tomas Vondra wrote:
>> On Sat, May 02, 2020 at 12:55:00PM +0200, Tomas Vondra wrote:
>>> On Sat, May 02, 2020 at 03:56:07PM +0900, Fujii Masao wrote:
>>>>
>>>> ...
>>>
>>>>>> Another thing I found is; pgstat_send_slru() should be called also by
>>>>>> other processes than backend? For example, since clog data is flushed
>>>>>> basically by checkpointer, checkpointer seems to need to send slru stats.
>>>>>> Otherwise, pg_stat_slru.flushes would not be updated.
>>>>>>
>>>>>
>>>>> Hmmm, that's a good point. If I understand the issue correctly, the
>>>>> checkpointer accumulates the stats but never really sends them because
>>>>> it never calls pgstat_report_stat/pgstat_send_slru. That's only called
>>>>> from PostgresMain, but not from CheckpointerMain.
>>>>
>>>> Yes.
>>>>
>>>>> I think we could simply add pgstat_send_slru() right after the existing
>>>>> call in CheckpointerMain, right?
>>>>
>>>> Checkpointer sends off activity statistics to the stats collector in
>>>> two places, by calling pgstat_send_bgwriter(). What about calling
>>>> pgstat_send_slru() just after pgstat_send_bgwriter()?
>>>>
>>>
>>> Yep, that's what I proposed.
>>>
>>>> In my previous email, I mentioned checkpointer just as an example.
>>>> So we probably need to investigate which processes, other than
>>>> checkpointer, should send slru stats. I guess that at least autovacuum
>>>> worker, logical replication walsender and parallel query worker (maybe
>>>> this is already covered by some parallel query mechanism. Sorry, I've
>>>> not checked that) would need to send their slru stats.
>>>>
>>>
>>> Probably. Do you have any other process type in mind?
>
> No. For now what I have in mind are just checkpointer, autovacuum worker,
> logical replication walsender and parallel query worker. It seems that
> logical replication worker and syncer already send slru stats via
> pgstat_report_stat().

Let me come back to this topic. Having read the code again, the logical
walsender reports its stats at exit via the pgstat_beshutdown_hook()
process-exit callback, but it doesn't report them while it's running.
This problem is not specific to SLRU stats, so we would need to consider
how to handle stats from the logical walsender separately from SLRU stats.

The autovacuum worker also reports its stats at exit via
pgstat_beshutdown_hook(). Unlike the logical walsender, an autovacuum worker
does not keep running for the lifetime of the server; it exits after it
finishes its vacuum or analyze work. So reporting the stats only at exit is
acceptable in the autovacuum worker case, and no extra code is needed for it
to report SLRU stats.

A parallel worker is in the same situation as an autovacuum worker. Its
lifetime is basically short, and its stats are reported at exit via
pgstat_beshutdown_hook().

pgstat_beshutdown_hook() reports the stats only when MyDatabaseId is valid.
The checkpointer calls pgstat_beshutdown_hook() at exit, but doesn't report
the stats because its MyDatabaseId is invalid. It doesn't report the SLRU
stats while it's running, either. As we discussed upthread, we need to make
the checkpointer call pgstat_send_slru() just after pgstat_send_bgwriter().
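
To illustrate the idea, here is a toy model of the proposed change. It is not
PostgreSQL source: only the names pgstat_send_bgwriter() and
pgstat_send_slru() come from this thread, and their bodies here are
stand-ins. The counters, slru_flush_page() and checkpointer_iteration() are
hypothetical names I made up for the sketch.

```c
/*
 * Toy model, NOT PostgreSQL source. The real pgstat_send_bgwriter() and
 * pgstat_send_slru() send messages to the stats collector; here they only
 * update plain counters so that the effect of the call order is visible.
 */
#include <assert.h>

/* Hypothetical stand-ins for process-local and collector-side state. */
static long local_slru_flushes = 0;     /* accumulated in the checkpointer */
static long collector_slru_flushes = 0; /* what pg_stat_slru.flushes would show */
static long bgwriter_msgs_sent = 0;

/* Hypothetical: one SLRU page flushed by this process. */
static void
slru_flush_page(void)
{
    local_slru_flushes++;
}

/* Stand-in body: just counts how often the bgwriter stats were sent. */
static void
pgstat_send_bgwriter(void)
{
    bgwriter_msgs_sent++;
}

/* Stand-in body: hand the local counters to the "collector" and reset them. */
static void
pgstat_send_slru(void)
{
    assert(local_slru_flushes >= 0);
    collector_slru_flushes += local_slru_flushes;
    local_slru_flushes = 0;
}

/*
 * Hypothetical, simplified checkpointer loop iteration. With send_slru = 0
 * it models the current behavior; with send_slru = 1 it models the proposal.
 */
static void
checkpointer_iteration(int pages_to_flush, int send_slru)
{
    for (int i = 0; i < pages_to_flush; i++)
        slru_flush_page();

    pgstat_send_bgwriter();
    if (send_slru)
        pgstat_send_slru();     /* proposed: right after pgstat_send_bgwriter() */
}
```

In this model, checkpointer_iteration(3, 0) leaves collector_slru_flushes
untouched while local_slru_flushes grows, which corresponds to
pg_stat_slru.flushes not being updated. A later checkpointer_iteration(2, 1)
moves all accumulated flushes to the collector side.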

However, even if we do this, the stats updated during the checkpointer's last
activity (e.g., the shutdown checkpoint) seem not to be reported, because
pgstat_beshutdown_hook() doesn't report the stats in the checkpointer case.
Do we need to address this issue? If yes, we would need to change
pgstat_beshutdown_hook() or register another checkpointer-exit callback
that sends the stats. Thoughts?
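
A rough sketch of the second option, again as a toy model and not as a patch.
The MyDatabaseId check mirrors the behavior described above; the model_*
function names and the counters are hypothetical, and the real callback would
have to be registered through the usual process-exit machinery, which is not
modeled here.

```c
/*
 * Toy model, NOT PostgreSQL source. It only shows why a shutdown hook that
 * checks MyDatabaseId skips the checkpointer, and how a separate exit
 * callback could still send the SLRU stats.
 */
#include <assert.h>

typedef unsigned int Oid;
#define InvalidOid      ((Oid) 0)
#define OidIsValid(oid) ((oid) != InvalidOid)

/* The checkpointer is not connected to any database. */
static Oid  MyDatabaseId = InvalidOid;

/* Hypothetical counters standing in for local and collector-side stats. */
static long pending_slru_flushes = 0;
static long reported_slru_flushes = 0;

/* Stand-in body: move pending counts to the "collector" and reset them. */
static void
pgstat_send_slru(void)
{
    reported_slru_flushes += pending_slru_flushes;
    pending_slru_flushes = 0;
}

/* Model of the described behavior: report only with a valid MyDatabaseId. */
static void
model_beshutdown_hook(void)
{
    if (OidIsValid(MyDatabaseId))
        pgstat_send_slru();
}

/* Hypothetical extra checkpointer-exit callback: always sends the stats. */
static void
model_checkpointer_exit_callback(void)
{
    pgstat_send_slru();
}
```

With MyDatabaseId left invalid, model_beshutdown_hook() drops whatever the
shutdown checkpoint accumulated, while model_checkpointer_exit_callback()
reports it. Whether a separate callback is preferable to changing
pgstat_beshutdown_hook() itself is exactly the open question.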

The startup process is in the same situation as the checkpointer. It reports
the stats neither at exit nor while it's running. But, as with the logical
walsender, this problem is not specific to SLRU stats. We would need to
consider how to handle stats from the startup process separately from SLRU
stats.

Therefore, what we can do right now seems to be to make the checkpointer
report the SLRU stats while it's running. The other issues need more time to
investigate... Thoughts?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
