From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, schneider(at)ardentperf(dot)com |
Subject: | Re: monitoring usage count distribution |
Date: | 2023-04-04 23:29:19 |
Message-ID: | 20230404232919.uibzbhjdylk3mlvp@awork3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2023-04-04 14:31:36 -0400, Robert Haas wrote:
> On Mon, Jan 30, 2023 at 6:30 PM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
> > My colleague Jeremy Schneider (CC'd) was recently looking into usage count
> > distributions for various workloads, and he mentioned that it would be nice
> > to have an easy way to do $SUBJECT. I've attached a patch that adds a
> > pg_buffercache_usage_counts() function. This function returns a row per
> > possible usage count with some basic information about the corresponding
> > buffers.
> >
> > postgres=# SELECT * FROM pg_buffercache_usage_counts();
> >  usage_count | buffers | dirty | pinned
> > -------------+---------+-------+--------
> >            0 |       0 |     0 |      0
> >            1 |    1436 |   671 |      0
> >            2 |     102 |    88 |      0
> >            3 |      23 |    21 |      0
> >            4 |       9 |     7 |      0
> >            5 |     164 |   106 |      0
> > (6 rows)
> >
> > This new function provides essentially the same information as
> > pg_buffercache_summary(), but pg_buffercache_summary() only shows the
> > average usage count for the buffers in use. If there is interest in this
> > idea, another approach to consider could be to alter
> > pg_buffercache_summary() instead.
>
> I'm skeptical that pg_buffercache_summary() is a good idea at all
Why? It's about two orders of magnitude faster than querying the equivalent
data by aggregating in SQL. And knowing how many free and dirty buffers there
are over time is quite useful to monitor and to correlate with performance
issues.
> but having it display the average usage count seems like a particularly poor
> idea. That information is almost meaningless.
I agree there are more meaningful ways to represent the data, but I don't
agree that it's almost meaningless. It can give you a rough estimate of
whether data in s_b is referenced or not.
> Replacing that with a six-element integer array would be a clear improvement
> and, IMHO, better than adding yet another function to the extension.
I'd have no issue with that.
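To make the difference between the two representations concrete, here is a small sketch. It is not PostgreSQL code, and the two buffer populations are made-up data chosen only for illustration. It assumes usage counts range from 0 to 5, as in the six-row output quoted above.

```python
from collections import Counter

# Assumption: usage counts range over 0..5, matching the six rows shown above.
MAX_USAGE_COUNT = 5


def usage_histogram(usage_counts):
    """Return a six-element list; element i is the number of buffers with usage count i."""
    counts = Counter(usage_counts)
    return [counts.get(i, 0) for i in range(MAX_USAGE_COUNT + 1)]


def average_usage(usage_counts):
    """Return the mean usage count over the given buffers."""
    return sum(usage_counts) / len(usage_counts)


# Hypothetical workloads: half cold and half hot, versus everything lukewarm.
polarized = [0] * 50 + [5] * 50
lukewarm = [2] * 50 + [3] * 50

for name, data in (("polarized", polarized), ("lukewarm", lukewarm)):
    print(name, average_usage(data), usage_histogram(data))
```

Both populations have the same average, so the average alone cannot tell them apart, while the six-element array can. The average still gives a rough signal when it sits near either end of the range.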
Greetings,
Andres Freund