Re: relfilenode statistics

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bertranddrouvot(dot)pg(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, andres(at)anarazel(dot)de, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: relfilenode statistics
Date: 2024-06-11 06:35:23
Message-ID: 20240611.153523.2257649547287994807.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Mon, 10 Jun 2024 08:09:56 +0000, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote in
> Hi,
>
> On Fri, Jun 07, 2024 at 09:24:41AM -0400, Robert Haas wrote:
> > On Thu, Jun 6, 2024 at 11:17 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > If we just want to keep prior stats upon a relation rewrite, we can just copy
> > > the stats from the old relfilenode. Or we can decide that those stats don't
> > > really make sense anymore, and start from scratch.
> >
> > I think we need to think carefully about what we want the user
> > experience to be here. "Per-relfilenode stats" could mean "sometimes I
> > don't know the relation OID so I want to use the relfilenumber
> > instead, without changing the user experience" or it could mean "some
> > of these stats actually properly pertain to the relfilenode rather
> > than the relation so I want to associate them with the right object
> > and that will affect how the user sees things." We need to decide
> > which it is. If it's the former, then we need to examine whether the
> > goal of hiding the distinction between relfilenode stats and relation
> > stats from the user is in fact feasible. If it's the latter, then we
> > need to make sure the whole patch reflects that design, which would
> > include e.g. NOT copying stats from the old to the new relfilenode,
> > and which would also include documenting the behavior in a way that
> > will be understandable to users.
>
> Thanks for sharing your thoughts!
>
> Let's take the current heap_blks_read as an example: it currently survives
> a relation rewrite and I guess we don't want to change the existing user
> experience for it.
>
> Now say we want to add "heap_blks_written" (like in this POC patch) then I think
> that it makes sense for the user to 1) query this new stat from the same place
> as the existing heap_blks_read: from pg_statio_all_tables and 2) to have the same
> experience as far as the relation rewrite is concerned (keep the previous stats).
>
> To achieve the rewrite behavior we could:
>
> 1) copy the stats from the OLD relfilenode to the relation (like in the POC patch)
> 2) copy the stats from the OLD relfilenode to the NEW one (could be in a dedicated
> field)
>
> The advantage of 1) is that the behavior is consistent with the current heap_blks_read
> and that the user could still see the current relfilenode stats (through a new API)
> if they want to.
>
> > In my experience, the worst thing you can do in cases like this is be
> > somewhere in the middle. Then you tend to end up with stuff like: the
> > difference isn't supposed to be something that the user knows or cares
> > about, except that they do have to know and care because you haven't
> > thoroughly covered up the deception, and often they have to reverse
> > engineer the behavior because you didn't document what was really
> > happening because you imagined that they wouldn't notice.
>
> My idea was to move all that is in pg_statio_all_tables to relfilenode stats
> and 1) add new stats to pg_statio_all_tables (like heap_blks_written), 2) ensure
> the user can still retrieve the stats from pg_statio_all_tables in such a way
> that it survives a rewrite, 3) provide dedicated APIs to retrieve
> relfilenode stats but only for the current relfilenode, 4) document this
> behavior. This is what the POC patch is doing for heap_blks_written (would
> need to do the same for heap_blks_read and friends) except for the documentation
> part. What do you think?

In my opinion, it is certainly strange that bufmgr is aware of
relation kinds, but introducing relfilenode stats to avoid this skew
doesn't seem to be the best way, as it invites inconclusive arguments
like the one raised above. The fact that we transfer counters from old
relfilenodes to new ones indicates that we are not really interested
in counts by relfilenode. If that's the case, wouldn't it be simpler
to call pgstat_count_relation_buffer_read() from bufmgr.c and then
branch according to relkind within that function? If you're concerned
about the additional branch, some ingenuity may be needed.
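
To make the suggestion concrete, here is a rough sketch of the shape I
have in mind. It is a toy, not PostgreSQL code: DemoRelation,
DemoRelCounts and demo_read_buffer() are hypothetical stand-ins, the
counters are simplified, and only the name
pgstat_count_relation_buffer_read() comes from the proposal above. The
point is only that the call site in the buffer manager stays free of
relkind knowledge and the branch lives inside the stats function.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in relkind codes for this sketch (assumed, defined locally). */
#define RELKIND_RELATION 'r'
#define RELKIND_INDEX    'i'

/* Hypothetical, simplified per-relation counters. */
typedef struct DemoRelCounts
{
    long heap_blks_read;
    long idx_blks_read;
} DemoRelCounts;

/* Hypothetical, minimal relation descriptor. */
typedef struct DemoRelation
{
    char          relkind;
    DemoRelCounts counts;
} DemoRelation;

/*
 * Sketch of the proposed function: called unconditionally by the buffer
 * manager, it decides by relkind which counter to advance.
 */
static void
pgstat_count_relation_buffer_read(DemoRelation *rel)
{
    assert(rel != NULL);

    switch (rel->relkind)
    {
        case RELKIND_INDEX:
            rel->counts.idx_blks_read++;
            break;
        default:
            /* everything else is counted as a heap block in this toy */
            rel->counts.heap_blks_read++;
            break;
    }
}

/* Stand-in for the bufmgr.c call site: no relkind branch here. */
static void
demo_read_buffer(DemoRelation *rel)
{
    /* ... locate or load the block (omitted) ... */
    pgstat_count_relation_buffer_read(rel);
}
```

With this layout the counters stay keyed by relation, so a rewrite that
assigns a new relfilenode would not need any copying of stats in the
sketch.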

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
