Re: pg_stat_io not tracking smgrwriteback() is confusing

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: pg_stat_io not tracking smgrwriteback() is confusing
Date: 2023-04-24 22:36:24
Message-ID: CAAKRu_aKWkUwCVFh-8=e7qiD3LonQv=-n_AZnpA7ZCCuQxLd7A@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 24, 2023 at 6:13 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2023-04-24 17:37:48 -0400, Melanie Plageman wrote:
> > On Mon, Apr 24, 2023 at 02:14:32PM -0700, Andres Freund wrote:
> > > It starts blocking once "enough" IO is in flight. For things like an immediate
> > > checkpoint, that can happen fairly quickly, unless you have a very fast IO
> > > subsystem. So often it'll not matter whether we track smgrwriteback(), but
> > > when it matters, it can matter a lot.
> >
> > I see. So, it sounds like this is most likely to happen for checkpointer
> > and not likely to happen for other backends who call
> > ScheduleBufferTagForWriteback().
>
> It's more likely, but once the IO subsystem is busy, it'll also happen for
> other users of ScheduleBufferTagForWriteback().
>
>
> > Also, it seems like this (given the current code) is only reachable for
> > permanent relations (i.e. not for IO object temp relation). If other backend
> > types than checkpointer may call smgrwriteback(), we likely have to consider
> > the IO context.
>
> I think we should take it into account - it'd e.g. be interesting to see that
> a COPY is bottlenecked on smgrwriteback() rather than just writing the data.
>

With the quick and dirty attached patch and using your example but with a
pgbench -T200 on my rather fast local NVMe SSD, you can still see quite
a difference.
This is with a stats reset before the checkpoint.

unpatched:

 backend_type      | object   | context | writes | write_time | fsyncs | fsync_time
-------------------+----------+---------+--------+------------+--------+------------
 background writer | relation | normal  |    443 |      1.408 |      0 |          0
 checkpointer      | relation | normal  | 187804 |    396.829 |     47 |    254.226

patched:

 backend_type      | object   | context | writes | write_time         | fsyncs | fsync_time
-------------------+----------+---------+--------+--------------------+--------+------------
 background writer | relation | normal  |    917 | 4.4670000000000005 |      0 |          0
 checkpointer      | relation | normal  | 375798 |            977.354 |     48 |    202.514
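
To put a rough number on the difference, here is a small Python sketch that
computes the patched/unpatched ratio for writes and write_time. The values are
copied from the two outputs above; the names `unpatched`, `patched` and
`ratios` are made up for illustration. It assumes the two runs are otherwise
comparable, so the ratios are only a ballpark.

```python
# Values copied from the unpatched and patched pg_stat_io output above.
unpatched = {
    "background writer": {"writes": 443, "write_time": 1.408},
    "checkpointer": {"writes": 187804, "write_time": 396.829},
}
patched = {
    "background writer": {"writes": 917, "write_time": 4.4670000000000005},
    "checkpointer": {"writes": 375798, "write_time": 977.354},
}


def ratios(before, after):
    """Return the after/before ratio for each backend type and column."""
    return {
        backend: {col: after[backend][col] / before[backend][col] for col in cols}
        for backend, cols in before.items()
    }


result = ratios(unpatched, patched)
for backend, cols in result.items():
    print(f"{backend}: writes x{cols['writes']:.2f}, "
          f"write_time x{cols['write_time']:.2f}")
```

With these numbers, checkpointer writes come out at roughly 2x and
checkpointer write_time at roughly 2.5x.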

I did compare client backend stats before and after pgbench and it made
basically no difference. I'll do a COPY example like you mentioned.

Patch needs cleanup/comments and a bit more work, but I could do with
a sanity check review on the approach.

- Melanie

Attachment Content-Type Size
v1-0001-Add-writeback-to-pg_stat_io-writes.patch text/x-patch 4.8 KB
