Re: Use pgBufferUsage for block reporting in analyze

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
Cc: Karina Litskevich <litskevichkarina(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use pgBufferUsage for block reporting in analyze
Date: 2024-08-01 23:11:29
Message-ID: CAD21AoDcT-EHeXk=QOJOmUGxzDcKCQnujTETV+WskKaFetyedw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 31, 2024 at 11:27 PM Anthonin Bonnefoy
<anthonin(dot)bonnefoy(at)datadoghq(dot)com> wrote:
>
> On Wed, Jul 31, 2024 at 9:36 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > Meanwhile, I think we can push 0001 and 0002 patches since they are in
> > good shape. I've updated commit messages to them and slightly changed
> > 0002 patch to write "finished analyzing of table \"%s.%s.%s\" instead
> > of "analyze of table \"%s.%s.%s\".
>
> Wouldn't it make sense to do the same for autoanalyze and write
> "finished automatic analyze of table \"%s.%s.%s\"\n" instead of
> "automatic analyze of table \"%s.%s.%s\"\n"?

I think that the current style is consistent with autovacuum logs:

2024-08-01 16:04:48.088 PDT [12302] LOG: automatic vacuum of table
"postgres.public.test": index scans: 0
pages: 0 removed, 443 remain, 443 scanned (100.00% of total)
tuples: 0 removed, 100000 remain, 0 are dead but not yet removable
removable cutoff: 751, which was 0 XIDs old when operation ended
new relfrozenxid: 739, which is 1 XIDs ahead of previous value
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had
0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 1.466 MB/s
buffer usage: 905 hits, 0 reads, 4 dirtied
system usage: CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.02 s
2024-08-01 16:04:48.125 PDT [12302] LOG: automatic analyze of table
"postgres.public.test"
avg read rate: 5.551 MB/s, avg write rate: 0.617 MB/s
buffer usage: 512 hits, 27 reads, 3 dirtied
system usage: CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.03 s

Since ANALYZE command writes the start log, I think it makes sense to
write "finished" at the end of the operation:

=# analyze verbose test;
INFO: analyzing "public.test"
INFO: "test": scanned 443 of 443 pages, containing 100000 live rows
and 0 dead rows; 30000 rows in sample, 100000 estimated total rows
INFO: finished analyzing table "postgres.public.test"
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 549 hits, 0 reads, 0 dirtied
system usage: CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.03 s
ANALYZE

>
> > Also, regarding 0003 patch, what is the main reason why we want to add
> > WAL usage to analyze reports? I think that analyze normally does not
> > write WAL records much so I'm not sure it's going to provide a good
> > insight for users.
>
> There was no strong reason except for consistency with VACUUM VERBOSE
> output. But as you said, it's not really providing valuable
> information so it's probably better to keep the noise down and drop
> it.

Okay. I think writing WAL usage would not be very noisy and probably
could help some cases where (auto)analyze unexpectedly writes many WAL
records (e.g., writing full page images much), and is consistent with
(auto)vacuum logs as you mentioned. So let's go with this direction
unless others think differently.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Maciek Sakrejda 2024-08-01 23:29:58 Re: Casts from jsonb to other types should cope with json null
Previous Message Thomas Munro 2024-08-01 22:54:02 Re: Why is citext/regress failing on hamerkop?