Re: Logging parallel worker draught

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Benoit Lobréau <benoit(dot)lobreau(at)dalibo(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Subject: Re: Logging parallel worker draught
Date: 2025-01-28 23:41:39
Message-ID: CAA5RZ0voe52P0BidOJfRWvHU0naitf_z45-7k_NMP7qoOf4DMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I feel that observability is important, and I don't understand why we
> would want to have the information for only a portion of the
> functionality's usage (even if it's the most important).

In my opinion, the requirement for parallel usage in
the utility statement is different. In fact, I think the story
there needs improvement as is being discussed in [1]
by introducing something like a VERBOSE.

Here is the kind of case I think will be a bit odd if we
introduce the parallel logging for vacuum in V6_0003
and V6_0004.

postgres=# set client_min_messages = LOG;
SET
postgres=# set log_parallel_workers = "all";
SET
postgres=# vacuum (verbose, parallel 4) t ;
INFO: vacuuming "postgres.public.t"
INFO: launched 2 parallel vacuum workers for index vacuuming (planned: 2)
LOG: launched 2 parallel workers (planned: 2)
INFO: table "t": truncated 17242 to 0 pages
INFO: finished vacuuming "postgres.public.t": index scans: 1
pages: 17242 removed, 0 remain, 17242 scanned (100.00% of total)

There will both be an INFO ( existing behavior ) and LOG ( new behavior ).
This seems wrong to me and there should only really be one
mechanism to log parallel workers for utility statements.
Others may have different opinions.

I also have a few comments on V6

1/
For the new comments.

Change:
"workers is produced"
To
"workers is emitted"

"emit" is used mainly to describe logs.

Also, change "displays" to "emits"

2/
Should the function LoggingParallelWorkers do all the work,
including logging? This way the callers just need to
call the function without checking for the return
value? Currently, the patch just repeats the same
logging calls everywhere it's needed.

Maybe use a void returning function called
LogParallelWorkersIfNeeded for this purpose?

Regards,

Sami

[1] https://www.postgresql.org/message-id/CAA5RZ0trTUL6_vpvW79daGgkp7B-ZtWUc5yrPz5Sjm8Ns4KRgQ@mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-01-28 23:53:17 Re: [PATCH] Improve code coverage of network address functions
Previous Message Masahiko Sawada 2025-01-28 23:00:03 Re: Make COPY format extendable: Extract COPY TO format implementations