From: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
---|---|
To: | Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> |
Cc: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: Track the amount of time waiting due to cost_delay |
Date: | 2024-12-09 11:48:30 |
Message-ID: | CAFiTN-t+=7v1zObd1v2hpvvbs9S0WUye8R8cj4Ddc4KP2=4K8g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Dec 9, 2024 at 2:51 PM Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2024-12-06 18:31, Bertrand Drouvot wrote:
> > Hi,
> >
> > On Thu, Dec 05, 2024 at 10:43:51AM +0000, Bertrand Drouvot wrote:
> >> Yeah, people would likely use this new field to monitor long running
> >> vacuum.
> >> Long enough that this error should be acceptable. Do you agree?
> >
> > OTOH, adding the 100% accuracy looks as simple as v9-0002 attached
> > (0001 is
> > same as for v8), so I think we should provide it.
>
This Idea looks good to me. Here are some comments
1.
+ Total amount of time spent in milliseconds waiting due to
<xref linkend="guc-vacuum-cost-delay"/>
+ or <xref linkend="guc-autovacuum-vacuum-cost-delay"/>. In case
of parallel
+ vacuum the reported time is across all the workers and the leader. The
+ workers update the column no more frequently than once per second, so it
+ could show slightly old values.
+ </para></entry>
I think this waiting is influenced due to cost delay as well as cost
limit GUCs because here we are counting total wait time and that very
much depends upon how frequently we are waiting and that's completely
driven by cost limit. Isn't it?
2.
+ if (IsParallelWorker())
+ {
+ instr_time time_since_last_report;
+
+ INSTR_TIME_SET_ZERO(time_since_last_report);
+ INSTR_TIME_ACCUM_DIFF(time_since_last_report, delay_end,
+ last_report_time);
+ nap_time_since_last_report += INSTR_TIME_GET_MILLISEC(delayed_time);
+
+ if (INSTR_TIME_GET_MILLISEC(time_since_last_report) >
WORKER_REPORT_DELAY_INTERVAL)
+ {
+ pgstat_progress_parallel_incr_param(PROGRESS_VACUUM_TIME_DELAYED,
+ nap_time_since_last_report);
+ nap_time_since_last_report = 0;
+ last_report_time = delay_end;
+ }
+ }
Does it make sense to track this "nap_time_since_last_report" in a
shared variable instead of local in individual workers and whenever
the shared variable crosses WORKER_REPORT_DELAY_INTERVAL we can report
this would avoid individual reporting from different workers. Since
we are already computing the cost balance in shared variable i.e.
VacuumSharedCostBalance, or do you think it will complicate the code?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Kirill Reshke | 2024-12-09 11:50:16 | Re: WARNING: missing lock on database "postgres" (OID 5) @ TID (0,4) |
Previous Message | John Naylor | 2024-12-09 11:39:36 | Re: Proposal for Updating CRC32C with AVX-512 Algorithm. |