Re: [PROPOSAL] VACUUM Progress Checker.

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, rahilasyed90(at)gmail(dot)com, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Masao Fujii <masao(dot)fujii(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, pokurev(at)pm(dot)nttdata(dot)co(dot)jp, Vinayak Pokale <vinpokale(at)gmail(dot)com>
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2015-12-10 11:46:47
Message-ID: CAB7nPqSWAkC2pZ+NCf79hUrzNuAELc-AH0150T3NuALrgji7Hw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 10, 2015 at 7:23 PM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2015/12/10 15:28, Michael Paquier wrote:
>> - The progress tracking facility adds a whole level of complexity for
>> very little gain, and IMO this should *not* be part of PgBackendStatus
>> because in most cases its data ends up wasted. We don't expect
>> backends to run such progress reports frequently, do we? My opinion on
>> the matter is that we should define a different collector data for
>> vacuum, with something like PgStat_StatVacuumEntry, then have on top
>> of it a couple of routines dedicated to feeding it data when
>> some work is done on a vacuum job.
>
> I assume your comment here means we should use the stats collector to
> track/publish progress info, is that right?

Yep.

> AIUI, the counts published via stats collector are updated asynchronously
> w.r.t. operations they count and mostly as aggregate figures. For example,
> PgStat_StatTabEntry.blocks_fetched. IOW, we never see
> pg_statio_all_tables.heap_blks_read updating as a scan reads blocks. Maybe
> that helps keep traffic to the pgstat collector at sane levels. But that
> is not to say that I think controlling stats collector traffic was the only
> design consideration behind how such counters are published.
>
> In case of reporting counters as progress info, it seems we might have to
> send too many PgStat_Msg's, for example, for every block we finish
> processing during vacuum. That kind of message traffic may swamp the
> collector. Then we need to see the updated counters from other counters in
> near real-time though that may be possible with suitable (build?)
> configuration.

As far as I understand it, the basic reason why this patch exists is
to allow a DBA to have a hint of the progress of a VACUUM that may be
taking minutes, or say hours, which is something we don't have now. So
it seems perfectly fine to me to report this information
asynchronously with a bit of lag. Why would we need so much precision
in the report?
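
To make the "bit of lag" idea concrete, here is a minimal sketch of
time-based throttling: the local counter is bumped for every block, but a
report is only due once a minimum interval has elapsed since the last one.
This is not code from the patch. ProgressThrottle, throttle_init and
throttle_note_block are hypothetical names, and the caller is assumed to
supply a millisecond clock and to do the actual sending itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for rate-limited progress reports. */
typedef struct ProgressThrottle
{
    int64_t  min_interval_ms;  /* minimum gap between two reports */
    int64_t  last_sent_ms;     /* time of the most recent report */
    bool     sent_any;         /* false until the first report is due */
    uint64_t blocks_done;      /* always up to date, kept locally */
    uint64_t messages_sent;    /* how many reports were considered due */
} ProgressThrottle;

static void
throttle_init(ProgressThrottle *t, int64_t min_interval_ms)
{
    assert(min_interval_ms >= 0);
    t->min_interval_ms = min_interval_ms;
    t->last_sent_ms = 0;
    t->sent_any = false;
    t->blocks_done = 0;
    t->messages_sent = 0;
}

/*
 * Record one processed block.  Returns true when the caller should
 * publish the current counter, false when the last report is recent
 * enough that this update can be skipped.
 */
static bool
throttle_note_block(ProgressThrottle *t, int64_t now_ms)
{
    t->blocks_done++;

    if (t->sent_any && now_ms - t->last_sent_ms < t->min_interval_ms)
        return false;

    t->last_sent_ms = now_ms;
    t->sent_any = true;
    t->messages_sent++;
    return true;
}
```

With a 500 ms interval, the number of reports depends on how long the
vacuum runs rather than on how many blocks it processes, at the price of
the published counter being up to 500 ms stale.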

>> In short, it seems to me that this patch needs a rework, and should be
>> returned with feedback. Other opinions?
>
> Yeah, some more thought needs to be put into design of the general
> reporting interface. Then we also need to pay attention to another
> important aspect of this patch - lazy vacuum instrumentation.

This patch has received a lot of feedback, and it is not in a
committable state, so I marked it as "Returned with feedback" for this
CF.
Regards,
--
Michael
