Re: [PROPOSAL] VACUUM Progress Checker.

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: "Syed, Rahila" <Rahila(dot)Syed(at)nttdata(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2015-08-10 15:20:26
Message-ID: CANP8+jKYNeX0P1BR3zt+826RGfWDrOMLVMS8h-5-ZRSdqHG5aw@mail.gmail.com
Lists: pgsql-hackers

On 10 August 2015 at 15:59, Syed, Rahila <Rahila(dot)Syed(at)nttdata(dot)com> wrote:

> Hello,
>
> > When we're in Phase 2 or 3, don't we need to report the total number of
> > pages scanned, or the percentage of table pages scanned, as well?
>
> The total heap pages scanned need to be reported with phase 2 or 3.
> A complete progress report needs to have numbers from each phase when
> applicable.
>
> > Phase 1. Report 2 integer counters: heap pages scanned and total heap
> > pages,
> > 1 float counter: percentage_complete and progress message.
> > Phase 2. Report 2 integer counters: index pages scanned and total
> > index pages(across all indexes) and progress message.
> > Phase 3. 1 integer counter: heap pages vacuumed.
>
> Sorry for being unclear here. What I meant to say is, each phase of a
> command will correspond to a slot in COMMAND_NUM_SLOTS. Each phase will be
> a separate array element and will comprise n integers, n floats, and a
> string. So, in the view reporting progress, a VACUUM command can have 3
> entries, one for each phase.
>

VACUUM has 3 phases now, but since phases 2 and 3 repeat, you can have an
unbounded number of phases. But that assumes that we don't count truncation
as a 4th phase of VACUUM...

SELECT statements also have a variable number of phases: hash, materialize,
and sort nodes all act as blocking nodes, where you cannot progress to the
next phase until the current one is complete, and you don't know for certain
how much data will come in later phases.

I think the best you'll do is an array of pairs of values [(current blocks,
total blocks), ... ]
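
As a rough illustration of the array-of-pairs idea, here is a minimal
Python sketch. It is not PostgreSQL code: the names append_phase, advance
and overall_fraction are hypothetical, and the block counts in the usage
lines are made up. Each phase contributes one (current blocks, total
blocks) pair, and a repeated phase simply appends another pair.

```python
from typing import List, Tuple

# Hypothetical representation: one (current_blocks, total_blocks) pair per phase.
PhaseProgress = Tuple[int, int]


def append_phase(progress: List[PhaseProgress], total_blocks: int) -> None:
    """Start a new phase with zero blocks done out of total_blocks."""
    progress.append((0, total_blocks))


def advance(progress: List[PhaseProgress], blocks: int = 1) -> None:
    """Record that `blocks` more blocks were processed in the latest phase."""
    current, total = progress[-1]
    progress[-1] = (current + blocks, total)


def overall_fraction(progress: List[PhaseProgress]) -> float:
    """Fraction complete across the phases known so far (0.0 if none)."""
    done = sum(current for current, _ in progress)
    total = sum(total for _, total in progress)
    return done / total if total else 0.0


# Usage with made-up numbers: a heap scan followed by an index pass.
progress: List[PhaseProgress] = []
append_phase(progress, 100)   # phase 1: heap scan, 100 blocks
advance(progress, 100)
append_phase(progress, 40)    # phase 2: index pages, 40 blocks
advance(progress, 10)
print(progress, overall_fraction(progress))
```

The fraction only covers phases that have already been registered, so it
can drop when a repeated phase 2 or 3 appends a new pair.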

Knowing how many phases there are is a tough problem. I think the only way
forwards is to admit that we will publish our best initial estimate of
total workload size and then later we may realise it was wrong and publish
a better number (do until complete). It's not wonderful, but la vida es
loca.
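
One way to picture "publish an estimate, then publish a better one" is the
small Python sketch below. It is only an illustration under assumptions:
the WorkloadEstimate class and its method names are hypothetical and do not
correspond to any existing PostgreSQL interface.

```python
from typing import Optional


class WorkloadEstimate:
    """Tracks work done against a total that may be revised later."""

    def __init__(self, initial_total: int) -> None:
        self.done = 0
        self.total = initial_total  # best initial estimate, may be wrong

    def report(self, done: int, revised_total: Optional[int] = None) -> None:
        """Publish current progress and, optionally, a better total."""
        self.done = done
        if revised_total is not None:
            self.total = revised_total
        # If we have already done more than the estimate, the estimate
        # was too low; raise it so the fraction never exceeds 1.0.
        if self.total < self.done:
            self.total = self.done

    def fraction(self) -> float:
        """Fraction complete against the latest published total."""
        return self.done / self.total if self.total else 0.0


# Usage with made-up numbers.
est = WorkloadEstimate(initial_total=100)
est.report(50)                      # halfway through the first estimate
est.report(60, revised_total=200)   # later we realise the total was wrong
print(est.done, est.total, est.fraction())
```

The reported fraction can move backwards when the total is revised upwards,
which is the price of admitting the initial estimate was wrong.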

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
