Re: [HACKERS] CLUSTER command progress monitor

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: [HACKERS] CLUSTER command progress monitor
Date: 2019-09-06 05:44:18
Message-ID: 20190906054418.GC1608@paquier.xyz
Lists: pgsql-hackers

On Thu, Sep 05, 2019 at 03:17:51PM -0400, Robert Haas wrote:
> Oops. Yeah, that's bogus (as are some of the other things you
> mention). I think we're going to have to fix this by passing down
> some flags to these functions to tell them what kind of progress
> updates to do (or to do none). Or else pass down a callback function
> and a context object, but that seems like it might be overkill.

One idea I had was to pass the command ID as an extra argument to the
update routine, though I am not completely sure that we need this
level of complication either.
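
A minimal sketch of that idea, assuming a simplified per-backend report
structure. All names here (BackendReport, report_start, report_update,
REPORT_NUM_PARAMS) are hypothetical and are not the actual pgstat
progress API; the point is only that an update carrying the wrong
command ID gets ignored instead of overwriting the caller's counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical number of counters per report. */
#define REPORT_NUM_PARAMS 20

/* Hypothetical command IDs, for illustration only. */
typedef enum ReportCommand
{
    REPORT_CMD_INVALID = 0,
    REPORT_CMD_CLUSTER,
    REPORT_CMD_CREATE_INDEX
} ReportCommand;

/* One progress report per backend, as in the current single-slot model. */
typedef struct BackendReport
{
    ReportCommand command;
    int64_t       params[REPORT_NUM_PARAMS];
} BackendReport;

/* Begin reporting for a command, clearing any previous counters. */
void
report_start(BackendReport *r, ReportCommand cmd)
{
    assert(cmd != REPORT_CMD_INVALID);
    memset(r, 0, sizeof(*r));
    r->command = cmd;
}

/*
 * Update one counter, but only if the caller's command ID matches the
 * command that started the report.  Returns false when the update was
 * ignored (wrong command or index out of range).
 */
bool
report_update(BackendReport *r, ReportCommand cmd, int index, int64_t value)
{
    if (index < 0 || index >= REPORT_NUM_PARAMS)
        return false;
    if (r->command != cmd)
        return false;
    r->params[index] = value;
    return true;
}
```

With this shape, an index build invoked underneath CLUSTER would call
report_update() with its own command ID and be silently skipped, at the
cost of threading the ID through every caller.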

> Those are just weaknesses of the infrastructure. Perhaps there is a
> better solution, but there's no intrinsic reason that we can't avoid
> them by careful coding.

Perhaps. The current infrastructure allows the addition of a progress
report in code paths which are isolated from other things. For
CLUSTER, most things are fine as long as the progress is updated in
cluster_rel(); the rest is too internal.

> Well, it might be OK to do that if we're clear that this is the index
> progress-reporting view and the command is CLUSTER but it happens to
> be building an index now so we're showing it here. But I don't see
> how it would work anyway: you can't report cascading progress
> reports in shared memory because you've only got a fixed amount of
> space.

I don't see exactly why we could not switch to a fixed number of
slots, say 8, with one code path to start a progress report, which
pushes an extra report onto the stack, one to remove an entry from the
stack, and a new one to reset the whole thing for a backend. This
would not need much restructuring, of course.
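
To make the slot idea concrete, here is a rough sketch of a fixed-depth
stack of reports. Everything in it is an assumption for illustration:
the names (ProgressStack, progress_push, progress_pop, progress_reset),
the per-report counter count, and the choice to refuse a push when the
stack is full. It ignores locking and the shared-memory layout
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PROGRESS_STACK_DEPTH 8   /* the "say 8" from the text above */
#define PROGRESS_NUM_PARAMS  20  /* hypothetical counters per report */

/* Hypothetical command IDs, for illustration only. */
typedef enum ProgressCommand
{
    PROGRESS_CMD_INVALID = 0,
    PROGRESS_CMD_CLUSTER,
    PROGRESS_CMD_CREATE_INDEX
} ProgressCommand;

typedef struct ProgressEntry
{
    ProgressCommand command;
    uint32_t        relid;
    int64_t         params[PROGRESS_NUM_PARAMS];
} ProgressEntry;

/* Fixed-size, so it fits in a fixed amount of per-backend space. */
typedef struct ProgressStack
{
    int           depth;    /* number of entries currently in use */
    ProgressEntry entries[PROGRESS_STACK_DEPTH];
} ProgressStack;

/* Start a report: push a zeroed entry; fails if all slots are taken. */
bool
progress_push(ProgressStack *st, ProgressCommand cmd, uint32_t relid)
{
    if (st->depth >= PROGRESS_STACK_DEPTH)
        return false;
    ProgressEntry *e = &st->entries[st->depth++];
    memset(e, 0, sizeof(*e));
    e->command = cmd;
    e->relid = relid;
    return true;
}

/* Update a counter of the innermost (most recently started) report. */
bool
progress_update(ProgressStack *st, int index, int64_t value)
{
    assert(st->depth <= PROGRESS_STACK_DEPTH);
    if (st->depth <= 0 || index < 0 || index >= PROGRESS_NUM_PARAMS)
        return false;
    st->entries[st->depth - 1].params[index] = value;
    return true;
}

/* End the innermost report, exposing the enclosing one again. */
bool
progress_pop(ProgressStack *st)
{
    if (st->depth <= 0)
        return false;
    st->depth--;
    return true;
}

/* Drop every report of this backend, e.g. on error cleanup. */
void
progress_reset(ProgressStack *st)
{
    memset(st, 0, sizeof(*st));
}
```

In this sketch CLUSTER would push one entry, the index rebuild would
push a second on top of it, and popping the second leaves the CLUSTER
counters untouched.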

Finally comes the question of what we do for v12. I am adding in
CC Peter, Alvaro being already present, who have been involved in the
commits with CREATE INDEX and REINDEX. It would be sad to revert
this feature, but I'd rather do that now than later regret
releasing the feature as it is currently shaped. Let's see what the
others think.
--
Michael
