From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PROPOSAL] VACUUM Progress Checker. |
Date: | 2015-07-22 12:24:21 |
Message-ID: | CANP8+jLLLng=cc+jOcGzFPVy9H6Pg5Zv5s0eCM53obLb6GiPwg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 22 July 2015 at 13:00, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Jul 22, 2015 at 3:02 AM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> wrote:
> > For me, the user workflow looks like these....
> >
> > Worried: "Task X is taking ages? When is it expected to finish?"
> > Ops: 13:50
> > <sometime later, about 14:00>
> > Worried: "Task X is still running? But I thought its ETA was 13:50?"
> > Ops: Now says 14:30
> > Worried: "Is it stuck, or is it making progress?"
> > Ops: Looks like its making progress
> > Worried: "Can we have a look at it and find out what its doing?"
>
> How does Ops know that it is making progress? Just because the
> completion percentage is changing?
>
You could, but that is not the way I suggested.
We need
* Some measure of actual progress (the definition of which may vary from
action to action, e.g. blocks scanned)
* Some estimate of the total work required
* An estimate of the estimated time of completion - I liked your view that
this prediction may be costly to request
> > In terms of VACUUM specifically: VACUUM should be able to assess
> beforehand
> > whether it will scan the indexes, or it can just assume that it will
> need to
> > scan the indexes. Perhaps VACUUM can pre-scan the VM to decide how big a
> > task it has before it starts.
>
> Well, we can assume that it will scan the indexes exactly once, but
> the actual number may be more or less; and the cost of rescanning the
> heap in phase 2 is also hard to estimate.
>
> Maybe I'm worrying over nothing, but I have a feeling that if we try
> to do what you're proposing here, we're gonna end up with this:
>
> https://xkcd.com/612/
>
> Most of the progress estimators I have seen over the ~30 years that
> I've been playing with computers have been unreliable, and many of
> those have been unreliable to the point of being annoying. I think
> that's likely to happen with what you are proposing too, though of
> course like all predictions of the future it could turn out to be
> wrong.
Almost like an Optimizer then. Important, often annoyingly wrong, needs
more work.
I'm not proposing this feature, I'm merely asking for it to be defined in a
way that makes it work for more than just VACUUM. Once we have a way of
reporting useful information, other processes can be made to follow that
mechanism, like REINDEX, ALTER TABLE etc.. I believe those things are
important, even if we never get such information for user queries. But I
hope we do.
I won't get in the way of your search for detailed information in more
complex forms. Both things are needed.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Teodor Sigaev | 2015-07-22 13:52:52 | stringify MAKE_SQLSTATE() |
Previous Message | Kouhei Kaigai | 2015-07-22 12:13:44 | Re: fdw_scan_tlist for foreign table scans breaks EPQ testing, doesn't it? |