Re: [PROPOSAL] VACUUM Progress Checker.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2015-07-22 14:15:19
Message-ID: CA+TgmoYnWtNJRmVWAJ+wGLOB_x8vNOTrZnEDio=GaPi5HK73oQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 22, 2015 at 8:24 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> * An estimate of the estimated time of completion - I liked your view that
> this prediction may be costly to request

I'm saying it may be massively unreliable, not that it may be costly.
(Someone else may have said that it would be costly, but I don't think
it was me.)

>> Most of the progress estimators I have seen over the ~30 years that
>> I've been playing with computers have been unreliable, and many of
>> those have been unreliable to the point of being annoying. I think
>> that's likely to happen with what you are proposing too, though of
>> course like all predictions of the future it could turn out to be
>> wrong.
>
> Almost like an Optimizer then. Important, often annoyingly wrong, needs more
> work.

Yes, but with an important difference. If the optimizer mis-estimates
the row count by 3x or 10x or 1000x, but the plan is OK anyway, it's
often the case that no one cares. Except when the plan is bad, people
don't really care about the method used to derive it. The same is not
true here: people will rely on the progress estimates directly, and
they will really care if they are not right.

> I'm not proposing this feature, I'm merely asking for it to be defined in a
> way that makes it work for more than just VACUUM. Once we have a way of
> reporting useful information, other processes can be made to follow that
> mechanism, like REINDEX, ALTER TABLE, etc. I believe those things are
> important, even if we never get such information for user queries. But I
> hope we do.
>
> I won't get in the way of your search for detailed information in more
> complex forms. Both things are needed.

OK.

One idea I have is to create a system where we expose a command tag
(e.g. VACUUM) plus a series of generic fields whose specific meanings
are dependent on the command tag. Say, 6 bigint counters, 6 float8
counters, and 3 strings up to 80 characters each. So we have a
fixed-size chunk of shared memory per backend, and each backend that
wants to expose progress information can fill in those fields however
it likes, and we expose the results.

This would be sorta like the way pg_statistic works: the same columns
can be used for different purposes depending on what estimator will be
used to access them.
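
To make the idea concrete, here is a minimal sketch of what such a
per-backend slot might look like. Every name in it (BackendProgress,
progress_report_vacuum, the VAC_* slot assignments) is hypothetical and
invented for illustration; none of it is existing PostgreSQL code. The
shared-memory allocation and locking are left out, so this only shows
the fixed-size layout and how a reader would interpret the generic
fields based on the command tag.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Sizes taken from the proposal: 6 bigint, 6 float8, 3 strings of 80 chars. */
#define PROGRESS_NUM_INT     6
#define PROGRESS_NUM_FLOAT   6
#define PROGRESS_NUM_STRING  3
#define PROGRESS_STRING_LEN  80

/* Hypothetical command tags; only VACUUM is sketched here. */
typedef enum { PROGRESS_CMD_NONE = 0, PROGRESS_CMD_VACUUM } ProgressCommand;

/* One fixed-size slot per backend; field meanings depend on 'command'. */
typedef struct BackendProgress
{
    ProgressCommand command;
    int64_t int_field[PROGRESS_NUM_INT];
    double  float_field[PROGRESS_NUM_FLOAT];
    char    string_field[PROGRESS_NUM_STRING][PROGRESS_STRING_LEN + 1];
} BackendProgress;

/* Assumed (not proposed anywhere) slot assignment for VACUUM. */
enum { VAC_INT_TOTAL_BLOCKS = 0, VAC_INT_SCANNED_BLOCKS = 1 };
enum { VAC_STR_RELNAME = 0 };

/* Writer side: a backend running VACUUM fills in the fields it cares about. */
static void
progress_report_vacuum(BackendProgress *slot, const char *relname,
                       int64_t total_blocks, int64_t scanned_blocks)
{
    assert(slot != NULL);
    memset(slot, 0, sizeof(*slot));
    slot->command = PROGRESS_CMD_VACUUM;
    slot->int_field[VAC_INT_TOTAL_BLOCKS] = total_blocks;
    slot->int_field[VAC_INT_SCANNED_BLOCKS] = scanned_blocks;
    /* snprintf truncates names longer than 80 characters and NUL-terminates. */
    snprintf(slot->string_field[VAC_STR_RELNAME],
             sizeof(slot->string_field[VAC_STR_RELNAME]), "%s", relname);
}

/* Reader side: the command tag selects how the generic fields are decoded. */
static int
progress_describe(const BackendProgress *slot, char *buf, size_t buflen)
{
    assert(slot != NULL);
    switch (slot->command)
    {
        case PROGRESS_CMD_VACUUM:
            return snprintf(buf, buflen,
                            "VACUUM %s: %lld of %lld blocks scanned",
                            slot->string_field[VAC_STR_RELNAME],
                            (long long) slot->int_field[VAC_INT_SCANNED_BLOCKS],
                            (long long) slot->int_field[VAC_INT_TOTAL_BLOCKS]);
        default:
            return snprintf(buf, buflen, "no progress reported");
    }
}
```

Note that the sketch reports raw counters only and leaves any
time-of-completion guess to whoever reads them, which sidesteps the
reliability problem discussed above.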

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company