From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: error context for vacuum to include block number (atomic progress update) |
Date: | 2019-12-29 20:17:47 |
Message-ID: | 20191229201747.GL12890@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Dec 28, 2019 at 07:21:31PM -0500, Robert Haas wrote:
> On Thu, Dec 26, 2019 at 10:57 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > I agree that's better.
> > I don't see any reason why the progress params need to be updated atomically.
> > So rebasified against your patch.
>
> I am not sure whether it's important enough to make a stink about, but
> it bothers me a bit that this is being dismissed as unimportant. The
> problem is that, if the updates are not atomic, then somebody might
> see the data after one has been updated and the other has not yet been
> updated. The result is that when the phase is
> PROGRESS_VACUUM_PHASE_VACUUM_INDEX, someone reading the information
> can't tell whether the number of index scans reported is the number
> *previously* performed or the number performed including the one that
> just finished. The race to see the latter state is narrow, so it
> probably wouldn't come up often, but it does seem like it would be
> confusing if it did happen.
What used to be atomic was this:
- hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
- hvp_val[1] = vacrelstats->num_index_scans + 1;
=> switch from PROGRESS_VACUUM_PHASE_VACUUM INDEX to HEAP and increment
index_vacuum_count, which is documented as the "Number of completed index
vacuum cycles."
Now, it 1) increments the number of completed scans; and, 2) then progresses
phase to HEAP, so there's a window where the number of completed scans is
incremented, and it still says VACUUM_INDEX.
Previously, if it said VACUUM_INDEX, one could assume that index_vacuum_count
would increase at least once more, and that's no longer true. If someone sees
VACUUM_INDEX and some NUM_INDEX_VACUUMS, and then later sees VACUUM_HEAP or
other later stage, with same (maybe final) value of NUM_INDEX_VACUUMS, that's
different than previous behavior.
It seems to me that a someone or their tool monitoring pg_stat shouldn't be
confused by this change, since:
1) there's no promise about how high NUM_INDEX_VACUUMS will or won't go; and,
2) index_vacuum_count didn't do anything strange like decreasing, or increased
before the scans were done; and,
3) the vacuum can finish at any time, and the monitoring process presumably
knows that when the PID is gone, it's finished, even if it missed intermediate
updates;
The behavior is different from before, but I think that's ok: the number of
scans is accurate, and the PHASE is accurate, even though it'll change a moment
later.
I see there's similar case here:
| /* report all blocks vacuumed; and that we're cleaning up */
| pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
| pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
| PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
heap_blks_scanned is documented as "Number of heap blocks SCANNED", and it
increments exactly to heap_blks_total. Would someone be confused if
heap_blks_scanned==heap_blks_total AND phase=='scanning heap' ? I think they'd
just expect PHASE to be updated a moment later. (And if it wasn't, I agree they
should then be legitimately confused or concerned).
Actually, the doc says:
|If heap_blks_scanned is less than heap_blks_total, the system will return to
|scanning the heap after this phase is completed; otherwise, it will begin
|cleaning up indexes AFTER THIS PHASE IS COMPLETED.
I read that to mean that it's okay if heap_blks_scanned==heap_blks_total when
scanning/vacuuming heap.
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2019-12-29 21:23:54 | Re: [HACKERS] Block level parallel vacuum |
Previous Message | Tom Lane | 2019-12-29 19:22:56 | Re: TAP testing for psql's tab completion code |