From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Kamigishi Rei <iijima(dot)yun(at)koumakan(dot)jp>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: BUG #17245: Index corruption involving deduplicated entries |
Date: | 2021-11-02 23:48:03 |
Message-ID: | CAH2-WzkkJeP8Aq7-Un7b=86PX5sGnmPbJ9XZHb3aLWypVNYWkg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Mon, Nov 1, 2021 at 9:25 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> Thank you for updating the patch. For parallel vacuum fix, the fix
> looks good to me.
I pushed the same patch earlier (with small tweaks).
> We cannot know whether indexes were actually vacuumed by
> parallel workers by checking the shared index stats. While thinking
> it's a good idea to use on an assertion in nbtdedup.c I'm concerned a
> bit about relying on another module to detect a vacuum bug.
I agree that it's not ideal that we're relying on the nbtdedup.c
assertion failure, but there are lots of ways that the test could fail
(with assertions enabled). We're relying on nbtdedup.c, but it's not
too delicate in practice.
> The problem with this bug is that leader process misses to vacuum
> indexes that must be processed by the leader.
I think that the underlying problem is that there is too much
indirection, for no real benefit. A more explicit state machine
approach would make testing easy, and would minimize divergence
between parallel and serial VACUUM. I think that it would be easier
and less risky overall to do this than it would be to have
comprehensive testing of the existing design.
As I said on the other thread, we should decide *what we're allowed to
do* up front, in the leader, probably when
compute_parallel_vacuum_workers() is called. After that point,
parallel workers (and the parallel leader) don't have to care about
*why* the required/allowed work is a certain way -- they just need to
actually do the work. We can explicitly remember "can [or cannot] have
ambulkdelete() performed by parallel worker", and "can [or cannot]
have amvacuumcleanup() performed by parallel worker" as per-index
immutable state, stored in shared memory. This immutable state is a
little like a query plan.
We can also have mutable state that describes the current status of
each index, like "still needs vacuuming" and "still needs cleanup" --
this tracks the overall progress for each index, plus maybe extra
instrumentation. It will then be easy to notice when invalid state
transitions seem to be necessary -- just throw an error at that point.
Also, we can directly check "Does what I asked for match the work that
parallel workers said they completed using the same data structure?",
at the end of each parallel VACUUM. That nails things down.
BTW, do we really need separate
do_serial_processing_for_unsafe_indexes() and do_parallel_processing()
functions? I think that we could just have one event loop function if
it was structured as a state machine. Parallel workers could see
indexes that can only be processed by the leader, but that would be
okay -- they'd understand not to touch those indexes (without caring
about the true reason).
Right now we really have two separate ambulkddelete (!for_cleanup) and
amvacuumcleanup (for_cleanup) parallel operations -- not a single
parallel VACUUM operation. That may be part of the problem. It would
also be a good idea to have just one parallel VACUUM operation, and to
handle the transition from ambulkdelete to amvacuumcleanup using state
transitions. That's more ambitious than adding more state to shared
memory, but it would be nice if workers (or the leader acting like a
worker) could just consume "a unit of work" in any that's convenient
(as long as it's also in an order that's correct, based on state
transition rules like "go from ambulkddelete directly to
amvacuumcleanup, but only when we know that it's the last call to
ambulkddelete for the VACUUM operation as a whole").
> So, another idea (but
> not a comprehensive approach) to detect this issue would be that we
> check if the leader processed all indexes that must be processed by
> leader at the end of do_serial_processing_for_unsafe_indexes(). That
> is, we check if all of both indexes whose entries in
> will_parallel_vacuum[] are false and unsafe indexes have been
> processed by leader.
I don't think that that would help very much. It still spreads out
knowledge of what is supposed to happen, rather than concentrating
that knowledge into just one function.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2021-11-03 02:03:05 | Re: BUG #17245: Index corruption involving deduplicated entries |
Previous Message | Bill MacArthur | 2021-11-02 23:38:58 | plpgsql ON CONFLICT clause creates ambiguous col reference with returns table |