From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com>, Bowen Shi <zxwsbg12138(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae |
Date: | 2024-06-21 00:33:59 |
Message-ID: | CAAKRu_Z50WSPWLYg-2NC4TDBSyTLMRL_jG=K+txByTAeu5nNXA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Thu, Jun 20, 2024 at 11:49 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Tue, Jun 18, 2024 at 6:51 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> >
> > Finally, upthread there is discussion of how we could end up doing a
> > catalog lookup after vacuum_get_cutoffs() and before the tuple
> > visibility check on 16. Assuming this is true, we would want to
> > backport the fix to 16 as well. I could use some help getting a repro
> > (using btree index deletion for example) of the infinite loop on 16.
>
> So, I ended up working on a new repro that works by forcing a round of
> index vacuuming after the standby reconnects and before pruning a dead
> tuple whose xmax is older than OldestXmin.
>
> At the end of the round of index vacuuming, _bt_pendingfsm_finalize()
> calls GetOldestNonRemovableTransactionId(), thereby updating the
> backend's GlobalVisState and moving maybe_needed backwards.
>
> Then vacuum's first pass will continue with pruning and find our later
> inserted and updated tuple HEAPTUPLE_RECENTLY_DEAD when compared to
> maybe_needed but HEAPTUPLE_DEAD when compared to OldestXmin.
>
> I make sure that the standby reconnects between vacuum_get_cutoffs()
> (vacuum_set_xid_limits() on 14/15) and pruning because I have a cursor
> on the page keeping VACUUM FREEZE from getting a cleanup lock.
>
> See the repros for step-by-step explanations of how it works.
>
> With this, I can repro the infinite loop on 14-16.
>
> Backporting 1ccc1e05ae fixes 16 but, with the new repro, 14 and 15
> error out with "cannot freeze committed xmax". I'm going to
> investigate further why this is happening. It definitely makes me
> wonder about the fix.
It turns out it was also erroring out on 16 (i.e. backporting
1ccc1e05ae did not fix anything), but I didn't notice it because the
perl TAP test passed. I also discovered we can hit this error in
master, so I started a thread about that here [1].
- Melanie
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2024-06-21 01:54:09 | Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)") |
Previous Message | Tom Lane | 2024-06-20 16:43:36 | Re: BUG #18517: Dropping a table referenced by an initially deferred foreign key fails with an error |