| From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
|---|---|
| To: | Bowen Shi <zxwsbg12138(at)gmail(dot)com> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
| Subject: | Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae |
| Date: | 2024-05-13 14:42:31 |
| Message-ID: | CAAKRu_bzcKmb1G4wY2AezsDbZ5QZubmrjrKgkPdbu_L6k6uUdQ@mail.gmail.com |
| Lists: | pgsql-bugs |
On Sun, May 12, 2024 at 11:19 PM Bowen Shi <zxwsbg12138(at)gmail(dot)com> wrote:
>
> Hi,
>>
>> Obviously we should actually fix this on back branches, but could we
>> at least make the retry loop interruptible in some way so people could
>> use pg_cancel/terminate_backend() on a stuck autovacuum worker or
>> vacuum process?
>
>
> If the problem happens in versions <= PG 16, we don't have a good solution (the vacuum process holds the exclusive lock, which causes the checkpoint to hang).
>
> Maybe we can make the retry loop interruptible first. However, since we are using START_CRIT_SECTION, we cannot simply use CHECK_FOR_INTERRUPTS to handle it.

As far as I can tell, in 14 and 15, the versions where the issue
reported here is present, the code that the retry loop in
lazy_scan_prune() repeats does not run inside a critical section, so
the critical-section concern should not apply there.

We can actually fix the particular issue I reproduced with the
attached patch. However, I think it is still worth making the retry
loop interruptible in case there are other ways to end up looping
forever in lazy_scan_prune().

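To illustrate what "interruptible" means for a retry loop like this, here is a minimal standalone sketch in C. It is not PostgreSQL code and it is not the attached patch. The names `interrupt_pending`, `crit_section_count`, `consume_interrupt()` and `retry_until_consistent()` are hypothetical stand-ins, loosely modeled on the idea that a pending cancel request is only acted on at a point where no critical section is open.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real PostgreSQL globals. */
static volatile sig_atomic_t interrupt_pending = 0; /* set when a cancel arrives */
static int crit_section_count = 0;                  /* > 0 while in a critical section */

static void start_crit_section(void) { crit_section_count++; }

static void end_crit_section(void)
{
    assert(crit_section_count > 0);
    crit_section_count--;
}

/*
 * Returns true (and clears the flag) only when an interrupt is pending
 * and no critical section is open. Inside a critical section the
 * request stays pending and is handled at the next safe point.
 */
static bool consume_interrupt(void)
{
    if (!interrupt_pending || crit_section_count > 0)
        return false;
    interrupt_pending = 0;
    return true;
}

/*
 * Simulated retry loop. The "page" only looks consistent once
 * settle_after attempts have been made. A cancel request is simulated
 * as arriving during attempt number cancel_at (0 means never).
 * Returns the number of attempts made, or -1 if the loop was cancelled.
 */
int retry_until_consistent(int settle_after, int cancel_at)
{
    int attempts = 0;

    for (;;)
    {
        /* Safe point at the top of each retry: nothing is half-done here. */
        if (consume_interrupt())
            return -1;

        attempts++;

        start_crit_section();
        if (attempts == cancel_at)
            interrupt_pending = 1;  /* cancel arrives mid-work; it is deferred */
        end_crit_section();

        if (attempts >= settle_after)
            return attempts;        /* state is consistent, stop retrying */
    }
}
```

With these assumptions, `retry_until_consistent(3, 0)` finishes normally after three attempts, while `retry_until_consistent(1000000, 5)` gives up at the top of the sixth iteration instead of spinning through the remaining retries. The check sits at the top of the loop so that a request arriving during the work step is never acted on halfway through it.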
- Melanie
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Fix-vacuum-hang.patch | text/x-patch | 4.7 KB |