| From: | ChengWen Wu <drec(dot)wu(at)foxmail(dot)com> | 
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> | 
| Cc: | pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | Re: Fix orphaned 2pc file which may casue instance restart failed | 
| Date: | 2024-10-09 07:51:59 | 
| Message-ID: | tencent_CA843A8385CB3130B9ABC1E55023FC4E4D05@qq.com | 
| Lists: | pgsql-hackers | 
Hi Michael,
Is there any progress on this problem? I can provide more detailed information if you need it.
Best wishes,
Chengwen Wu
------------------ Original ------------------
From: "Michael Paquier" <michael(at)paquier(dot)xyz>;
Date: Wed, Sep 11, 2024 05:21 PM
To: "清浅"<drec(dot)wu(at)foxmail(dot)com>;
Cc: "pgsql-hackers"<pgsql-hackers(at)lists(dot)postgresql(dot)org>;
Subject: Re: Fix orphaned 2pc file which may casue instance restart failed
On Sun, Sep 08, 2024 at 01:01:37PM +0800, 清浅 wrote:
> Hi all, I found a race condition
> between two global transactions, which may cause an instance restart
> to fail with the error 'could not access status of transaction
> xxx","Could not open file ""pg_xact/xxx"": No such file or
> directory'.
> 
> 
>   The scenario to reproduce the problem is:
>     1. gxact1 is running `FinishPreparedTransaction` when a checkpoint
>       is issued, so a 2pc file is generated for gxact1.
>     2. gxact1 is then removed from `TwoPhaseState->prepXacts` and
>       its state memory is returned to the freelist.
>     3. Just before gxact1 removes its 2pc file, gxact2 is issued;
>       gxact2 reuses the same state memory as gxact1 and
>       resets `gxact->ondisk` to false.
>     4. gxact1 continues, finds that `gxact->ondisk` is false, and so does not
>       remove its 2pc file. This file is orphaned.
> 
>   If gxact1's local xid is not frozen, the startup process will remove
> the orphaned 2pc file. However, if the xid's corresponding clog file has
> been truncated by `vacuum`, the startup process will raise the error 'could not
> access status of transaction xxx', because it cannot find the
> transaction's status file in the `pg_xact` directory.
Hmm.  I've not seen that in the field.  Let me check that..
--
Michael
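
For readers following the thread, the interleaving in the four reproduction steps can be modelled with a small deterministic simulation. This is only a sketch, not PostgreSQL code: `GXactSlot`, `TwoPhaseSim`, and the xids 100 and 101 are hypothetical stand-ins for the shared-memory state entry, `TwoPhaseState`, and the two global transactions. The `safe_finish` variant shows one possible mitigation (copying the `ondisk` flag before the slot is released); it is an assumption for illustration and not necessarily the fix adopted upstream.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class GXactSlot:
    """Hypothetical stand-in for one shared-memory 2PC state entry."""
    xid: Optional[int] = None
    ondisk: bool = False


class TwoPhaseSim:
    """Toy model: a freelist of slots, active entries, and on-disk 2PC files."""

    def __init__(self, nslots: int = 1) -> None:
        self.freelist: List[GXactSlot] = [GXactSlot() for _ in range(nslots)]
        self.prep_xacts: List[GXactSlot] = []
        self.files: Set[int] = set()  # xids that currently have a 2PC file

    def prepare(self, xid: int) -> GXactSlot:
        # Reuse a slot from the freelist and reset its state.
        slot = self.freelist.pop()
        slot.xid = xid
        slot.ondisk = False
        self.prep_xacts.append(slot)
        return slot

    def checkpoint(self) -> None:
        # A checkpoint writes a 2PC file for every entry not yet on disk.
        for slot in self.prep_xacts:
            if not slot.ondisk:
                self.files.add(slot.xid)
                slot.ondisk = True

    def release_slot(self, slot: GXactSlot) -> None:
        # Remove the entry from the active list and return it to the freelist.
        self.prep_xacts.remove(slot)
        self.freelist.append(slot)


def racy_finish() -> Set[int]:
    sim = TwoPhaseSim()
    g1 = sim.prepare(100)
    sim.checkpoint()        # step 1: a 2PC file is written for xid 100
    sim.release_slot(g1)    # step 2: slot goes back to the freelist
    sim.prepare(101)        # step 3: slot is reused, ondisk reset to False
    if g1.ondisk:           # step 4: reads the flag through the reused slot
        sim.files.discard(100)
    return sim.files        # xid 100's file is left behind


def safe_finish() -> Set[int]:
    sim = TwoPhaseSim()
    g1 = sim.prepare(100)
    sim.checkpoint()
    ondisk = g1.ondisk      # copy the flag before the slot can be reused
    sim.release_slot(g1)
    sim.prepare(101)
    if ondisk:
        sim.files.discard(100)
    return sim.files
```

In this model, `racy_finish()` leaves the file for xid 100 in `files`, which corresponds to the orphaned 2pc file described above, while `safe_finish()` removes it because the decision no longer depends on memory that another transaction may have reinitialized.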