From: | Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PANIC in pg_commit_ts slru after crashes |
Date: | 2017-04-15 04:33:35 |
Message-ID: | CABOikdMdhS4nYX7xHaF+m=P=q_zAJBCYsZ++VN26AZzDRf_xFA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Apr 15, 2017 at 12:53 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>
> In the first statement executed after crash recovery, I sometimes get this
> error:
>
> PANIC: XX000: could not access status of transaction 207580505
> DETAIL: Could not read from file "pg_commit_ts/1EF0" at offset 131072:
> Success.
> LOCATION: SlruReportIOError, slru.c:918
> STATEMENT: create temporary table aldjf (x serial)
>
> Other examples:
>
> PANIC: XX000: could not access status of transaction 3483853232
> DETAIL: Could not read from file "pg_commit_ts/20742" at offset 237568:
> Success.
> LOCATION: SlruReportIOError, slru.c:918
> STATEMENT: create temporary table aldjf (x serial)
>
> PANIC: XX000: could not access status of transaction 802552883
> DETAIL: Could not read from file "pg_commit_ts/779E" at offset 114688:
> Success.
> LOCATION: SlruReportIOError, slru.c:918
> STATEMENT: create temporary table aldjf (x serial)
>
> Based on the errno, I'm assuming the read was successful but returned the
> wrong number of bytes (which was zero in the case I saw after changing the
> code to log short reads).
>
> It then goes through recovery again and the problem does not immediately
> re-occur if you attempt to connect again. I don't know why the file size
> would have changed between attempts.
>
>
> The problem bisects to the commit:
>
> commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
> Author: Simon Riggs <simon(at)2ndQuadrant(dot)com>
> Date: Tue Apr 4 15:56:56 2017 -0400
>
> Speedup 2PC recovery by skipping two phase state files in normal path
>
>
> It is not obvious to me how that is relevant. My test doesn't use
> prepared transactions (and leaves the guc at zero), and this commit doesn't
> touch the slru.c.
>
> I'm attaching the test harness. There is a patch which injects the
> crash-faults and also allows xid fast-forward, a perl script that runs
> until crash and assesses the consistency of the post-crash results, and a
> shell script which sets up the database and then calls the perl script in a
> loop. On 8 CPU machine, it takes about an hour for the PANIC to occur.
>
> The attached script bails out once it sees the PANIC (count.pl line 158)
> if it didn't do that then it will proceed to connect again and retry, and
> works fine. No apparent permanent data corruption.
>
> Any clues on how to investigate this further?
>
Since all those offsets fall on a page boundary, my guess is that we're
somehow failing to handle a new page correctly.
Looking at the patch itself, my feeling is that the following code
in src/backend/access/transam/twophase.c might be causing the problem.
1841
1842 /* update nextXid if needed */
1843 if (TransactionIdFollowsOrEquals(maxsubxid,
ShmemVariableCache->nextXid))
1844 {
1845 LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
1846 ShmemVariableCache->nextXid = maxsubxid;
1847 TransactionIdAdvance(ShmemVariableCache->nextXid);
1848 LWLockRelease(XidGenLock);
1849 }
The function PrescanPreparedTransactions() gets called at the start of the
redo recovery and this specific block will get exercised irrespective of
whether there are any prepared transactions or not. What I find
particularly wrong here is that we are initialising maxsubxid to current
value of ShmemVariableCache->nextXid when the function enters, but this
block would then again increment ShmemVariableCache->nextXid, when there
are no prepared transactions in the system.
I wonder if we should do as in attached patch.
It's not entirely clear to me why only CommitTS fails and not other slrus.
One possible reason could be that CommitTS is started before this function
gets called whereas CLOG and SUBTRANS get started after the function
returns.
Thanks,
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment | Content-Type | Size |
---|---|---|
fix_commit_ts_crash.patch | application/octet-stream | 579 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2017-04-15 04:42:41 | OK, so culicidae is *still* broken |
Previous Message | Tom Lane | 2017-04-15 04:07:33 | Re: Cutting initdb's runtime (Perl question embedded) |