From: | Sean Chittenden <sean(at)chittenden(dot)org> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | WAL partition ran out of space, pg_subtrans & pg_clog partially written... |
Date: | 2013-05-15 20:54:52 |
Message-ID: | CC85515E-4165-4C68-BF8D-487D702ECF1F@chittenden.org |
Lists: | pgsql-bugs |
This is an FYI re: a bug I ran across.
Background: RHEL 6 & ext4. PGDATA, a tablespace, and the WAL are each on their own partition.
The WAL partition filled up (wal_keep_segments had been changed, but Pg hadn't been restarted), a write happened, and it appears to have left an index page corrupt. REINDEXing the table failed with the following error:
> WARNING: concurrent delete in progress within table "tblA"
> WARNING: concurrent delete in progress within table "tblA"
> ERROR: could not access status of transaction 86081816
> DETAIL: Could not read from file "pg_subtrans/0521" at offset 131072: Success.
pg_subtrans/0521 was a 90112-byte file and pg_clog/0052 was 24576 bytes in size.
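For reference, the file name and offset in the error above can be reproduced from the transaction ID. This is a minimal sketch in Python, assuming the default 8 kB block size, 32 pages per SLRU segment, 4 bytes per transaction in pg_subtrans, and 2 bits per transaction in pg_clog. The function name `slru_location` is made up for illustration and is not a PostgreSQL API.

```python
# Assumed defaults; a build with a non-default block size would differ.
BLCKSZ = 8192
SLRU_PAGES_PER_SEGMENT = 32

# pg_subtrans stores one 4-byte parent XID per transaction.
SUBTRANS_XACTS_PER_PAGE = BLCKSZ // 4
# pg_clog stores 2 status bits per transaction, i.e. 4 transactions per byte.
CLOG_XACTS_PER_PAGE = BLCKSZ * 4


def slru_location(xid, xacts_per_page):
    """Return (segment file name, byte offset of the page) for a transaction ID."""
    page = xid // xacts_per_page
    segment = page // SLRU_PAGES_PER_SEGMENT
    offset = (page % SLRU_PAGES_PER_SEGMENT) * BLCKSZ
    return "%04X" % segment, offset


xid = 86081816  # from the ERROR line above
print(slru_location(xid, SUBTRANS_XACTS_PER_PAGE))  # ('0521', 131072)
print(slru_location(xid, CLOG_XACTS_PER_PAGE))      # ('0052', 24576)
```

Both results line up with the report: the failing read is at offset 131072 of pg_subtrans/0521, which is past the end of the 90112-byte file, and the pg_clog page for the same transaction would start at offset 24576 of pg_clog/0052, exactly where that file ends.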
I haven't dug into the code, but I did see the commit message for pgsql/src/backend/access/transam/slru.c 1.23.4.2, which I'm guessing is related. My WAG is that there's an assumption someplace that pg_subtrans and pg_clog are on the same partition as pg_xlog, or that creation of files in pg_subtrans and pg_clog will either absolutely succeed or absolutely fail. It could also be that Linux reported a successful write(2) without actually having the space available (ext4).
Anyway, after extending pg_subtrans/0521 with zeros to its proper 256 KB size, I was able to REINDEX the table, but there was a stream of WARNINGs about "concurrent inserts and deletes" that I didn't dig into. Upon learning that the WAL files had been removed as a temporary solution to the space problem, I opted to dump, re-initdb, and load the data, which worked without any errors or warnings being reported.
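The zero-extension step could be done along these lines. This is only a sketch, not necessarily the method used here: it assumes a 256 KB segment (32 pages of 8 kB), a stopped server, and a backup copy of the file, and the helper name `pad_segment_with_zeros` is hypothetical.

```python
import os

# Assumed full segment size: 32 pages of 8192 bytes = 262144 bytes (256 KB).
SEGMENT_SIZE = 32 * 8192


def pad_segment_with_zeros(path, target_size=SEGMENT_SIZE):
    """Append zero bytes until the file is target_size long.

    Returns the number of bytes appended (0 if the file was already
    at least target_size bytes).
    """
    current = os.path.getsize(path)
    if current >= target_size:
        return 0
    missing = target_size - current
    with open(path, "ab") as f:
        f.write(b"\0" * missing)
        f.flush()
        os.fsync(f.fileno())  # make sure the padding reaches disk
    return missing
```

For the 90112-byte pg_subtrans/0521 described above, a call such as `pad_segment_with_zeros("pg_subtrans/0521")` would append 172032 zero bytes. Zero-filled pages only make the file readable again; they say nothing about whether the recorded transaction state is correct, which is consistent with the warnings that followed.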
I've saved the data if there are pointed questions about their contents.
-sc
--
Sean Chittenden
sean(at)chittenden(dot)org