Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

From: Harry Ambrose <harry(dot)ambrose(at)gmail(dot)com>
To: adsj(at)novozymes(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100
Date: 2017-06-23 10:46:33
Message-ID: CAK4Knu_dxydaskUtvaH0Nk65psjYH3SnCretJf7TN0owAG-z5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Everyone,

Still trying to fathom this one. I have added quite a few log lines to a
copy of 9.4.12 and compiled it hoping to find the fault.

Below is from the log (at DEBUG5). Apologies for my name in the log lines,
it was the easiest way to grep them specifically I also apologise that its
a bit messy, i'm not a C dev.

This excerpt is without failure:

127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 DEBUG: HARRYAMBROSE - UPDATE - Old tuple is 0, new tuple is 0,
table is: 2345873096
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 DEBUG: HARRYAMBROSE - UPDATE - Old tuple is 0, new tuple is 0,
table is: 2345873096
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 DEBUG: HARRYAMBROSE - UPDATE - Old tuple is 0, new tuple is 0,
table is: 2345873096
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 DEBUG: HARRYAMBROSE - index_beginscan_internal scan: 57267920
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 DEBUG: HARRYAMBROSE - index_beginscan_internal scan: 57267920
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 DEBUG: HARRYAMBROSE - index_beginscan_internal scan: 57267920
127.0.0.1 2017-06-23 09:45:26.083 BST 4061 594ccaa6.fdd postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.084 BST 4061 594ccaa6.fdd postgres [unknown]
00000 DEBUG: CommitTransaction
127.0.0.1 2017-06-23 09:45:26.084 BST 4061 594ccaa6.fdd postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.084 BST 4061 594ccaa6.fdd postgres [unknown]
00000 DEBUG: name: unnamed; blockState: STARTED; state: INPROGR,
xid/subid/cid: 73603293/1/0 (used), nestlvl: 1, children:
127.0.0.1 2017-06-23 09:45:26.084 BST 4061 594ccaa6.fdd postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.084 BST 4061 594ccaa6.fdd postgres [unknown]
00000 DEBUG: StartTransaction
127.0.0.1 2017-06-23 09:45:26.084 BST 4061 594ccaa6.fdd postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.084 BST 4061 594ccaa6.fdd postgres [unknown]
00000 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR,
xid/subid/cid: 0/1/0, nestlvl: 1, children:
127.0.0.1 2017-06-23 09:45:26.084 BST 4061 594ccaa6.fdd postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab

This is with the failure, interesting that there are multiple flush's
nearby, could be total coincidence though?

127.0.0.1 2017-06-23 10:28:25.862 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
**.*.**.*** 2017-06-23 10:28:25.996 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E7DA8000 flush 2F81/E7D90000 apply
2F81/E7D8FBA0
**.*.**.*** 2017-06-23 10:28:25.996 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E7DA8000 flush 2F81/E7DA8000 apply
2F81/E7D8FBA0

sent 16492 bytes received 24607 bytes 82198.00 bytes/sec
total size is 16777216 speedup is 408.21
2017-06-23 10:28:26.014 BST 24752 594bdf95.60b0 00000 DEBUG: archived
transaction log file "0000000100002F81000000E5"
2017-06-23 10:28:26.018 BST 24752 594bdf95.60b0 00000 DEBUG:
executing archive command "rsync -e ssh -arv
/wal/pg_xlog/0000000100002F81000000E6 postgres@<hostname
removed>:/wal/pg_xlog"
127.0.0.1 2017-06-23 10:28:26.101 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - index_beginscan_internal scan: 630163208
127.0.0.1 2017-06-23 10:28:26.101 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
sending incremental file list

sent 69 bytes received 12 bytes 162.00 bytes/sec
total size is 16777216 speedup is 207126.12
2017-06-23 10:28:26.200 BST 24752 594bdf95.60b0 00000 DEBUG: archived
transaction log file "0000000100002F81000000E6"
2017-06-23 10:28:26.201 BST 24752 594bdf95.60b0 00000 DEBUG:
executing archive command "rsync -e ssh -arv
/wal/pg_xlog/0000000100002F81000000E7 postgres@<hostname
removed>:/wal/pg_xlog"
**.*.**.*** 2017-06-23 10:28:26.203 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E7DC8000 flush 2F81/E7DA8000 apply
2F81/E7DA7FC8
**.*.**.*** 2017-06-23 10:28:26.203 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E7DC8000 flush 2F81/E7DC8000 apply
2F81/E7DA7FC8
**.*.**.*** 2017-06-23 10:28:26.204 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E7DE8000 flush 2F81/E7DC8000 apply
2F81/E7DB1838
**.*.**.*** 2017-06-23 10:28:26.205 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E7DE8000 flush 2F81/E7DE8000 apply
2F81/E7DB4A60
**.*.**.*** 2017-06-23 10:28:26.206 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E7E08000 flush 2F81/E7DE8000 apply
2F81/E7DC4F08
**.*.**.*** 2017-06-23 10:28:26.207 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E7E08000 flush 2F81/E7E08000 apply
2F81/E7DC8178
127.0.0.1 2017-06-23 10:28:26.212 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - Calling toast_insert_or_update from
rewriteheap.c raw_heap_insert
127.0.0.1 2017-06-23 10:28:26.212 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.212 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - INSERT - new tuple is 8001, table is:
2345707504
127.0.0.1 2017-06-23 10:28:26.212 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - INSERT - new tuple is 43493696, table is:
2345707504
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - called toast_fetch_datum from
heap_tuple_fetch_attr
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - Has failed is 0 before loop
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - About to start loop grabbing chunks.
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - Calling systable_beginscan_ordered with
toastrel: 2345700592 and toastidx: 2345702000.
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - index_beginscan_internal scan: 630163208
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - Called systable_beginscan_ordered. Returned
data is: 630163352.
127.0.0.1 2017-06-23 10:28:26.213 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
**.*.**.*** 2017-06-23 10:28:26.225 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E8000000 flush 2F81/E7E08000 apply
2F81/E7E079C0
**.*.**.*** 2017-06-23 10:28:26.227 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E8000000 flush 2F81/E8000000 apply
2F81/E7E079C0
sending incremental file list

sent 69 bytes received 12 bytes 162.00 bytes/sec
total size is 16777216 speedup is 207126.12
2017-06-23 10:28:26.380 BST 24752 594bdf95.60b0 00000 DEBUG: archived
transaction log file "0000000100002F81000000E7"
2017-06-23 10:28:26.440 BST 24749 594bdf95.60ad 00000 DEBUG: snapshot
of 1+0 running transaction ids (lsn 2F81/E871D980 oldest xid 73603616
latest complete 73603732 next xid 73603733)
**.*.**.*** 2017-06-23 10:28:26.502 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E871C000 flush 2F81/E8000000 apply
2F81/E7FFFB70
**.*.**.*** 2017-06-23 10:28:26.509 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E871C000 flush 2F81/E871C000 apply
2F81/E7FFFB70
**.*.**.*** 2017-06-23 10:28:26.625 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E8722000 flush 2F81/E871C000 apply
2F81/E856BE20
**.*.**.*** 2017-06-23 10:28:26.625 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E8722000 flush 2F81/E8722000 apply
2F81/E856DFB8
127.0.0.1 2017-06-23 10:28:26.669 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: unexpected chunk number 2287 (expected 3) for toast value
192719354 in pg_toast_192719329
127.0.0.1 2017-06-23 10:28:26.669 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.669 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - Current chunk number is 2287, expecting 3.
127.0.0.1 2017-06-23 10:28:26.669 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.669 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 DEBUG: HARRYAMBROSE - Has failed is 1 after loop
127.0.0.1 2017-06-23 10:28:26.669 BST 1219 594cd5ba.4c3 postgres [unknown]
00000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 10:28:26.669 BST 1219 594cd5ba.4c3 postgres [unknown]
XX000 ERROR: unexpected chunk number exception, please see DEBUG.
127.0.0.1 2017-06-23 10:28:26.669 BST 1219 594cd5ba.4c3 postgres [unknown]
XX000 STATEMENT: VACUUM (FULL, ANALYZE) ctab
**.*.**.*** 2017-06-23 10:28:26.672 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E872C000 flush 2F81/E8722000 apply
2F81/E8722000
**.*.**.*** 2017-06-23 10:28:26.672 BST 24765 594bdf98.60bd postgres
walreceiver 00000 DEBUG: write 2F81/E872C000 flush 2F81/E872C000 apply
2F81/E8722000

It is also very interesting that the chunks are actually missing, not
returned in an incorrect order (0,1,2,2287). I have modified
toast_fetch_datum so that the exception occurs outside of the loop and to
print any chunks found after the error. Excerpt below:

int hasfailed;

hasfailed = 0;

(...)

while ((ttup = systable_getnext_ordered(toastscan, ForwardScanDirection))
!= NULL)

{

/*

* Have a chunk, extract the sequence number and the data

*/

residx = DatumGetInt32(fastgetattr(ttup, 2, toasttupDesc, &isnull));

Assert(!isnull);

chunk = DatumGetPointer(fastgetattr(ttup, 3, toasttupDesc, &isnull));

Assert(!isnull);

(...)

/*

* Some checks on the data we've found

*/

if (residx != nextidx)

{

/*

* Remove error status and convert to DEBUG5. Its important to see what
other values are returned. We should error outside the loop.

elog(ERROR, "unexpected chunk number %d (expected %d) for toast value %u in
%s",

residx, nextidx,

toast_pointer.va_valueid,

RelationGetRelationName(toastrel));

*/

elog(DEBUG5, "unexpected chunk number %d (expected %d) for toast value %u
in %s",

residx, nextidx,

toast_pointer.va_valueid,

RelationGetRelationName(toastrel));

hasfailed = 1;

}

(...)

/*

* Log the current chunk number and the expected chunk number

*/

if (hasfailed!=0)

{

elog(DEBUG5, "HARRYAMBROSE - Current chunk number is %d, expecting %d.",
residx, nextidx);

}

nextidx++;

} // End while loop

elog(DEBUG5, "HARRYAMBROSE - Has failed is %d after loop", hasfailed);

if (hasfailed!=0)

{

elog(ERROR, "unexpected chunk number exception, please see DEBUG.");

}

Happy to provide further info if required. I have tried to pick only the
interesting changes above ^.

Thanks again for your help!

Have a great weekend :)

Harry

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Daniel Verite 2017-06-23 11:16:19 Re: Suddenly - LOG: could not open file "postmaster.pid": No such file or directory
Previous Message Арсен Арутюнян 2017-06-23 10:34:07 BUG in Prepared transactions from C code using JSON