Re: 8.4b2 tsearch2 strange error

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: 8.4b2 tsearch2 strange error
Date: 2009-06-04 22:10:03
Message-ID: 20255.1244153403@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Hmm ... I'll rev up my old 32-bit-Intel machine. I suspect that the
> missing link here is some state change in the database above and beyond
> just loading the data, but since we don't know exactly what that is,
> we'll have to resort to working with Tatsuo's bitwise dump.

I poked around in the dump for a while. I still can't reproduce the
failure from a standing start. It looks to me like Tatsuo's database
was possibly produced from separate schema and data load steps, followed
by some update operations. It would be nice to have a full script for
reproducing the state of the database.

I did find out some interesting stuff though. One thing that is
particularly striking is that the GIN index is pretty bloated.
If you load the data dump file as-is into an empty DB you get
an index size of 2564 pages (on a 32-bit machine). The actual size
of the index in Tatsuo's filesystem dump is 5459 pages! I tried to
see if I could duplicate that by doing separate schema and data load
(so the data is inserted into a pre-existing index). I got an index
size of 5356 pages, close to but still less than the actual size. This is
what makes me think there were some additional update operations
(well, that and the sequence counter being larger than the number of
rows in the table...)
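
To put those page counts in perspective, here is a rough conversion to bytes and to bloat ratios. It is only a sketch: it assumes the default 8192-byte block size (which is what the page header in the dump further down reports), and the names are made up for illustration.

```python
# Page counts quoted above; BLOCK_SIZE assumes the default 8 kB page.
BLOCK_SIZE = 8192

index_pages = {
    "dump loaded as-is": 2564,
    "separate schema and data load": 5356,
    "Tatsuo's filesystem dump": 5459,
}

# The freshly built index is the reference point for "not bloated".
baseline = index_pages["dump loaded as-is"]


def describe(pages):
    """Return (size in MiB, ratio to the freshly built index)."""
    return pages * BLOCK_SIZE / (1024 * 1024), pages / baseline


for label, pages in index_pages.items():
    mib, ratio = describe(pages)
    print(f"{label:32s} {pages:5d} pages  {mib:6.2f} MiB  {ratio:.2f}x")
```

Both the separately loaded index and the one in the filesystem dump come out at a bit more than twice the size of the freshly built one.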

The bogus TIDs are coming from index pages that are clearly corrupt.
For example, the "609" case comes from page 5439. I tweaked pg_filedump
to know something about printing out GIN leaf pages, and got this:

Block 5439 ********************************************************
<Header> -----
Block Offset: 0x02a7e000 Offsets: Lower 24 (0x0018)
Block: Size 8192 Version 4 Upper 8184 (0x1ff8)
LSN: logid 0 recoff 0x099cc544 Special 8184 (0x1ff8)
Items: 0 Free Space: 8160
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0000 ()
Length (including item array): 24

0000: 00000000 44c59c09 01000000 1800f81f ....D...........
0010: f81f0420 00000000 ... ....

GIN Data Section:
Right Bound: 0/0
Item 1: 1/3
Item 2: 64/5
Item 3: 67/4
Item 4: 90/3
Item 5: 100/5
Item 6: 106/3
Item 7: 106/5
Item 8: 114/4
Item 9: 189/5
Item 10: 204/6
Item 11: 300/7
Item 12: 302/5
Item 13: 302/6
Item 14: 309/5
Item 15: 355/2
Item 16: 355/4
Item 17: 407/3
Item 18: 472/4
Item 19: 480/6
Item 20: 483/1
Item 21: 486/1
Item 22: 486/3
Item 23: 499/5
Item 24: 560/6
Item 25: 584/6
Item 26: 588/3
Item 27: 589/3
Item 28: 660/6
Item 29: 667/4
Item 30: 718/2
Item 31: 719/4
Item 32: 738/4
Item 33: 760/6
Item 34: 763/2
Item 35: 764/2
Item 36: 784/6
Item 37: 844/7
Item 38: 912/4
Item 39: 913/3
Item 40: 913/5
Item 41: 916/5
Item 42: 930/5
Item 43: 945/1
Item 44: 945/4
Item 45: 945/5
Item 46: 973/7
Item 47: 994/5
Item 48: 1036/3
Item 49: 1046/6
Item 50: 1048/4
Item 51: 1048/6
Item 52: 1069/1
Item 53: 1192/4
Item 54: 1205/1
Item 55: 1205/3
Item 56: 1280/4
Item 57: 1317/7
Item 58: 1347/5
Item 59: 1363/1
Item 60: 1367/2
Item 61: 1389/5
Item 62: 1390/2
Item 63: 1393/2
Item 64: 1400/4
Item 65: 1417/5
Item 66: 1418/3
Item 67: 1513/3
Item 68: 1513/4
Item 69: 1513/5
Item 70: 1514/2
Item 71: 1614/4
Item 72: 1654/5
Item 73: 1666/4
Item 74: 1674/1
Item 75: 1690/6
Item 76: 1691/5
Item 77: 1697/5
Item 78: 1741/5
Item 79: 393216/609
Item 80: 327680/660
Item 81: 393216/819
Item 82: 327680/820

<Special Section> -----
GIN Index Section:
Flags: 0x00000003 (DATA|LEAF) Maxoff: 82
Blocks: RightLink (-1)

1ff8: ffffffff 52000300 ....R...

so the last four entries (according to the maxoff) are wrong.
Even more interesting, a binary dump of the page shows there
is data beyond the maxoff point:

Data array begins here, ItemPointerData apiece
0020: 00000100 03000000 40000500 00004300 ........@.....C.
0030: 04000000 5a000300 00006400 05000000 ....Z.....d.....
0040: 6a000300 00006a00 05000000 72000400 j.....j.....r...
0050: 0000bd00 05000000 cc000600 00002c01 ..............,.
0060: 07000000 2e010500 00002e01 06000000 ................
0070: 35010500 00006301 02000000 63010400 5.....c.....c...
0080: 00009701 03000000 d8010400 0000e001 ................
0090: 06000000 e3010100 0000e601 01000000 ................
00a0: e6010300 0000f301 05000000 30020600 ............0...
00b0: 00004802 06000000 4c020300 00004d02 ..H.....L.....M.
00c0: 03000000 94020600 00009b02 04000000 ................
00d0: ce020200 0000cf02 04000000 e2020400 ................
00e0: 0000f802 06000000 fb020200 0000fc02 ................
00f0: 02000000 10030600 00004c03 07000000 ..........L.....
0100: 90030400 00009103 03000000 91030500 ................
0110: 00009403 05000000 a2030500 0000b103 ................
0120: 01000000 b1030400 0000b103 05000000 ................
0130: cd030700 0000e203 05000000 0c040300 ................
0140: 00001604 06000000 18040400 00001804 ................
0150: 06000000 2d040100 0000a804 04000000 ....-...........
0160: b5040100 0000b504 03000000 00050400 ................
0170: 00002505 07000000 43050500 00005305 ..%.....C.....S.
0180: 01000000 57050200 00006d05 05000000 ....W.....m.....
0190: 6e050200 00007105 02000000 78050400 n.....q.....x...
01a0: 00008905 05000000 8a050300 0000e905 ................
01b0: 03000000 e9050400 0000e905 05000000 ................
01c0: ea050200 00004e06 04000000 76060500 ......N.....v...
01d0: 00008206 04000000 8a060100 00009a06 ................
01e0: 06000000 9b060500 0000a106 05000000 ................
01f0: cd060500
first problem slot (79):
               06000000 61020500 00009402 ........a.......
0200: 06000000 33030500 00003403
junk here (slots 83 and up):
                                 05000000 ....3.....4.....
0210: 34030600 00003303 05000000 34030000 4.....3.....4...
0220: 00000000 00000000 00000000 00000000 ................
0230: 00000000 00000000 00000000 00000000 ................
... rest of the page is zeroes, up to the special section

What I'm guessing is that either insertion or removal of some entry
or entries on the page was done wrong. Perhaps maxoff is bigger than
it should be. But it's also curious that the incorrect data looks
like it might be valid data that's been shifted by two bytes from
where it should be. Maybe some part of the code is manipulating the
entry array on the assumption that it's an array of PostingItems
instead of ItemPointers?
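
The two-byte-shift idea can be checked by hand against the hex dump above. The sketch below assumes the 6-byte ItemPointerData layout (bi_hi, bi_lo, offset, each a little-endian uint16, as on the 32-bit Intel machine) and that the data array starts at page offset 0x20, as the dump indicates. The helper names are made up for illustration and are not anything from the GIN code.

```python
import struct

DATA_START = 0x20   # first ItemPointerData on the page, per the dump
ITEM_SIZE = 6       # bi_hi, bi_lo, offset: three little-endian uint16s

# Bytes 0x01f0..0x021f of block 5439, copied from the hex dump.
CHUNK_BASE = 0x01F0
CHUNK = bytes.fromhex(
    "cd060500" "06000000" "61020500" "00009402"
    "06000000" "33030500" "00003403" "05000000"
    "34030600" "00003303" "05000000" "34030000"
)


def decode_tid(page_offset):
    """Decode one ItemPointerData at the given page offset as (block, offset)."""
    bi_hi, bi_lo, off = struct.unpack_from("<HHH", CHUNK, page_offset - CHUNK_BASE)
    return (bi_hi << 16) | bi_lo, off


def slot_offset(slot):
    """Page offset of the 1-based slot in the data array."""
    return DATA_START + ITEM_SIZE * (slot - 1)


for slot in range(79, 83):
    pos = slot_offset(slot)
    stored = decode_tid(pos)
    shifted = decode_tid(pos + 2)   # same bytes, read starting two bytes later
    print(f"slot {slot} @ 0x{pos:04x}: stored {stored[0]}/{stored[1]}, "
          f"shifted {shifted[0]}/{shifted[1]}")
```

Read as stored, slots 79 to 82 decode to the bogus TIDs in the listing (393216/609 and so on). Read two bytes later, the same bytes decode to 609/5, 660/6, 819/5 and 820/5, which look like ordinary heap TIDs.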

regards, tom lane
