From: | Jeff Amiel <becauseimjeff(at)yahoo(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | 3rd time is a charm.....right sibling is not next child crash. |
Date: | 2010-06-08 13:26:25 |
Message-ID: | 920117.63233.qm@web65511.mail.ac4.yahoo.com |
Lists: | pgsql-general pgsql-hackers |
Not looking for help...just putting some data out there.
Two previous crashes were caused by corrupt Slony indexes:
http://archives.postgresql.org/pgsql-general/2010-02/msg00022.php
http://archives.postgresql.org/pgsql-general/2009-12/msg01172.php
New one yesterday.
Jun 7 15:05:01 db-1 postgres[9334]: [ID 748848 local0.crit] [3989781-1] 2010-06-07 15:05:01.087 CDT 9334PANIC: right sibling 169 of block 168 is not next child of 249 in index "sl_seqlog_idx"
We are on the eve of moving off our SAN to direct-attached storage, and upgrading Postgres and Slony in the process, this weekend. Since everything is changing, any concern that the cause is hardware, a driver, or even Postgres/Slony itself should be put to rest.
That being said, I find it odd that each time this has happened, the corrupt index has been a Slony index. I can't imagine a bug in Slony corrupting Postgres indexes, and I can't imagine a bug in Postgres corrupting only Slony indexes, so I don't really know what to think. Just putting this out there in case anyone has similar issues or can use this data in some meaningful way.
Stack trace looks similar to last time.
Program terminated with signal 6, Aborted.
#0 0xfecba227 in _lwp_kill () from /lib/libc.so.1
(gdb) bt
#0 0xfecba227 in _lwp_kill () from /lib/libc.so.1
#1 0xfecb598f in thr_kill () from /lib/libc.so.1
#2 0xfec61ed3 in raise () from /lib/libc.so.1
#3 0xfec41d0d in abort () from /lib/libc.so.1
#4 0x0821b8a6 in errfinish (dummy=0) at elog.c:471
#5 0x0821c74b in elog_finish (elevel=22, fmt=0x82b7780 "right sibling %u of block %u is not next child of %u in index \"%s\"") at elog.c:964
#6 0x0809e1a0 in _bt_pagedel (rel=0x867bcd8, buf=139905, stack=0x86b3768, vacuum_full=0 '\0') at nbtpage.c:1141
#7 0x0809f835 in btvacuumscan (info=0x8043f70, stats=0x86b5c30, callback=0, callback_state=0x0, cycleid=29488) at nbtree.c:936
#8 0x0809fc65 in btbulkdelete (fcinfo=0x0) at nbtree.c:547
#9 0x0821f424 in FunctionCall4 (flinfo=0x0, arg1=0, arg2=0, arg3=0, arg4=0) at fmgr.c:1215
#10 0x0809a89f in index_bulk_delete (info=0x8043f70, stats=0x0, callback=0x812ffc8 <lazy_tid_reaped>, callback_state=0x86b5818) at indexam.c:573
#11 0x0812ff54 in lazy_vacuum_index (indrel=0x867bcd8, stats=0x86b5b70, vacrelstats=0x86b5818) at vacuumlazy.c:660
#12 0x0813055a in lazy_vacuum_rel (onerel=0x867b7f8, vacstmt=0x86659b8) at vacuumlazy.c:487
#13 0x0812e910 in vacuum_rel (relid=140925368, vacstmt=0x86659b8, expected_relkind=114 'r') at vacuum.c:1107
#14 0x0812f95a in vacuum (vacstmt=0x86659b8, relids=0x8665bc0) at vacuum.c:400
#15 0x08186e16 in AutoVacMain (argc=0, argv=0x0) at autovacuum.c:914
#16 0x08187278 in autovac_start () at autovacuum.c:178
#17 0x0818bfed in ServerLoop () at postmaster.c:1252
#18 0x0818d16d in PostmasterMain (argc=3, argv=0x833adc8) at postmaster.c:966
#19 0x08152cce in main (argc=3, argv=0x833adc8) at main.c:188
(gdb)
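For anyone reading the trace without the source handy: the PANIC in frame #6 (_bt_pagedel) fires when the downlink that follows the target page in its parent does not point at the target's right sibling. Below is a toy sketch of that consistency check in Python. It is not PostgreSQL code. The function name, the exception class, the list-of-block-numbers model of the parent page, and the neighbouring block numbers 167 and 170 are all assumptions made for illustration; only 168, 169, 249 and the index name come from the log line above.

```python
# Toy model only: a parent page is represented as a plain list of child
# block numbers (its downlinks) in key order. Real B-tree pages hold index
# tuples, and the real check lives in nbtpage.c.

class IndexCorruption(Exception):
    """Stands in for elog(PANIC, ...) in this sketch."""


def check_right_sibling(parent_block, parent_downlinks, target,
                        right_sibling, index_name):
    """Raise if the downlink after `target` in the parent is not `right_sibling`."""
    # Locate the target's downlink in the parent (ValueError if absent).
    pos = parent_downlinks.index(target)
    if pos + 1 >= len(parent_downlinks):
        # Simplification: the target is the parent's last child, so its
        # right sibling hangs off a different parent. Not modelled here.
        return
    next_child = parent_downlinks[pos + 1]
    if next_child != right_sibling:
        raise IndexCorruption(
            'right sibling %d of block %d is not next child of %d in index "%s"'
            % (right_sibling, target, parent_block, index_name)
        )


# Consistent case: parent 249 lists 168 followed by 169. No error.
check_right_sibling(249, [167, 168, 169], 168, 169, "sl_seqlog_idx")

# Inconsistent case (made-up data): parent says 170 follows 168, but the
# leaf-level sibling link of 168 says 169.
try:
    check_right_sibling(249, [167, 168, 170], 168, 169, "sl_seqlog_idx")
except IndexCorruption as exc:
    print(exc)
```

In other words, the message means the parent page and the leaf-level sibling links disagree about which page comes after block 168.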