Re: catalog corruption bug

From: Jeremy Drake <pgsql(at)jdrake(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: catalog corruption bug
Date: 2006-01-07 19:27:17
Message-ID: Pine.LNX.4.63.0601071106090.15097@garibaldi.apptechsys.com
Lists: pgsql-hackers

On Sat, 7 Jan 2006, Tom Lane wrote:

> Jeremy Drake <pgsql(at)jdrake(dot)com> writes:
> > Am I correct in interpreting this as the hash opclass for Oid?
>
> However, AFAICS the only consequence of this bug is to trigger
> that Assert failure if you've got Asserts enabled. Dead catcache
> entries aren't actually harmful except for wasting some space.
> So I don't think this is related to your pg_type duplicate key
> problem.
>
> One weak spot in this theory is the assumption that somebody was
> vacuuming pg_amop. It seems unlikely that autovacuum would do so
> since the table never changes (unless you had reached the point
> where an anti-XID-wraparound vacuum was needed, which is unlikely
> in itself). Do you have any background processes that do full-database
> VACUUMs?

No. Just the autovacuum, which is actually the process which had the
assert failure.

This appears to give the current xid:
(gdb) p *s
$10 = {
  transactionId = 13568516,
  subTransactionId = 1,
  name = 0x0,
  savepointLevel = 0,
  state = TRANS_COMMIT,
  blockState = TBLOCK_STARTED,
  nestingLevel = 1,
  curTransactionContext = 0x9529c0,
  curTransactionOwner = 0x92eb40,
  childXids = 0x0,
  currentUser = 0,
  prevXactReadOnly = 0 '\0',
  parent = 0x0
}

>
> I'll go fix CatCacheRemoveCList, but I think this is not the bug
> we're looking for.

Incidentally, one of my processes did get that error at the same time.
All of the other processes had an error:
  DBD::Pg::st execute failed: server closed the connection unexpectedly
  This probably means the server terminated abnormally
  before or while processing the request.

But this one had the DBD::Pg::st execute failed: ERROR: duplicate key
violates unique constraint "pg_type_typname_nsp_index"

It looks like my kernel did not have the option to append the pid to core
files, so perhaps they both croaked at the same time but only this one got
to write a core file?

I will enable this and try again, see if I can't get it to make 2 cores.

BTW, nothing of any interest made it into the backend log regarding what
assert(s) failed.
