From: | daveg <daveg(at)sonic(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: error: could not find pg_class tuple for index 2662 |
Date: | 2011-08-04 19:52:07 |
Message-ID: | 20110804195206.GH14353@sonic.net |
Lists: | pgsql-hackers |
On Thu, Aug 04, 2011 at 12:28:31PM -0400, Tom Lane wrote:
> daveg <daveg(at)sonic(dot)net> writes:
> > Summary: the failing process reads 0 rows from 0 blocks from the OLD
> > relfilenode.
>
> Hmm. This seems to mean that we're somehow missing a relation mapping
> invalidation message, or perhaps not processing it soon enough during
> some complex set of invalidations. I did some testing with that in mind
> but couldn't reproduce the failure. It'd be awfully nice to get a look
> at the call stack when this happens for you ... what OS are you running?
To recap, a few observations:
When it happens, the victim has recently been waiting on a lock for
several seconds.
We create a lot of temp tables, hundreds of thousands a day.
There are catalog vacuum fulls and reindexes running on 30-odd other databases
at the same time. The script estimates the amount of bloat on each table and
index and chooses either reindex on specific indexes or vacuum full as needed.
This is a 32-core (64 with hyperthreading), 512GB host with several hundred
connections.
We are seeing "cannot read" and "cannot open" errors too, which would be
consistent with trying to use a vanished file.
-dg
--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.