Re: ERROR: cache lookup failed for relation 17442 (repost)

From: Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>
To: Michael Guerin <guerin(at)rentec(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: ERROR: cache lookup failed for relation 17442 (repost)
Date: 2005-02-07 22:13:36
Message-ID: 4207E810.8030504@cybertec.at
Lists: pgsql-bugs

Michael Guerin wrote:
> Hi All,
>
> I've been getting these errors ("ERROR: cache lookup failed for
> relation 17442") in my logs for a while now. It originally seemed
> like a hardware problem, but now we are getting them pretty consistently
> on a couple of servers. I've scaled down the schema to the one table and
> the function involved, and included a code snippet that makes a bunch of
> connections and loops around calling the same function. It usually
> takes 100-2000 iterations before these messages start appearing in the
> log. I've also included the original function; this takes 10,000
> iterations for the error to start showing. I should note that we've been
> getting these errors since version 7; this is the first time they have
> been reproducible.
>
> With the original function, the log messages were slightly different and
> usually caused the server to reset:
> i.e.
> ERROR: type "t" already exists
> ERROR: duplicate key violates unique constraint
> "pg_type_typname_nsp_index"
> ERROR: duplicate key violates unique constraint
> "pg_type_typname_nsp_index"
> ERROR: duplicate key violates unique constraint
> "pg_type_typname_nsp_index"
> CONTEXT: SQL statement "create temp table tmp_children ( uniqid bigint,
> memberid bigint, membertype varchar(50), ownerid smallint, tag
> varchar(50), level int4 )"
> PL/pgSQL function "fngetcompositeids2" line 14 at SQL statement
> ERROR: duplicate key violates unique constraint
> "pg_type_typname_nsp_index"
> ERROR: cache lookup failed for type 2449707570
> FATAL: cache lookup failed for type 2449707570
>
> Environment info: Postgres v8, SUSE Linux with latest kernel patches,
> filesystem: reiserfs.
>
> Please let me know if you need any more information. No data is needed,
> just the schema included.
>
> Thanks
> Michael
>

Michael,

The interesting thing about this bug is: We had the same thing on a
customer's machine some time ago. It actually occurred after a certain
script (nothing big) was run for roughly the 100,001st time on an empty
database. So this one does not seem to be related to the schema - it is
more or less random ...
The interesting thing is: We copied the data directory from the customer
and we were not able to reproduce the same behaviour on a different machine.
The strange thing is: After doing a checkpoint and restarting the
database the problem still occurred. Starting the same binaries on a
different machine did not show that error ...
We stepped through it with gdb but we could not find anything strange ...
Can you reliably reproduce the problem after an arbitrary number of
iterations on a different machine? We couldn't ...

Looking at the code: This is a null pointer caught by the system ...
Something seems to corrupt memory ...

Hans

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/660/816 40 77
www.cybertec.at, www.postgresql.at
