Re: Vacuum Error

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: abartley(at)evolvosystems(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Vacuum Error
Date: 2004-01-12 23:20:23
Message-ID: 5931.1073949623@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Andrew Bartley" <abartley(at)evolvosystems(dot)com> writes:
> I believe this error usually comes about due to OID wrapping.

No; an OID collision would have occurred when you tried to create a
table. If two tables are present in pg_class then they have different
OIDs, and shouldn't have any conflicts in pg_statistic.

Since my last message I've thought of a possibly plausible explanation:
the bt_moveright bug that was isolated just a few months ago. If two
backends were analyzing some tables (not necessarily the same table) at
the same time, and one of them caused the first root-page split in
pg_statistic_relid_att_index, it would be possible for the other one to
miss spotting an existing row for the OID/attnum it wanted to insert.
I think this could lead to two valid entries for the same OID/attnum in
pg_statistic, and consequently a persistent error on every subsequent
attempt to analyze that table.

This seems moderately plausible because pg_statistic_relid_att_index
would fit in a single btree page up till about 300-400 entries, which
is about right for a moderate-sized database (I see 299 pg_statistic
entries in the current regression database, for example). First split
could easily happen in a database that had been running for awhile.

The relevant CVS log entry is

2003-07-29 18:18 tgl

* src/backend/access/nbtree/: nbtsearch.c (REL7_3_STABLE),
nbtsearch.c (REL7_2_STABLE), nbtsearch.c: Fix longstanding error in
_bt_search(): should moveright at top of loop not bottom.
Otherwise we fail to moveright when the root page was split while
we were "in flight" to it. This is not a significant problem when
the root is above the leaf level, but if the root was also a leaf
(ie, a single-page index just got split) we may return the wrong
leaf page to the caller, resulting in failure to find a key that is
in fact present. Bug has existed at least since 7.1, probably
forever.

(Note that although the patch was committed into 7.2 series, there has
been no 7.2 release since then. You could pull REL7_2_STABLE tip if you
wanted to build a 7.2-series server with this fix in place.)

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Stephan Szabo 2004-01-12 23:39:09 Re: Column qualifier issue
Previous Message Alex Satrapa 2004-01-12 23:15:47 Protection From Inference (was Re: Drawbacks of using BYTEA for PK?)