Re: Dangling operator family after DROP TYPE

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Yoran Heling <contact(at)yorhel(dot)nl>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Dangling operator family after DROP TYPE
Date: 2024-12-06 17:15:56
Message-ID: 2830528.1733505356@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Yoran Heling <contact(at)yorhel(dot)nl> writes:
> pg_upgrade was failing on one of my databases and, while digging a bit,
> I found that the database in question contained a dangling operator
> family for a type that didn't exist anymore.

> I've attached a script to recreate this situation: creating a new type,
> defining an operator class for it and then dropping the type causes the
> implicitly created operator family to remain.

Thanks for the report. I don't think it's wrong for the now-empty
operator family to stick around: it has no direct dependency on the
dropped type. Also, trying to make it go away would cause problems
if another operator class for another type had been added to the
family meanwhile. However, these things are bad:

> Attempting to drop this operator family results in an error. Attempting
> to do a dump/restore results in a syntax error on restore.

The problem seems to be that the pg_amproc entry for the opclass'
function 4 doesn't get dropped. Examining the pre-drop pg_depend
entries for the opclass and opfamily, we find

# select objid, pg_describe_object(classid,objid,objsubid) as obj, pg_describe_object(refclassid,refobjid,refobjsubid) as ref, deptype from pg_depend where ...
objid | obj | ref | deptype
-------+---------------------------------------------------------------------------------------------+-----------------------------------------------------+---------
...
45105 | operator family t_btree_ops for access method btree | schema public | n
45107 | operator 1 (t, t) of operator family t_btree_ops for access method btree: <(t,t) | operator <(t,t) | n
45107 | operator 1 (t, t) of operator family t_btree_ops for access method btree: <(t,t) | operator class t_btree_ops for access method btree | i
45108 | operator 2 (t, t) of operator family t_btree_ops for access method btree: <=(t,t) | operator <=(t,t) | n
45108 | operator 2 (t, t) of operator family t_btree_ops for access method btree: <=(t,t) | operator class t_btree_ops for access method btree | i
45109 | operator 3 (t, t) of operator family t_btree_ops for access method btree: =(t,t) | operator =(t,t) | n
45109 | operator 3 (t, t) of operator family t_btree_ops for access method btree: =(t,t) | operator class t_btree_ops for access method btree | i
45110 | operator 4 (t, t) of operator family t_btree_ops for access method btree: >=(t,t) | operator >=(t,t) | n
45110 | operator 4 (t, t) of operator family t_btree_ops for access method btree: >=(t,t) | operator class t_btree_ops for access method btree | i
45111 | operator 5 (t, t) of operator family t_btree_ops for access method btree: >(t,t) | operator >(t,t) | n
45111 | operator 5 (t, t) of operator family t_btree_ops for access method btree: >(t,t) | operator class t_btree_ops for access method btree | i
45112 | function 1 (t, t) of operator family t_btree_ops for access method btree: t_cmp(t,t) | function t_cmp(t,t) | n
45112 | function 1 (t, t) of operator family t_btree_ops for access method btree: t_cmp(t,t) | operator class t_btree_ops for access method btree | i
45113 | function 4 (t, t) of operator family t_btree_ops for access method btree: btequalimage(oid) | operator family t_btree_ops for access method btree | a
45106 | operator class t_btree_ops for access method btree | schema public | n
45106 | operator class t_btree_ops for access method btree | operator family t_btree_ops for access method btree | a
45106 | operator class t_btree_ops for access method btree | type t | n
...

pg_amproc OID 45112 (function 1) has a dependency on t_cmp(t,t), which
of course depends in turn on type t. It also has a dependency on
the operator class, which also depends on t, so for sure it's going
away during "DROP TYPE t". But look at 45113 (function 4). It would
have a dependency on btequalimage(), but we don't record that because
btequalimage() is a pinned built-in function. Its other dependency
is on the operator family not the operator class. This seems like the
wrong thing. It's intentional according to the code: in nbtvalidate.c
we have

if (op->is_func && op->number != BTORDER_PROC)
{
/* Optional support proc, so always a soft family dependency */
op->ref_is_hard = false;
op->ref_is_family = true;
op->refobjid = opfamilyoid;
}

But I think we copied that pattern from other index AMs without
thinking too hard about it. In AMs like GiST, the argument is

* Operator members of a GiST opfamily should never have hard
* dependencies, since their connection to the opfamily depends only on
* what the support functions think, and that can be altered. For
* consistency, we make all soft dependencies point to the opfamily,
* though a soft dependency on the opclass would work as well in the
* CREATE OPERATOR CLASS case.

It seems like maybe btree should be using soft dependencies on the
opclass for optional support functions? Not quite sure about that.
There were a lot of moving parts in these choices IIRC.

Now the big reason that the leftover pg_amproc entry causes problems
is that its amproclefttype/amprocrighttype entries are still
referencing the deleted type "t". That wouldn't really stop us from
deleting the opfamily, except that during DROP CASCADE we try to print
descriptions of all the dropped objects, and getObjectDescription
calls format_type_extended which fails. The leftover entry also
causes issues for pg_dump, which will emit something like

CREATE OPERATOR FAMILY public.t_btree_ops USING btree;
ALTER OPERATOR FAMILY public.t_btree_ops USING btree ADD
FUNCTION 4 (45086, 45086) btequalimage(oid);

which of course doesn't parse during restore.

So we really need to fix things so that the pg_amproc entry goes away.
Switching its dependency to be on the opclass would do. A different
approach that might solve more problems is to be careful to record
a dependency from a pg_amproc entry to the type(s) mentioned in it.
This would be redundant in the case where the referenced function
has those types as input, but we can't really assume that for
support functions.

At least in the back branches, I'm inclined to also fix
getObjectDescription to use FORMAT_TYPE_ALLOW_INVALID when
printing the types of a pg_amproc entry, so that you're not
quite so thoroughly hosed if you already have this situation
in your catalog.

Peter, any thoughts about this?

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Peter Geoghegan 2024-12-06 18:36:27 Re: Dangling operator family after DROP TYPE
Previous Message Yoran Heling 2024-12-06 15:15:20 Dangling operator family after DROP TYPE