RE: Is it expected behavior index only scan shows "OPERATOR(pg_catalog." for EXPLAIN?

From: <Masahiro(dot)Ikeda(at)nttdata(dot)com>
To: <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: <pgsql-hackers(at)lists(dot)postgresql(dot)org>, <Masao(dot)Fujii(at)nttdata(dot)com>
Subject: RE: Is it expected behavior index only scan shows "OPERATOR(pg_catalog." for EXPLAIN?
Date: 2024-07-09 05:19:49
Message-ID: TYWPR01MB109820B2D948278C167C74A75B1DB2@TYWPR01MB10982.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for your comments.

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
> > I honestly don't know if this is the correct solution. It seems to me
> > handling this at the EXPLAIN level might just mask the issue - it's
> > not clear to me why adding "indexqualorig" would remove the ambiguity
> > (if there's one). Perhaps it might be better to find why the
> > ruleutils.c code thinks the OPERATOR() is necessary, and then improve/fix that?
>
> As Masahiro-san already said, the root cause is that the planner removes the
> RelabelType that is on the indexed variable. So ruleutils sees that the operator is not
> the one that would be chosen by the parser given those input expressions (cf
> generate_operator_name) and it concludes it'd better schema-qualify the operator.
> While that doesn't really make any difference in this particular case, it is the right thing
> in typical rule-deparsing cases.

Yes.

The plan of index scan has a "RELABELTYPE" node, and it has "resulttype" so that
generate_operator_name() gets "operid =1054("="), arg1=1042(bpchar from resulttype),
arg2=1042(bpchar)". The plan of index only scan is removed it so that
generate_operator_name() gets "operid =1054("="), arg1=25(*text*), arg2=1042(bpchar)".

Though there is no entry "operid=1054("="), arg1=25(*text*), arg2=1042(bpchar)" in
pg_operator, it decided to check with the operator "=" for (text, text) because it is
coercion-compatible and preferred than operator "=" for (bpchar, bpchar).

But since the caller expected to use operator "=" for (bpchar, bpchar), it plus
OPERATOR() decoration sadly.

# the partial output of Index Scan plan
:indexqualorig (
{OPEXPR
:opno 1054 -- "=" operator
:opfuncid 1048
:opresulttype 16
:opretset false
:opcollid 0
:inputcollid 950
:args (
{RELABELTYPE
:arg
{VAR
:varno 1
:varattno 1
:vartype 25 -- text. But it will be evaluated as 1042(bpchar) from resulttype
:vartypmod -1
:varcollid 950
:varnullingrels (b)
:varlevelsup 0
:varnosyn 1
:varattnosyn 1
:location 33
}
:resulttype 1042 -- bpchar
:resulttypmod -1
:resultcollid 950
:relabelformat 1
:location 35
}
{CONST
:consttype 1042 -- bpchar
:consttypmod -1
:constcollid 100
:constlen -1
:constbyval false
:constisnull false
:location 46
:constvalue 7 [ 28 0 0 0 102 111 111 ]
}
)

# the partial output of Index Only Scan plan
:indexqual (
{OPEXPR
:opno 1054 -- "=" operator
:opfuncid 1048
:opresulttype 16
:opretset false
:opcollid 0
:inputcollid 950
:args (
{VAR
:varno -3
:varattno 1
:vartype 25 -- text
:vartypmod -1
:varcollid 950
:varnullingrels (b)
:varlevelsup 0
:varnosyn 1
:varattnosyn 1
:location 33
}
{CONST
:consttype 1042 -- bpchar
:consttypmod -1
:constcollid 100
:constlen -1
:constbyval false
:constisnull false
:location 46
:constvalue 7 [ 28 0 0 0 102 111 111 ]
}
)
:location 44
}
)

> I don't think this output is really wrong, and I'm not in favor of adding nontrivial overhead
> to make it better, so I don't like the proposed patch. Maybe generate_operator_name
> could use some other heuristic, but I'm unsure what. The case it wants to cover is that
> the operator is in the search path but is masked by some operator in an earlier schema,
> so simply testing if the operator's schema is in the path at all would be insufficient.

Yes, that's one of my concerns.

IIUC, the above case is not related to the search path but the arguments don't match. If so,
I think there are two ways to solve the issue.

1. make callers of generate_operator_name() check the arguments first.
The callers (e.g., get_oper_expr()) of generate_operator_name() decide whether they
should/can cast the arguments. The planner already decided what operator should be used
so that get_oper_expr() can cast always on the explain context if the arguments don't match
the operator's one, doesn't it?

2. make generate_operator_name() checks all operator candidate.
Currently generate_operator_name() checks only one operator which matches the operator's name
even if although there are other candidates (e.g., to call oper()->oper_select_candidate()).
func_select_candidate() selects operator "=" for (text, text) in the above case because "=" for
(btchar, btchar) is typispreferred=false. So, it seems to me that it can solve the issue
if generate_operator_name() checks all candidates.

I think the second solution is better because callers might expect to use specified operator
even if there are other candidates' operator which can handle the arguments.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2024-07-09 05:23:03 Re: Pluggable cumulative statistics
Previous Message Michael Paquier 2024-07-09 05:16:13 Re: Injection point locking