Why is FOR ORDER BY function getting called when the index is handling ordering?

From: Chris Cleveland <ccleveland(at)dieselpoint(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Why is FOR ORDER BY function getting called when the index is handling ordering?
Date: 2024-05-02 16:49:53
Message-ID: CABSN6Vc0VXr+1w+dn0iUJ--LgGUwM6XwBxcXN+vF15fMsYE9cg@mail.gmail.com
Lists: pgsql-hackers

Sorry to have to ask for help here, but no amount of stepping through code
is giving me the answer.

I'm building an index access method which supports an ordering operator:

CREATE OPERATOR pg_catalog.<<=>> (
    FUNCTION = rdb.rank_match,
    LEFTARG = record,
    RIGHTARG = rdb.RankSpec
);

CREATE OPERATOR CLASS rdb_ops DEFAULT FOR TYPE record USING rdb AS
    OPERATOR 1 pg_catalog.<===> (record, rdb.userqueryspec),
    OPERATOR 2 pg_catalog.<<=>> (record, rdb.rankspec) FOR ORDER BY pg_catalog.float_ops;

Here is the supporting code (in Rust):

#[derive(Serialize, Deserialize, PostgresType, Debug)]
pub struct RankSpec {
    pub fastrank: String,
    pub slowrank: String,
    pub candidates: i32,
}

#[pg_extern(
    sql = "CREATE FUNCTION rdb.rank_match(rec record, rankspec rdb.RankSpec) \
        RETURNS real IMMUTABLE STRICT PARALLEL SAFE LANGUAGE c \
        AS 'MODULE_PATHNAME', 'rank_match_wrapper';",
    requires = [RankSpec],
    no_guard
)]
fn rank_match(_fcinfo: pg_sys::FunctionCallInfo) -> f32 {
    // todo make this an error
    pgrx::log!("Error -- this ranking method can only be called when there is an RDB full-text search in the WHERE clause.");
    42.0
}

#[pg_extern(immutable, strict, parallel_safe)]
fn rank(fastrank: String, slowrank: String, candidates: i32) -> RankSpec {
    RankSpec {
        fastrank,
        slowrank,
        candidates,
    }
}

The index access method works fine. It successfully gets the keys and the
orderbys in the amrescan() method, and it dutifully returns the appropriate
scan.xs_heaptid in the amgettuple() method. It returns the tids in the
proper order. It also sets:

scandesc.xs_recheck = false;
scandesc.xs_recheckorderby = false;

so there's no reason the system needs to know the actual float value
returned by rank_match(), the ordering operator distance function. In any
case, that value can only be calculated based on information in the index
itself, and can't be calculated by rank_match().
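
For illustration, the scan behaviour described above can be modelled outside
PostgreSQL. The following is only a minimal sketch, not the actual access
method: Tid, MockScanDesc and RankedScan are hypothetical stand-ins (the real
IndexScanDescData and ItemPointerData come from pgrx::pg_sys and carry many
more fields). It shows a gettuple-style function handing back TIDs in
index-determined order while leaving both recheck flags false.

```rust
// Hypothetical stand-in for PostgreSQL's ItemPointerData.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Tid {
    pub block: u32,
    pub offset: u16,
}

// Hypothetical stand-in for the few IndexScanDescData fields discussed here.
#[derive(Debug, Default)]
pub struct MockScanDesc {
    pub xs_heaptid: Option<Tid>,
    pub xs_recheck: bool,
    pub xs_recheckorderby: bool,
}

// Hypothetical per-scan state: matches already ranked using index data only.
pub struct RankedScan {
    ranked: Vec<(f32, Tid)>, // (distance, tid), ascending by distance
    next: usize,
}

impl RankedScan {
    // Sort once up front, as an rescan-style setup step might do.
    pub fn new(mut matches: Vec<(f32, Tid)>) -> Self {
        matches.sort_by(|a, b| a.0.total_cmp(&b.0));
        RankedScan { ranked: matches, next: 0 }
    }

    // Mirrors the shape of amgettuple(): true while tuples remain.
    pub fn gettuple(&mut self, scan: &mut MockScanDesc) -> bool {
        match self.ranked.get(self.next) {
            Some(&(_, tid)) => {
                scan.xs_heaptid = Some(tid);
                // The index vouches for both the match and the ordering.
                scan.xs_recheck = false;
                scan.xs_recheckorderby = false;
                self.next += 1;
                true
            }
            None => false,
        }
    }
}
```

In this model the distance value never leaves the scan state; only the TID
and the two flags are exposed to the caller.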

Nevertheless, the system calls rank_match() after every call to
amgettuple(), and I can't figure out why. I've stepped through the code,
and it looks like it has something to do with ScanState.ps.ps_ProjInfo,
but I can't figure out where or why it's getting set.

Here's a sample query. I have not found a query that does *not* call
rank_match():

SELECT *
FROM products
WHERE products <===> rdb.userquery('teddy')
ORDER BY products <<=>> rdb.rank();

I'd be grateful for any help or insights.

--
Chris Cleveland
312-339-2677 mobile
