Questions regarding Index AMs and natural ordering

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Questions regarding Index AMs and natural ordering
Date: 2023-11-23 17:16:31
Message-ID: CAEze2WgOAiKHoQFXujDd8kopq=bJLFVHh9mKfA502Rxf1Xh3dQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Over in [0] and [1] there are patches that touch on the topic of
'natual ordering' index retrieval, and [2] also touches on the topic.
For those patches, I've been looking at how the planner and executor
indicate to index AMs that they expects the output to be ordered, and
how this ordering should work.
I've mostly found how it works for index_key opr constant, but I've
yet to find a good mental model for how the planner handles indexes
that can expose the 'intrinsic order' of data, i.e. indexes with
`amcanorder=true`, because there is very little (if any) real
documentation on what is expected from indexes when it advertises
certain features, and how the executor signals to the AM that it wants
to make use of those features.

For example, btree ignores any ordering scan keys that it is given in
btrescan, which seems fine for btree because the ordering of a btree
is static (and no other order than that is expected apart from it's
reverse order), but this becomes problematic for other indexes that
could return ordered data but would prefer not to have to go through
the motions of making sure the return order is 100% correct, rather
than a k-sorted sequence, or just the matches to the quals (like
GIST). Is returning index scan results in (reverse) natural order not
optional but required with amcanorder? If it is required, why is the
am indicator called 'canorder' instead of 'willorder', 'doesorder' or
'isordered'?

Alternatively, if an am should be using the order scan keys from
.amrescan and natural order scans also get scan keys, is there some
place I find the selection process for ordered index AMs, and how this
ordering should be interepreted? There are no good examples available
in core code because btree is special-cased, and there are no other
in-tree AMs that have facilities where both `amcanorderbyop` and
`amcanorder` are set.

I did read through indexam.sgml, but that does not give a clear answer
on this distinction of 'amcanorder' having required ordered results or
not, nor on how to interpret amrescan's orderbys argument. I also
looked at planner code where it interacts with amcanorder /
amcanorderbyop, but I couldn't wrap my head around its interactions
with indexes, either (more specifically, the ordering part of those
indexes) due to the complexity of the planner and the many layers that
the various concepts are passed through. The README in
backend/optimizer didn't answer this question for me, either.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

[0] https://www.postgresql.org/message-id/flat/EB2AF704-70FC-4D73-A97A-A7884A0381B5%40kleczek.org
[1] https://www.postgresql.org/message-id/flat/CAH2-Wz%3DksvN_sjcnD1%2BBt-WtifRA5ok48aDYnq3pkKhxgMQpcw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/e70fa091-e338-1598-9de4-6d0ef6b693e2%40enterprisedb.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-11-23 17:30:58 Re: Use index to estimate expression selectivity
Previous Message Nathan Bossart 2023-11-23 16:51:09 Re: autovectorize page checksum code included elsewhere