Re: Questions regarding Index AMs and natural ordering

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Subject: Re: Questions regarding Index AMs and natural ordering
Date: 2023-11-27 14:55:08
Message-ID: CAEze2WiAU3iv=PYE56OLEW1F=b3R+8sAgcrtRJEtoj37EKpfgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 24 Nov 2023, 19:58 Tom Lane, <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Peter Geoghegan <pg(at)bowt(dot)ie> writes:
> > On Fri, Nov 24, 2023 at 8:44 AM Matthias van de Meent
> > <boekewurm+postgres(at)gmail(dot)com> wrote:
> >> Yes, the part where btree opclasses determine a type's ordering is
> >> clear. But what I'm looking for is "how do I, as an index AM
> >> implementation, get the signal that I need to return column-ordered
> >> data?" If that signal is "index AM marked amcanorder == index must
> >> return ordered data", then that's suboptimal for the index AM writer,
> >> but understandable. If amcanorder does not imply always ordered
> >> retrieval, then I'd like to know how it is signaled to the AM. But as
> >> of yet I've not found a clear and conclusive answer either way.
>
> > I suppose that amcanorder=true cannot mean that, since we have the
> > SAOP path key thing (at least for now).
>
> As things stand, amcanorder definitely means that the index always
> returns ordered data, since the planner will unconditionally apply
> pathkeys to the indexscan Paths generated for it (see plancat.c's
> get_relation_info which sets up info->sortopfamily, and
> build_index_pathkeys which uses that). We could reconsider that
> definition if there were a reason to, but so far it hasn't seemed
> interesting.

For GIST there is now a case for improving the support for optionally
ordered retrieval, as there is a patch that tries to hack ORDER BY
support into GIST. Right now that patch applies (what I consider) a
hack by implicitly adding an operator ordering clause for ORDER BY
column with the column type's btree ordering operator, but with
improved APIs that shouldn't need such a hacky approach.

> The hack in 807a40c5 is a hack, without a doubt, but
> that doesn't necessarily mean we should spend time on generalizing it,
> and even less that we should have done so in 2012. Maybe by now there
> are non-core index AMs that have cases where it's worth being pickier.
> We'd have to invent some API that allows the index AM to have a say in
> what pathkeys get applied.

I think that would be quite useful, as it would allow indexes to
return ordered results in other orders than the defined key order, and
it would allow e.g. BRIN to run its sort for ordered retrieval inside
the index scan node (rather than requiring its own sort node type).

CC: Tomas, maybe you have some ideas about this as well? What was the
reason for moving BRIN-assisted sort into its own node? Was there more
to it than "BRIN currently doesn't have amgettuple, and amgettuple
can't always be used"?

Kind regards,

Matthias van de Meent

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-11-27 14:56:22 Re: Emitting JSON to file using COPY TO
Previous Message David G. Johnston 2023-11-27 14:43:08 Re: Emitting JSON to file using COPY TO